attackgoat / screen-13

Screen 13 is an easy-to-use Vulkan rendering engine in the spirit of QBasic.

Using screen-13 without event loop. #48

Closed by DoeringChristian 1 year ago

DoeringChristian commented 1 year ago

Hi, I want to use screen-13 to ray trace into an image and not necessarily display it every frame. I tried to write a basic compute shader that reads from one buffer and writes into another.

    let sc13 = EventLoop::new().debug(true).build().unwrap();
    let mut cache = LazyPool::new(&sc13.device);

    let spv = inline_spirv::inline_spirv! {
        r#"
#version 450

layout(set = 0, binding = 0)buffer In{
    float i[];
};
layout(set = 0, binding = 1)buffer Out{
    float o[];
};

void main(){
    o[int(gl_GlobalInvocationID.x)] = i[int(gl_GlobalInvocationID.x)];
}
            "#, comp
    }
    .as_slice();

    let cpplinfo = ComputePipelineInfo::new(spv)
        .entry_name("main".into())
        .build();

    let cppl = Arc::new(ComputePipeline::create(&sc13.device, cpplinfo).unwrap());

    let mut rgraph = RenderGraph::new();

    let i = Arc::new(
        Buffer::create_from_slice(
            &sc13.device,
            vk::BufferUsageFlags::STORAGE_BUFFER,
            cast_slice(&[0.0f32, 1., 2.]),
        )
        .unwrap(),
    );
    let o = Arc::new(
        Buffer::create_from_slice(
            &sc13.device,
            vk::BufferUsageFlags::STORAGE_BUFFER,
            cast_slice(&[0.0f32; 3]),
        )
        .unwrap(),
    );

    let i_node = rgraph.bind_node(&i);
    let o_node = rgraph.bind_node(&o);

    rgraph
        .begin_pass("Add 1")
        .bind_pipeline(&cppl)
        .read_descriptor((0, 0), i_node)
        .write_descriptor((0, 1), o_node)
        .record_compute(|compute, _| {
            compute.dispatch(3, 1, 1);
        });

    rgraph.resolve().submit(&mut cache, 0).unwrap();

    let slice: &[f32] = cast_slice(screen_13::prelude::Buffer::mapped_slice(&o));

    println!("{:?}", slice);

However, this doesn't seem to work when I call submit on the resolver once. When I run the compute shader every frame in an event loop, the result is correct. What does it take to force the compute shader to execute? Thanks for your help.

attackgoat commented 1 year ago

The operation is submitted to the device with submit(...) but it hasn't finished execution yet, so mapping the buffer isn't synchronized in any way with the completion of the command buffer.

Internally, a command buffer exists in the lazy pool with a fence for this operation. Perhaps that could be exposed so you can wait on it. I think the event loop only appears to work because the frame delay happens to be long enough for the work to finish.
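To make the race concrete without a GPU, here is a minimal std-only sketch. A worker thread stands in for the device, and `Fence`, `submit`, and `readback_after_wait` are hypothetical names invented for this illustration; none of them are screen-13 or Vulkan APIs.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a fence: a flag the "device" sets when it is done.
#[derive(Clone, Default)]
struct Fence(Arc<(Mutex<bool>, Condvar)>);

impl Fence {
    /// Called by the worker once the submitted work has finished.
    fn signal(&self) {
        let (done, cv) = &*self.0;
        *done.lock().unwrap() = true;
        cv.notify_all();
    }

    /// Non-blocking check, comparable to polling a fence status.
    fn is_signaled(&self) -> bool {
        let (done, _) = &*self.0;
        *done.lock().unwrap()
    }

    /// Blocks the calling thread until the worker has signaled.
    fn wait(&self) {
        let (done, cv) = &*self.0;
        let mut finished = done.lock().unwrap();
        while !*finished {
            finished = cv.wait(finished).unwrap();
        }
    }
}

/// Models an asynchronous submit: returns immediately, the copy happens later.
fn submit(input: Vec<f32>, output: Arc<Mutex<Vec<f32>>>) -> Fence {
    let fence = Fence::default();
    let device_side = fence.clone();
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(20)); // pretend device latency
        *output.lock().unwrap() = input; // the "compute shader" copy
        device_side.signal();
    });
    fence
}

/// Submits the copy, waits on the fence, and only then reads the output.
fn readback_after_wait() -> Vec<f32> {
    let output = Arc::new(Mutex::new(vec![0.0f32; 3]));
    let fence = submit(vec![0.0, 1.0, 2.0], Arc::clone(&output));
    // Reading `output` at this point would race with the worker,
    // much like mapping the buffer right after submit.
    fence.wait();
    let result = output.lock().unwrap().clone();
    result
}
```

Calling `readback_after_wait()` only reads the output after the fence has signaled, which is the ordering the original snippet is missing.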

One way to get this to run correctly in the meantime is to force the device to finish all commands before mapping the slice:

    unsafe { sc13.device.device_wait_idle().unwrap(); }

DoeringChristian commented 1 year ago

Thanks for the quick response. It works now.

attackgoat commented 1 year ago

I think the documentation needs to be improved to explain that submit is asynchronous and that results won't be available until the work finishes. Exposing the fence being used could help, but I feel like that's exactly the Vulkan complexity that people didn't want to know about. Of course, device resources like images and device buffers are synchronized, so those cases are "ready" as soon as you submit. Only host-mapped buffers hit this.

One option to improve the code here is to add a submit_and_wait function so we would be clearly waiting for the work to actually finish.
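A rough sketch of what such a helper could look like, modelled with std threads rather than real device work. `Pending`, `submit`, and `submit_and_wait` here are hypothetical stand-ins for illustration and say nothing about how screen-13 would actually implement it.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical handle for submitted work; stands in for a command buffer plus its fence.
struct Pending<T> {
    done: Receiver<T>,
}

impl<T> Pending<T> {
    /// Blocks until the worker has produced its result.
    fn wait(self) -> T {
        self.done.recv().expect("worker exited without a result")
    }
}

/// Asynchronous submit: starts the work and returns a handle immediately.
fn submit<T: Send + 'static>(work: impl FnOnce() -> T + Send + 'static) -> Pending<T> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore the send error: it only means nobody is waiting any more.
        let _ = tx.send(work());
    });
    Pending { done: rx }
}

/// Convenience wrapper: submit, then block until the work has finished.
fn submit_and_wait<T: Send + 'static>(work: impl FnOnce() -> T + Send + 'static) -> T {
    submit(work).wait()
}
```

The point of the separate name is that the blocking behaviour is visible at the call site, while plain `submit` stays non-blocking for callers that never read results back on the host.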

Changes to the mapping function alone wouldn't help, because this requires either a command buffer to submit synchronization commands or a call to wait_for_fences (vkWaitForFences) on the device. I'll ponder improvements in this area.

DoeringChristian commented 1 year ago

I'm still not quite sure how Vulkan works, but maybe it would be possible to "schedule" some buffers for evaluation: all shaders that modify those buffers would be executed, and the CPU would wait only on those executions. Though this would probably be pretty complicated.
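As a toy illustration of that idea (not of anything Vulkan or screen-13 actually does), the sketch below records which submitted work writes which buffer and lets the CPU wait only on the writers of one buffer. `Scheduler`, its methods, and the string buffer names are all hypothetical.

```rust
use std::collections::HashMap;
use std::thread::{self, JoinHandle};

/// Hypothetical scheduler that remembers which in-flight work writes which buffer.
#[derive(Default)]
struct Scheduler {
    writers: HashMap<&'static str, Vec<JoinHandle<()>>>,
}

impl Scheduler {
    /// Starts `work` asynchronously and records that it writes to `writes`.
    fn submit(&mut self, writes: &'static str, work: impl FnOnce() + Send + 'static) {
        self.writers
            .entry(writes)
            .or_default()
            .push(thread::spawn(work));
    }

    /// Blocks only on work recorded as writing `buffer`.
    /// Returns how many pieces of work were waited on.
    fn wait_for(&mut self, buffer: &str) -> usize {
        let handles = self.writers.remove(buffer).unwrap_or_default();
        let count = handles.len();
        for handle in handles {
            handle.join().expect("worker panicked");
        }
        count
    }
}
```

Work that writes other buffers keeps running untouched; only the writers of the requested buffer are joined before the host reads it.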

attackgoat commented 1 year ago

Update: I've added two new helper functions to CommandBuffer and exposed it as the return value of Resolver::submit(); you can now easily check for submitted work. See examples/cpu_readback.rs for the details.