vulkano-rs / vulkano

Safe and rich Rust wrapper around the Vulkan API

RFC: Plan for handling semaphores and synchronization between command buffers #355

Closed tomaka closed 7 years ago

tomaka commented 7 years ago

This RFC is just a base and is not actually a complete solution. I think it lays a good ground for proper synchronization handling, but the actual mechanisms haven't been designed yet.

On the command buffer side

Whenever a command is added to a command buffer and that command contains a buffer or an image, the variable that represents the buffer or the image is passed by value.

For example, to add a fill buffer command you would call .fill_buffer(my_buffer, 0). Notice that we pass my_buffer by value, and not &my_buffer.

But that doesn't mean that we actually give up ownership of our buffer or image. What I call a buffer or an image can also be an Arc<MyBuf> for example. This means that you could call .fill_buffer(my_buffer.clone(), 0) and then continue to use my_buffer afterwards.

The same applies for descriptor sets and framebuffers.
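To make the by-value convention concrete, here is a minimal sketch. `MyBuf`, `Builder` and `FillBuffer` are hypothetical stand-ins invented for illustration, not vulkano's actual types; the only point is that a command can own an `Arc` clone while the caller keeps its own handle.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a buffer type; not a real vulkano type.
pub struct MyBuf {
    pub size: usize,
}

// A recorded command owns whatever resource handle it was given.
pub struct FillBuffer<B> {
    pub buffer: B,
    pub value: u32,
}

// Hypothetical command buffer builder.
pub struct Builder;

impl Builder {
    // `buffer` is taken by value, so the recorded command keeps it alive.
    pub fn fill_buffer<B>(self, buffer: B, value: u32) -> FillBuffer<B> {
        FillBuffer { buffer, value }
    }
}

pub fn record_with_shared_buffer() -> (Arc<MyBuf>, FillBuffer<Arc<MyBuf>>) {
    let my_buffer = Arc::new(MyBuf { size: 64 });
    // Cloning the Arc only bumps a reference count; the caller keeps its handle.
    let cmd = Builder.fill_buffer(my_buffer.clone(), 0);
    (my_buffer, cmd)
}
```

After `record_with_shared_buffer()` returns, both the caller's `Arc` and the one stored in the command point to the same `MyBuf`.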

At submission

Vulkano would have these traits:

unsafe trait SubmitPreparation {
    type Out: SubmitPreparationCommit;
    fn prepare(self, ...) -> Result<Self::Out, SomeErr>;     // actual parameters remain to be determined
}

unsafe trait SubmitPreparationCommit {
    fn commit(self);
}

In order to submit a command buffer that contains buffers or images, all the buffers or images must be of types whose mutable references implement SubmitPreparation. For example if the command buffer contains a single command that fills a buffer of type B, then in order to submit it the type &mut B must implement SubmitPreparation.

Before submission, the method prepare is called on all the mutable references to all the buffers and images. The parameters will most likely include the queue on which the command buffer is submitted. The prepare method is the last chance for a buffer or an image to perform verifications. The object that implements SubmitPreparationCommit is expected to keep the buffer and image locked if needed.

Once the submission has succeeded, the method commit is called in order to unlock everything. If the lock is destroyed without commit having been called, the changes are reverted instead (in practice most of the time commit will apply the changes and the destructor won't do anything). This allows us to recover from failed submissions (cc #351).
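The prepare/commit/revert flow could look roughly like the sketch below. The `queue: u32` parameter, `PrepareError`, `TrackedBuffer` and `PendingUse` are assumptions made up for illustration (the actual parameters are still to be determined). This variant applies the change tentatively in `prepare` and undoes it in the destructor if `commit` was never called.

```rust
#[derive(Debug)]
pub struct PrepareError; // hypothetical error type

pub unsafe trait SubmitPreparationCommit {
    fn commit(self);
}

pub unsafe trait SubmitPreparation {
    type Out: SubmitPreparationCommit;
    // The `queue` parameter is an assumption; the RFC leaves parameters open.
    fn prepare(self, queue: u32) -> Result<Self::Out, PrepareError>;
}

// Hypothetical buffer that remembers which queue used it last.
pub struct TrackedBuffer {
    pub last_queue: Option<u32>,
}

// Returned by `prepare`; keeps the buffer borrowed until commit or drop.
pub struct PendingUse<'a> {
    buffer: &'a mut TrackedBuffer,
    previous: Option<u32>,
    committed: bool,
}

unsafe impl<'a> SubmitPreparation for &'a mut TrackedBuffer {
    type Out = PendingUse<'a>;

    fn prepare(self, queue: u32) -> Result<PendingUse<'a>, PrepareError> {
        // Record the tentative new state and remember the old one.
        let previous = self.last_queue;
        self.last_queue = Some(queue);
        Ok(PendingUse { buffer: self, previous, committed: false })
    }
}

unsafe impl<'a> SubmitPreparationCommit for PendingUse<'a> {
    fn commit(mut self) {
        // Mark as committed so the destructor leaves the new state in place.
        self.committed = true;
    }
}

impl<'a> Drop for PendingUse<'a> {
    fn drop(&mut self) {
        // Submission failed or was abandoned: restore the previous state.
        if !self.committed {
            self.buffer.last_queue = self.previous;
        }
    }
}
```

Dropping the `PendingUse` without calling `commit` leaves `last_queue` as it was before `prepare`, which is what makes a failed submission recoverable.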

Alternatively command buffers that are known to be only submitted once can require buffers and images to directly implement SubmitPreparation, instead of their mutable references. It would be nice to be able to add impl SubmitPreparation for &mut T where T: SubmitPreparation to vulkano, so that buffers/images would only need to implement SubmitPreparation on themselves, but I don't think it's possible.

The SubmitPreparation trait will also be implemented on &mut Arc<T> where &T: SubmitPreparation. Therefore buffers and images are also encouraged to implement this trait on their shared references if possible.
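A reduced sketch of that forwarding impl, using a simplified hypothetical trait `Prepare` (no associated type, no parameters) in place of `SubmitPreparation` so that only the shape of the `where` clause is shown:

```rust
use std::sync::Arc;

// Simplified, hypothetical stand-in for SubmitPreparation.
pub trait Prepare {
    fn prepare(self) -> &'static str;
}

// Hypothetical buffer type that implements the trait on its shared reference.
pub struct Buf;

impl<'a> Prepare for &'a Buf {
    fn prepare(self) -> &'static str {
        "prepared through &Buf"
    }
}

// Forwarding impl: `&mut Arc<T>` works whenever `&T` does.
impl<'a, T> Prepare for &'a mut Arc<T>
where
    &'a T: Prepare,
{
    fn prepare(self) -> &'static str {
        // Downgrade the unique borrow of the Arc to a shared borrow of its contents.
        let shared: &'a T = &**self;
        shared.prepare()
    }
}
```

With this in place, a command buffer holding an `Arc<Buf>` can call `prepare` on `&mut Arc<Buf>` without `Buf` having to know about `Arc` at all.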

On the buffer/image side

But how do buffers/images actually implement SubmitPreparation?

For the moment just like they already do, by keeping in memory information about last time they were used. Since this information is behind a mutex, the Out associated type will need to contain a MutexGuard that keeps the mutex locked and prevents other threads from simultaneously locking the same variable.
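As an illustration of an output type that carries a `MutexGuard`, consider the sketch below. `LastUse`, `LockedBuffer` and `Locked` are hypothetical names, and the traits are left out to keep the example short. Here the change is applied only in `commit`, so dropping the guard early needs no explicit revert.

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical bookkeeping a buffer might keep about its last use.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct LastUse {
    pub queue_family: Option<u32>,
}

// Hypothetical buffer whose usage information sits behind a mutex.
pub struct LockedBuffer {
    pub state: Mutex<LastUse>,
}

// Plays the role of the `Out` type: holding the guard keeps the mutex locked.
pub struct Locked<'a> {
    guard: MutexGuard<'a, LastUse>,
    new_queue_family: u32,
}

impl LockedBuffer {
    pub fn prepare(&self, queue_family: u32) -> Locked<'_> {
        // Other threads block (or fail `try_lock`) until the guard is released.
        let guard = self.state.lock().unwrap();
        Locked { guard, new_queue_family: queue_family }
    }
}

impl<'a> Locked<'a> {
    // Apply the change; the mutex unlocks when `self` is dropped at the end.
    pub fn commit(mut self) {
        self.guard.queue_family = Some(self.new_queue_family);
    }
}
```

While a `Locked` value is alive, `state.try_lock()` on the same buffer fails, which is the exclusion the paragraph above relies on.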

There are two ways this could deadlock:

In the future it may be a good idea to add types similar to RefCell that allow custom synchronization strategies.

So how do you actually handle semaphores/barriers?

Whatever system we decide to plug in this design, it can be done by:

tomaka commented 7 years ago

Usually when you talk about a "buffer" or an "image", people imagine long-lived objects that you keep alive and reuse between frames.

That is technically true, but not necessarily true in your code. For example you could create a pool of buffers, and at the start of each frame you ask the pool to give you a buffer. If none is available a new one is created. At the end of the frame the buffer is returned to the pool. While technically buffers of the pool are long-lived, in your code it looks like each buffer only lasts for one frame and is recreated every time.

By using this kind of design, we can make ownership tracking easier.

For example if you do this:

let pool = BuffersPool::new(&device);

let cb1 = AutoCommandBuffer::new().fill_buffer(pool.alloc(), 0).build().submit();
let cb2 = AutoCommandBuffer::new().fill_buffer(pool.alloc(), 0).build().submit();
let cb3 = AutoCommandBuffer::new().fill_buffer(pool.alloc(), 0).build().submit();

Even though it looks like your code allocates three buffers, in reality it could end up being the same buffer used three times if your pool is able to determine that they can be used concurrently.
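A CPU-only sketch of such a pool follows. `PooledBuffer` and the `created` counter are hypothetical, and the storage is a plain `Vec<u8>` rather than GPU memory. A real pool would hand a buffer back only once the GPU has finished with it; this sketch simply recycles it when the handle is dropped.

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct PoolInner {
    free: Vec<Vec<u8>>, // storage waiting to be reused
    created: usize,     // how many buffers were really allocated
}

// Hypothetical pool, loosely modelled on the `BuffersPool` name used above.
pub struct BuffersPool {
    inner: Rc<RefCell<PoolInner>>,
}

// Handle given out by the pool; returns its storage when dropped.
pub struct PooledBuffer {
    data: Option<Vec<u8>>,
    pool: Rc<RefCell<PoolInner>>,
}

impl BuffersPool {
    pub fn new() -> Self {
        let inner = PoolInner { free: Vec::new(), created: 0 };
        BuffersPool { inner: Rc::new(RefCell::new(inner)) }
    }

    pub fn alloc(&self) -> PooledBuffer {
        let mut inner = self.inner.borrow_mut();
        // Reuse a free buffer if one exists, otherwise create a new one.
        let data = match inner.free.pop() {
            Some(data) => data,
            None => {
                inner.created += 1;
                vec![0u8; 16]
            }
        };
        PooledBuffer { data: Some(data), pool: Rc::clone(&self.inner) }
    }

    pub fn created(&self) -> usize {
        self.inner.borrow().created
    }
}

impl Drop for PooledBuffer {
    fn drop(&mut self) {
        // Hand the storage back to the pool instead of freeing it.
        if let Some(data) = self.data.take() {
            self.pool.borrow_mut().free.push(data);
        }
    }
}
```

Three back-to-back `alloc()` calls whose handles are dropped in between end up sharing one underlying allocation, while two handles held at the same time force a second one.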

The choice should be left to the user whether to manage buffers and images like this or by passing the same buffer every time manually. But the system should allow both.

tomaka commented 7 years ago

So far I have talked about the "administrative" side of things, that is, how the API should be designed. But before committing to a design, it should be decided what vulkano actually does at runtime at the lowest level.

What can we guarantee exactly?

Guaranteeing some things at compile-time looks possible, but you will always need runtime checks for some usages.

Next to this, we also have two possibilities to handle synchronization:

I think the right way to do it is to automatically build things, except when the CPU or GPU overhead would be too large. For example nobody wants to explicitly write out pipeline barriers for the swapchain images at the end of a frame, but you also don't want vulkano to potentially block your queue if it thinks that the only way to guarantee safety is to block your queue.

The biggest problem is semaphores. Whenever you submit a command buffer you can signal semaphores. Later if you want to depend on the command buffer that you submitted, you have to wait on that semaphore. This means that when you submit a command buffer you have to know in advance how many command buffers are later going to depend on it. This is not something that vulkano can know, unless we build in some sort of graph system.

The other problem is resources in exclusive mode. If you use a resource in exclusive mode in queue family A, then later in queue family B, you have to put a pipeline barrier in both queue families A and B. This means that at the moment when we know that we need this pipeline barrier, it is likely that we have already submitted other stuff to queue family A and thus it is really suboptimal to append a command at the end of it just because vulkano wasn't capable of determining it.

I think these two aspects should be explicitly performed by the user, while other aspects (for example pipeline barriers between two command buffers of the same queue) should be done automatically by vulkano.

tomaka commented 7 years ago

We have essentially three aspects to explore:

I think option B could be done by having "marker objects" that you can build from a Submission object. For example:

let my_buffer: MyBuffer = new_buffer();
let cb = AutoCommandBuffer::new().fill_buffer(my_buffer /* takes ownership of the buffer */, 0).build().submit();

let my_buffer2: Marker<MyBuffer> = SubmitAfter(cb).0.buffer();    /* buffer of the command 0, totally experimental syntax */
let cb2 = AutoCommandBuffer::new().fill_buffer(my_buffer2, 0).build().submit();

This wouldn't need any runtime tracking in my_buffer because we could determine that this usage is always valid. When submitting cb2 the Marker object would indicate to vulkano that a semaphore is needed with cb.

Keep in mind that this is just an idea and I have probably overlooked tons of things.

tomaka commented 7 years ago

Option A (the graph system) is exactly what professional game engines are doing.

Basically you'd put command buffers in a graph (a struct), and submit the whole graph at once.
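As a rough illustration of why a graph helps with the semaphore problem, the sketch below records which command buffers wait on which, and derives how many semaphores each one has to signal before anything is submitted. `SubmitGraph` and its methods are hypothetical, and command buffers are represented only by their index.

```rust
// Hypothetical graph: nodes are command buffers identified by index,
// and `deps[i]` lists the nodes that node `i` waits on.
pub struct SubmitGraph {
    deps: Vec<Vec<usize>>,
}

impl SubmitGraph {
    pub fn new() -> Self {
        SubmitGraph { deps: Vec::new() }
    }

    // Adds a command buffer that waits on earlier ones and returns its index.
    pub fn add(&mut self, waits_on: &[usize]) -> usize {
        let id = self.deps.len();
        // Only earlier nodes may be referenced, which keeps the graph acyclic
        // and makes insertion order a valid submission order.
        assert!(waits_on.iter().all(|&dep| dep < id));
        self.deps.push(waits_on.to_vec());
        id
    }

    // Number of semaphores each node must signal: one per dependent.
    pub fn signal_counts(&self) -> Vec<usize> {
        let mut counts = vec![0; self.deps.len()];
        for waits in &self.deps {
            for &dep in waits {
                counts[dep] += 1;
            }
        }
        counts
    }
}
```

Because the whole graph is known before submission, the number of dependents of each command buffer is available up front, which is exactly the information that is missing when command buffers are submitted one at a time.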

I see two challenges with this:

tomaka commented 7 years ago

I tried implementing the scheme from the opening post. Unfortunately, for technical reasons, we need to be able to clone the resources passed to commands. Therefore implementing the trait on &mut T doesn't make sense if T is cheaply clonable.

I think the trait should simply be implemented on &T.

tomaka commented 7 years ago

Better proposal: https://github.com/tomaka/vulkano/issues/385