webgpu-native / webgpu-headers

https://webgpu-native.github.io/webgpu-headers/
BSD 3-Clause "New" or "Revised" License
377 stars 43 forks source link

Holding pass dependencies alive #29

Open kvark opened 4 years ago

kvark commented 4 years ago

Render passes are expected to hold majority of the function calls coming from the most complex applications. It's in everybody's best interest to make encoding render passes to be light and performant.

One of the aspects of the JS spec that complicates it is tracking dependencies. Something used in a render pass may be deleted before it has a chance to be used, and it's the responsibility of the spec implementor to make sure the render pass as well as command encoder holds those dependencies alive.

For native targets though, we could make encoding simpler if this responsibility was put on the shoulders of the users. For example, in Rust it's easy to enforce the dependency life times at compile time, thus avoiding the overhead completely.

Kangz commented 4 years ago

Does this mean that the following should cause a segfault with RAII?

pass = encoder.BeginRenderPass(...);

// imagine-this-is-a-function-scope
{
  wgpu::BindGroup myBindGroup = ...;
  pass.SetBindGroup(0, myBindGroup);
}

pass.EndPass();
wgpu::CommandBuffer commands = encoder.Finish();
queue.Submit(1, &commands);

The bindgroup is created just before the SetBindGroup call and doesn't need to live more than that, so its destuctor runs, removing the last reference / destroying the bindgroup. Then when the commands are submitted we crash because there is a use-after-free on the bindgroup.

Forcing to keep the bindgroup alive until submit is unfortunate. How about having the render pass internally reference the resources it uses?

kvark commented 4 years ago

Does this mean that the following should cause a segfault with RAII?

It doesn't have to be segfaulting, it can just controllably panic and/or produce an error.

Creating bind groups within a pass scope, without any attempt to store than for reusal, is a bad idea. I don't think we should go out of our way to make it possible. Creating vertex/index buffers is slightly more reasonable, but hopefully people stop doing that once the data upload story of WebGPU is better.

How about having the render pass internally reference the resources it uses?

My problem with this is - overhead. It's the overhead we don't have to pay, most clients will not need or want to pay, and it's only going to help genuinely anti-pattern use cases.

Kangz commented 4 years ago

Does this mean that the following should cause a segfault with RAII?

It doesn't have to be segfaulting, it can just controllably panic and/or produce an error.

So we'd have to check on submit that all the objects used in the command buffer are alive? That is only tractable if you store object as ID + generation, not if objects are individually allocated.

Creating bind groups within a pass scope, without any attempt to store than for reusal, is a bad idea. I don't think we should go out of our way to make it possible. Creating vertex/index buffers is slightly more reasonable, but hopefully people stop doing that once the data upload story of WebGPU is better.

Here's some issues with the semantics proposed:

How about having the render pass internally reference the resources it uses?

My problem with this is - overhead. It's the overhead we don't have to pay, most clients will not need or want to pay, and it's only going to help genuinely anti-pattern use cases.

These are real patterns and need to be supported. Refcounting is important for WebGPU to be a safe API to use, otherwise we get in Vulkan "do it right or crash" territory.

kvark commented 4 years ago

Important clarification: this issue only talks about the "pass" scope, so references need to be kept alive just until end_pass() is called. This is substantially less risky than the point of submit(). With this considered, some of the voiced concerns are either lowered (composability) or wiped out (multi-threading).

Not every project is a high-performance engine batching and pre-computing things.

I think it's an easy win in a compromise between convenience and performance. You can always make things more convenient, e.g. by using a higher level abstraction, but you can't make things faster.

Refcounting is important for WebGPU to be a safe API to use, otherwise we get in Vulkan "do it right or crash" territory.

It appears to me that your biggest concern here is that the pointer-based implementations would crash, as opposed to gracefully erroring out. That's indeed worrying.

It means that bindings for WebGPU in other languages have to replicate the tracking in the bindings or they risk panics for programs that are otherwise well-typed.

Could also rely on the refcounting here, trivially? Don't have to pay the cost in all wrappers though. Rust wouldn't want that. And higher-level engines have an ability to work with less overhead.

Kangz commented 4 years ago

Important clarification: this issue only talks about the "pass" scope, so references need to be kept alive just until end_pass() is called.

Ah I had misunderstood. But what's the gain you hope to get by requiring objects are kept alive for the duration of the recording of the pass? wgpu-core would still need to refcount all the objects used in the pass on EndPass.

kvark commented 4 years ago

In our case, these objects get refcounted by the usage trackers in end_pass anyway. So forcing us to also refcount them during the pass recording can be seen as a wasted work.

kainino0x commented 4 years ago

In order to refcount++ in end_pass you'd have to be holding onto the resource anyway. There only has to be one refcount++ anyway, in one place or the other. If it's done during pass recording then can't the ref just move from the pass encoder to the command buffer encoder?

kvark commented 4 years ago

It's more nuanced than that. We have basically a choice of 3 levels of dependency tracking:

Kangz commented 4 years ago

Just so I understand: shallow tracking tracks only bindgroups during the pass recording (and buffers used in standalone as vertex / index buffers) then in end_pass for each bindgroup the resources are refcounted and their usage tracked.

That sounds fine, and if the bindgroups references their resources then it is equivalent to deep tracking.

kainino0x commented 1 year ago

@cwfitzgerald reports that with Arcanization this will no longer be a concern in wgpu, and we can just have passes take references to the things passed into them.

kainino0x commented 1 year ago

Nothing changes in the API so "has resolution" is just to document this. We don't have a place to do so yet. We should probably add doxygen comments to the header or something, but for now it can be a new markdown doc in the repo that describes these details. (There's a ton of other stuff that needs to be documented in there too but we can start it whenever someone gets to it.)