bevyengine / bevy

A refreshingly simple data-driven game engine built in Rust
https://bevyengine.org
Apache License 2.0

Compute Shader separate from renderer #5024

Open jafioti opened 2 years ago

jafioti commented 2 years ago

Hi, I have a scenario where occasionally I need to generate a large mesh. I want to use a compute shader to run a noise function and marching cubes. The only compute shader example I see is directly tied to the renderer as a node on the graph. I need my compute shader to be dispatched entirely separate from rendering, since it only happens every so often as determined by external game logic.

I would propose a system where the shader can be created, buffers would be loaded, and it would be dispatched, which would return a handle. This handle can be a future which can be polled, or it can have some other non-rust-async design, but it should have a flag to check if the shader has finished running. The handle can be stored in a resource or in a component, and on another frame later on the handle can be checked to see if the shader is finished. If so, data can be copied back from the buffers and used.
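The handle shape described above can be sketched without any GPU code. This is only an illustration of the polling pattern, not an existing Bevy or wgpu API: `ComputeHandle` and `dispatch` are hypothetical names, and a background thread stands in for the GPU work.

```rust
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical handle returned by a dispatch (not a Bevy or wgpu type).
pub struct ComputeHandle<T> {
    receiver: Receiver<T>,
    result: Option<T>,
}

impl<T> ComputeHandle<T> {
    /// Non-blocking check: true once the job has delivered its result.
    pub fn is_finished(&mut self) -> bool {
        if self.result.is_none() {
            // `try_recv` never blocks; an error means "not ready" or "worker gone".
            if let Ok(value) = self.receiver.try_recv() {
                self.result = Some(value);
            }
        }
        self.result.is_some()
    }

    /// Takes the result if it is ready; afterwards the handle is empty.
    pub fn take(&mut self) -> Option<T> {
        if self.is_finished() {
            self.result.take()
        } else {
            None
        }
    }
}

/// Stand-in for "dispatch a shader": runs `work` on a background thread.
/// Assumption: a real version would record and submit GPU commands instead.
pub fn dispatch<T, F>(work: F) -> ComputeHandle<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (sender, receiver) = mpsc::channel();
    thread::spawn(move || {
        // Ignore the send error if the handle was dropped before completion.
        let _ = sender.send(work());
    });
    ComputeHandle { receiver, result: None }
}
```

A handle like this could be stored in a resource or component and checked once per frame with `is_finished()`, calling `take()` on the frame where it first returns true.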

This seems like a much more general use of compute shaders than the current system which is very specialized and imo doesn't address most use cases of compute shaders.

EDIT: I should note that Unity has this feature, which is what I based my proposal off of. Obviously I wouldn't expect a direct copy for Bevy, but something along these lines would be huge for any game requiring a lot of computation. For further reading: https://docs.unity3d.com/Manual/class-ComputeShader.html

Andrewp2 commented 2 years ago

I just want to note - you technically can run compute shaders in the main App world right now, entirely separate from Rendering. You have access to RenderDevice and RenderQueue through resources, and you can use them for compute shader purposes without ever using the RenderGraph or putting it in a Node.

The problem of course is, right now if you wanted to do this you would have to manage everything yourself. CommandEncoders, setting up pipelines, calling submit to run the compute shaders, checking on the future yourself, etc. So I'm 100% on-board with having some ergonomic solution to this. Right now you would basically have to write raw wgpu code to make it work.

jafioti commented 2 years ago

@Andrewp2 Is there any example of doing this? I can't find any in the repo examples.

davawen commented 1 year ago

Are there any updates on this?

Compute shaders are great for many complex effects and systems, and I feel like Bevy's support for them is really boilerplate-y and subpar compared to other game engines.

Kjolnyr commented 1 year ago

I've made a plugin named bevy_app_compute which lets you handle compute shaders from the App world. It's still at an early stage but it works quite well. It might be helpful for algorithms like marching cubes until we find a way to handle compute shaders more ergonomically in Bevy.

jafioti commented 1 year ago

@Kjolnyr That's awesome! Do the shaders run async? Like if they take more than one frame to run, does it delay the game, or will the results just not be available until it's done?

jafioti commented 1 year ago

Also, how can the shaders be run multiple times with different data?

Kjolnyr commented 1 year ago

No to the first and yes to the second! It cannot be async due to wgpu's current limitations, unfortunately.

jafioti commented 1 year ago

@Kjolnyr What changes would be necessary so that the next frame doesn't have to wait on the compute shader? I basically want to pipeline the generation of chunks with marching cubes, and they can take longer than the rest of the frame to finish, resulting in freezes. I'm hoping it should be possible to entirely decouple this from rendering, and just dispatch a shader one frame and get the result back several frames down the line, like how Unity does it.

Kjolnyr commented 1 year ago

It's impossible to decouple from the rendering, unfortunately, as wgpu uses a single queue where you put your GPU work, so compute work has to squeeze in alongside render work. It would be possible if we used solely Vulkan as the backend API, for instance.

See the "known limitation" part of #8440:

The current implementation of wgpu only has one internal queue for submit() calls. Therefore, we still have to know where and when to put our compute submit() calls, as they will run sequentially with the render queue. I know that Vulkan has a separate compute queue, and I'm hoping that one day, wgpu will too. I just don't know how feasible / mature this multi-queue system is for other backends wgpu implements.

jafioti commented 1 year ago

Oh wow that's surprising, this seems like a really common use-case. Anyone needing to do heavy compute in a game can't really do so in wgpu. I guess it's not as common as I thought.

fintelia commented 1 year ago

If you need to do heavy compute, you can always break it up into small pieces and run a little bit every frame until it is done. AFAIK, Vulkan doesn't let you prioritize different queues so even with multiple queues your long running compute work could starve the rendering tasks needed to draw the next frame, and you'd need to throttle dispatches to the compute queue to avoid it.
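As a rough sketch of the "a little bit every frame" idea, the bookkeeping could look like the following. `SlicedJob` is a hypothetical helper, not part of Bevy or wgpu, and it assumes the work can be divided into independent items (for example, terrain chunks) so that each returned range maps to one small dispatch.

```rust
use std::ops::Range;

/// Hypothetical helper that splits a large job into fixed-size slices,
/// one slice per frame. Not an existing Bevy or wgpu type.
pub struct SlicedJob {
    total_items: usize,
    next_item: usize,
    items_per_frame: usize,
}

impl SlicedJob {
    pub fn new(total_items: usize, items_per_frame: usize) -> Self {
        Self {
            total_items,
            next_item: 0,
            // Clamp to at least 1 so the job always makes progress.
            items_per_frame: items_per_frame.max(1),
        }
    }

    /// Returns the half-open range of items to process this frame,
    /// or `None` once every item has been handed out.
    pub fn next_slice(&mut self) -> Option<Range<usize>> {
        if self.next_item >= self.total_items {
            return None;
        }
        let start = self.next_item;
        let end = start
            .saturating_add(self.items_per_frame)
            .min(self.total_items);
        self.next_item = end;
        Some(start..end)
    }

    /// True once all slices have been handed out.
    pub fn is_done(&self) -> bool {
        self.next_item >= self.total_items
    }
}
```

A per-frame system would call `next_slice()` once, dispatch only that range, and stop scheduling once `is_done()` is true.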

jafioti commented 1 year ago

@fintelia Yeah I'm going to have to. I'd imagine vulkan does that as well, but it does the chunking automatically, in time to render the next frame.

Will need to come up with a way to dynamically measure how much work can be done in between frames depending on the hardware capabilities.
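One possible shape for such a measurement loop is a small feedback controller: shrink the per-frame workload quickly when a slice runs over a time target, and grow it slowly when there is headroom. Everything here is an assumption for illustration. `WorkBudget` is a hypothetical type, the halve-or-add-one policy is just one choice, and how the slice time is actually measured (for example, GPU timestamp queries where available) is left out.

```rust
use std::time::Duration;

/// Hypothetical controller for how many work items to dispatch per frame.
pub struct WorkBudget {
    items_per_frame: usize,
    min_items: usize,
    max_items: usize,
    target: Duration,
}

impl WorkBudget {
    pub fn new(initial: usize, min_items: usize, max_items: usize, target: Duration) -> Self {
        // Normalize the bounds so that 1 <= min_items <= max_items.
        let min_items = min_items.max(1);
        let max_items = max_items.max(min_items);
        Self {
            items_per_frame: initial.clamp(min_items, max_items),
            min_items,
            max_items,
            target,
        }
    }

    /// Number of items to schedule on the next frame.
    pub fn items_per_frame(&self) -> usize {
        self.items_per_frame
    }

    /// Feed back how long the last slice took.
    /// Over target: halve the budget. Under half the target: add one item.
    /// Anything in between leaves the budget unchanged.
    pub fn record(&mut self, measured: Duration) {
        if measured > self.target {
            self.items_per_frame = (self.items_per_frame / 2).max(self.min_items);
        } else if measured < self.target / 2 {
            self.items_per_frame = (self.items_per_frame + 1).min(self.max_items);
        }
    }
}
```

Backing off multiplicatively and recovering additively keeps a single slow frame from repeating, at the cost of ramping up slowly on fast hardware.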

fintelia commented 1 year ago

I'd imagine vulkan does that as well, but it does the chunking automatically, in time to render the next frame.

I've never seen anything like that mentioned as being a feature of Vulkan.

jafioti commented 1 year ago

Do you know how Unity does it? I'm able to schedule a ton of compute shader jobs async and get back the results on later frames.

fintelia commented 1 year ago

I don't know what Unity does. I'd guess it is some combination of: