aclysma / rafx

Multi-backend renderer with asset pipeline. The objective of this repo is to build a scalable, flexible, data driven renderer.
Apache License 2.0
640 stars 32 forks source link

bugfix: resolve crash with renderer for demo #253

Closed pollend closed 7 months ago

pollend commented 1 year ago

I have a swap chain with 5 buffers in vulkan? could the definition be expanded with a minimum looks like you're querying for the supported number of swap chain and then requesting the max number of images in the swapchain?

using the MAX_FRAMES_IN_FLIGHT against the current frame should be safe. we're just iterating over 5 frame buffers but are bound to two active frames in flight.

image

aclysma commented 1 year ago

I think the API vulkan provides allows us to specify a minimum (we pick minimum driver supports + 1 which ensures non-blocking behavior.) I don't think we get to decide anything but the minimum count. If the driver gives us 5 images we have to roll with it.

We may be able to have the sync primitive list be exactly MAX_FRAMES_IN_FLIGHT + 1 elements long regardless of number of images. That may be the more correct fix here.

pollend commented 1 year ago

I think the API vulkan provides allows us to specify a minimum (we pick minimum driver supports + 1 which ensures non-blocking behavior.) I don't think we get to decide anything but the minimum count. If the driver gives us 5 images we have to roll with it.

We may be able to have the sync primitive list be exactly MAX_FRAMES_IN_FLIGHT + 1 elements long regardless of number of images. That may be the more correct fix here.

this is under rafx api, is MAX_FRAMES_IN_FLIGHT exposed there?

using the min amount of frames specified by surface capabilities. this is the minimum number of frames that the device supports, you can specify less then this number? image

you have MAX_FRAMES_IN_FLIGHT specified in multiple places? I guess you can do this ...

aclysma commented 1 year ago

Past me (2+ years ago) was trying to avoid MAX_FRAMES_IN_FLIGHT being in rafx_api as a public symbol because it felt like an opinionated choice. But present me thinks it's probably better to just have it in rafx_api in one place like you're suggesting. It's proven too useful for various systems and trying to "hide" it is counterproductive. And pragmatically I think the only values that would ever make sense there are either 1 or 2 (double/triple-buffering.) So I'd be on board with moving it to rafx_api and removing the other occurrences if you wanted to do that. I probably will at some point if you don't, but I've had a busy few weeks and may not get to it right away.

By the way what was it exactly that was crashing? I'm surprised to hear that a driver is providing 5 images, but I wouldn't necessarily expect that to cause a crash?

pollend commented 1 year ago

Past me (2+ years ago) was trying to avoid MAX_FRAMES_IN_FLIGHT being in rafx_api as a public symbol because it felt like an opinionated choice. But present me thinks it's probably better to just have it in rafx_api in one place like you're suggesting. It's proven too useful for various systems and trying to "hide" it is counterproductive. And pragmatically I think the only values that would ever make sense there are either 1 or 2 (double/triple-buffering.) So I'd be on board with moving it to rafx_api and removing the other occurrences if you wanted to do that. I probably will at some point if you don't, but I've had a busy few weeks and may not get to it right away.

By the way what was it exactly that was crashing? I'm surprised to hear that a driver is providing 5 images, but I wouldn't necessarily expect that to cause a crash?

your render graph has 2 pipelines and I'm cycling through 5 images in my swapchain so it kind of implodes.

aclysma commented 1 year ago

Do you have a line of code or stack trace where the crash occurs?

mittpat commented 7 months ago

I may be encountering a similar issue where rotating_frame_index is 3 but static_resources.tonemap_debug_output (and static_resources.mesh_culling_debug_output) contains only 3 values. This is when running demo with the vulkan backend.

thread 'main' panicked at rafx-plugins/src/pipelines/modern/graph_generator/mod.rs:265:46:
index out of bounds: the len is 3 but the index is 3

adding a modulus operator %3 allows the demo to run correctly. (great demos btw)

mittpat commented 7 months ago

The mesa discussion here is informative: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5350

In my case, same as OP I believe, vkGetPhysicalDeviceSurfaceCapabilitiesKHR returns minImageCount : 4 and maxImageCount : 0. With rafx adding + 1 to minImageCount, so 5:0. This goes against a few assumptions in rafx, as noted in this current discussion.

aclysma commented 7 months ago

Thanks for the helpful comments and link. I think #257 could be a good fix for this. I think just adding the %3 would be perfectly fine functionally. However I think the underlying issue was having a sync primitive per swapchain image rather than per frame in flight and I think that will take care of the crash reported here.

I haven't tested this on anything but Mac + MoltenVK. If you confirm that #257 fixes the crash you saw, that would be helpful. Or otherwise I'll at least confirm #257 doesn't break windows/vulkan. I don't have a linux machine handy to test myself (and if I did it's very unlikely to be the same configuration as you have.)

aclysma commented 7 months ago

Closing in favor of #257