KhronosGroup / Vulkan-Samples

One stop solution for all Vulkan samples
Apache License 2.0
4.33k stars 648 forks source link

Fix for crash on swapchain recreation #982

Closed jherico closed 8 months ago

jherico commented 8 months ago

Description

Attempting to re-create the swapchain in any of the non-HPP examples will currently trigger a crash. This seems to have been caused by an interaction between the implications of #910 and the changes in the vkb::core::Image class layout in #906.

910 refactors the sample code so that the render context is always an instance of HPPRenderContext, but depending on whether it's a C or C++ binding it will sometimes be cast to a vkb::RenderContext type. These types are not at all related, and this cast essentially relies on 100% memory compatibility between the two classes (and everything they do with their parameters and member).

906 made extensive modifications to the Image and HPPImage classes, moving many of the individual members into VkImageCreateInfo and vk::ImageCreateInfo members respectively. However, critically, the subresource and views members were re-ordered in Image but not in HPPImage. This caused them to no longer be memory layout compatible.

So when C samples call get_render_context().update_swapchain(swapchain_image_count);, the result is that it calls RenderContext::update_swapchain and RenderContext::recreate but that in turn delegates to the code in HPPRenderContext::create_render_target_func.

RenderContext::recreate specifically creates a vkb::core::Image instance and then passes it to create_render_target_func via a std::move where it shows up as a vkb::core::HPPImage&&. Because of the re-ordering of the members subresource and views between the two classes, this essentially produced a corrupted object, and the code then tried to iterate across the views object, which was pointing at random memory.

This PR fixes the problem by re-ordering the members of vkb::core::Image so that they match the order of vkb::core::HPPImage. However, while this is a proximal fix, I think that the underlying assumption that the C/C++ classes are in many cases supposed to be perfectly memory compatible isn't made very explicit, and it's easy to make a small change which appears to work fine under most circumstances, but will fail unexpectedly when the code suddenly passes through an expected C/C++ conversion code path as in this case. Basically, it seems like without extensive tests or static asserts of some kind, this is a minefield where the slightest change to the members of an HPP class and not it's corresponding C class (or vice versa) creates a potentially difficult to debug problem. Just this week I casually mentioned on another PR that a change to the Device & PhysicalDevice classes should probably be mirrored in the HPPDevice & HPPPhysicalDevice, but I only said it because I thought it might be confusing to have them diverge in this way. I didn't actually realize that such a divergence would actually be a source of bugs via this casting.

I understand the desire to have the framework show usage for both the C and C++ bindings, but I also think that the danger from casually casting between these types is pretty high. I also feel like the framework classes should just exclusively use the C++ bindings to the fullest extent, and delegate to individual samples to demonstrate the equivalent C bindings usage. Maybe there should be a few more examples showing the very basics of device initialization and swapchain creation without relying on the framework at all so that we can have nearly 100% coverage of the C binding usage without having to actually employ it in the framework anywhere.

Fixes #981

General Checklist:

Please ensure the following points are checked:

SaschaWillems commented 8 months ago

Thanks a lot for the PR. Very much appreciated 👍🏻

I totally agree with you that requiring memory layout compatibility is a big no-no and it's bound for trouble. Probably won't take long for someone working on one part of the framework to break things again 😟

Tbh. most people use the C-Interface of Vulkan, so going full C++ is not something I'd like to pursue. We will do that for the framework, but not for the non-hpp samples.

But seeing all the problems we've been encountering recently I'm not sure if that's the right way to go. After all it's the samples people are interested in and not (really) the framework, so we might overthink our decisions in that regard. Maybe should focus our time and effort more on samples instead of trying to artificially "fix" the framework.

jherico commented 8 months ago

Tbh. most people use the C-Interface of Vulkan, so going full C++ is not something I'd like to pursue. We will do that for the framework, but not for the non-hpp samples.

Sorry if I wasn't clear, I'm absolutely not advocating for switching to C++ for the C bindings samples, In fact I think there should be more C bindings samples to cover areas currently not explicitly covered by the samples, if we were to go full C++ for the framework.

SaschaWillems commented 8 months ago

No worries. My comment was aimed at our own direction ;)

gary-sweet commented 8 months ago

I fell foul of the memory-layout compatibility issue way back when reviewing https://github.com/KhronosGroup/Vulkan-Samples/pull/479 (June 2022). I commented at the time that is was an accident waiting to happen. Seems like it's happened, again.

marty-johnson59 commented 8 months ago

merging on 3