KhronosGroup / Vulkan-Samples

One stop solution for all Vulkan samples
Apache License 2.0

Invalid VkImageCreateInfo->sharingMode validation layer error in Release builds of msaa/16bit_arithmetic samples #1207

Open JamesRumble-IMG opened 6 days ago

JamesRumble-IMG commented 6 days ago

I see the following validation layer error when running Release builds of the msaa/16bit_arithmetic samples when the samples are built using -DCMAKE_BUILD_TYPE=RelWithDebInfo -DVKB_VALIDATION_LAYERS=ON.

[error] 81114962 - VUID-VkImageCreateInfo-sharingMode-parameter: Validation Error: [ VUID-VkImageCreateInfo-sharingMode-parameter ] | MessageID = 0x4d5b752 | vkCreateImage(): pCreateInfo->sharingMode (23330) does not fall within the begin..end range of the VkSharingMode enumeration tokens and is not an extension added token. The Vulkan spec states: sharingMode must be a valid VkSharingMode value (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-VkImageCreateInfo-sharingMode-parameter)

This can be fixed by conditionally setting sharingMode in https://github.com/KhronosGroup/Vulkan-Samples/blob/f7e97a19378255e01b51acc211bf6b31c849a34c/framework/core/image.h#L120 i.e. create_info.sharingMode = create_info.queueFamilyIndexCount != 0 ? VK_SHARING_MODE_CONCURRENT : VK_SHARING_MODE_EXCLUSIVE;. I didn't raise a PR as I'm not particularly familiar with this codebase and I wasn't sure whether there was a reason this wasn't done originally.
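A minimal sketch of that suggested change, using stand-in types so it compiles without the Vulkan headers. `SharingMode`, `ImageCreateInfo`, and `apply_implicit_sharing_mode` are hypothetical names for illustration only; the real framework operates on `VkImageCreateInfo` and may structure this differently.

```cpp
#include <cstdint>

// Stand-ins for VkSharingMode / VkImageCreateInfo (hypothetical, for illustration only).
enum class SharingMode : std::uint32_t
{
    Exclusive  = 0,
    Concurrent = 1
};

struct ImageCreateInfo
{
    SharingMode          sharing_mode{};
    std::uint32_t        queue_family_index_count = 0;
    const std::uint32_t *queue_family_indices     = nullptr;
};

// Writes a valid enumerator on both branches, so sharing_mode can never
// keep whatever stale value it held before this call.
inline void apply_implicit_sharing_mode(ImageCreateInfo &info)
{
    info.sharing_mode = (info.queue_family_index_count != 0) ? SharingMode::Concurrent
                                                             : SharingMode::Exclusive;
}
```

The point of assigning on both branches is that the result no longer depends on how the struct was initialised earlier.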

Vulkan-Samples built with gcc (Ubuntu 13.2.0-23ubuntu4) 13.2.0 on Ubuntu 24.04.

Reproduced on: driverID = DRIVER_ID_MESA_LLVMPIPE driverName = llvmpipe driverInfo = Mesa 24.3~git2410240600.c0bcea~oibaf~n (git-c0bceaf 2024-10-24 noble-oibaf-ppa) (LLVM 18.1.8)

Tested on commit f7e97a19378255e01b51acc211bf6b31c849a34c.

SaschaWillems commented 6 days ago

This looks indeed like an error in the framework. I can't find the place where the create info is properly zero-initialized. Sadly it's heavily abstracted, so someone with a deeper knowledge of that builder class should take a look. @asuessenbach could you take a look?

asuessenbach commented 3 days ago

I can't reproduce this issue. And the sharingMode should in fact be initialized to vk::SharingMode::Exclusive (or VK_SHARING_MODE_EXCLUSIVE)

@JamesRumble-IMG Could you please verify, that it has some defined value in the HPPImageBuilder and the ImageBuilder constructors (hpp_image.h, line 45, image.h, line 48)?

JamesRumble-IMG commented 2 days ago

Hi, odd, it's readily reproducible for me.

I've included some output from gdb showing the issue below. Regarding HPPImageBuilder and hpp_image: the image being created in this demo is a vkb::core::Image, and HPPImageBuilder doesn't seem to be involved in its creation as far as I can tell. Is that expected?

vkb::core::ImageBuilder::ImageBuilder (extent=..., this=0x7fffffffce60) at Vulkan-Samples/framework/core/image.h:48
(gdb) p create_info
$5 = (VkImageCreateInfo &) @0x7fffffffce90: {sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO, pNext = 0x0, flags = 3914369648, imageType = 32767, format = 4294967008, extent = {width = 4294967295, height = 1449383056, depth = 21845}, mipLevels = 1, arrayLayers = 0, samples = 4294954608, tiling = 32767, usage = 4153073054, sharingMode = 32767, queueFamilyIndexCount = 4294954576, pQueueFamilyIndices = 0x7ffff78ec96d <__GI___clock_gettime+29>, initialLayout = VK_IMAGE_LAYOUT_UNDEFINED}

(gdb) backtrace

#0 vkb::core::ImageBuilder::ImageBuilder (extent=..., this=0x7fffffffce60) at Vulkan-Samples/framework/core/image.h:48
#1 vkb::core::Image::Image (this=this@entry=0x55555655f680, device=..., extent=..., format=format@entry=VK_FORMAT_R16G16B16A16_SFLOAT, image_usage=image_usage@entry=12, memory_usage=memory_usage@entry=VMA_MEMORY_USAGE_GPU_ONLY, sample_count=VK_SAMPLE_COUNT_1_BIT, mip_levels=1, array_layers=1, tiling=VK_IMAGE_TILING_OPTIMAL, flags=0, num_queue_families=0, queue_families=0x0) at Vulkan-Samples/framework/core/image_core.cpp:95
#2 0x0000555555dd6cfa in std::make_unique<vkb::core::Image, vkb::Device&, VkExtent3D, VkFormat, int, VmaMemoryUsage> () at /usr/include/c++/13/bits/unique_ptr.h:1070
#3 KHR16BitArithmeticSample::prepare (this=, options=...) at Vulkan-Samples/samples/performance/16bit_arithmetic/16bit_arithmetic.cpp:126
#4 0x00005555558f953e in vkb::Platform::start_app (this=this@entry=0x7fffffffd5b0) at /usr/include/c++/13/bits/unique_ptr.h:199
#5 0x00005555558f9828 in vkb::Platform::main_loop_frame (this=this@entry=0x7fffffffd5b0) at Vulkan-Samples/framework/platform/platform.cpp:129
#6 0x00005555558f98b1 in vkb::Platform::main_loop (this=this@entry=0x7fffffffd5b0) at Vulkan-Samples/framework/platform/platform.cpp:186
#7 0x00005555558f18de in platform_main (context=...) at Vulkan-Samples/app/main.cpp:67
#8 0x00005555558f2093 in main (argc=, argv=) at /usr/include/c++/13/bits/unique_ptr.h:199

And another snippet:

vkb::core::ImageBuilder::with_implicit_sharing_mode (this=0x7fffffffce60) at Vulkan-Samples/framework/core/image.h:118
(gdb) p create_info
$2 = (VkImageCreateInfo &) @0x7fffffffce90: {sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO, pNext = 0x0, flags = 0, imageType = VK_IMAGE_TYPE_2D, format = VK_FORMAT_R16G16B16A16_SFLOAT, extent = {width = 1024, height = 1024, depth = 1}, mipLevels = 1, arrayLayers = 1, samples = VK_SAMPLE_COUNT_1_BIT, tiling = VK_IMAGE_TILING_OPTIMAL, usage = 12, sharingMode = 32767, queueFamilyIndexCount = 0, pQueueFamilyIndices = 0x0, initialLayout = VK_IMAGE_LAYOUT_UNDEFINED}

asuessenbach commented 2 days ago

That's strange! Let's look one level deeper: Could you please verify what you have in the BuilderBase constructor, in framework/builder_base.h, line 75? What values has the constructor argument create_info, and what has this->create_info?

Regarding HPPImageBuilder and hpp_image: those are used by some framework images (3 x depth_image, font_image).

JamesRumble-IMG commented 2 days ago

With an optimised build it's fairly difficult to get the value of this->create_info on line 75 of framework/builder_base.h. The constructor argument create_info looks fine though, I think.

(gdb) p create_info
$6 = (const VkImageCreateInfo &) @0x7fffffffce00: {sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO, pNext = 0x0, flags = 0, imageType = VK_IMAGE_TYPE_1D, format = VK_FORMAT_UNDEFINED, extent = {width = 0, height = 0, depth = 0}, mipLevels = 0, arrayLayers = 0, samples = 0, tiling = VK_IMAGE_TILING_OPTIMAL, usage = 0, sharingMode = VK_SHARING_MODE_EXCLUSIVE, queueFamilyIndexCount = 0, pQueueFamilyIndices = 0x0, initialLayout = VK_IMAGE_LAYOUT_UNDEFINED}

asuessenbach commented 2 days ago

Well, if the constructor argument looks fine, either the copy operation of that argument fails, or something else changes some values later on. So maybe you could, for example, add some output to that function, telling us what this->create_info (or at least this->create_info.sharingMode) looks like. For example: std::cout << __FUNCTION__ << " : this->create_info.sharingMode = " << vk::to_string( this->create_info.sharingMode ) << std::endl; (this would require #include <iostream>)

zhangyiwei commented 2 days ago

I ran into the same VVL violation with descriptor_management sample, and the issue is with the implicit sharing mode init. This would fix it: https://github.com/KhronosGroup/Vulkan-Samples/pull/1211

JamesRumble-IMG commented 2 days ago

So maybe you could, for example, add some output to that function, telling us what this->create_info (or at least this->create_info.sharingMode) looks like. For example: std::cout << __FUNCTION__ << " : this->create_info.sharingMode = " << vk::to_string( this->create_info.sharingMode ) << std::endl; (this would require #include <iostream>)

Unfortunately the logs look fine ("BuilderBase : this->create_info.sharingMode = Exclusive")

Well, if the constructor argument looks fine, either the copy operation of that argument fails, or something else changes some values later on.

I had another look though, and the state of create_info and this->create_info looks wrong at Vulkan-Samples/framework/core/image.h:48. This is at a point where the code expects them to be zero-initialised (other than sType, which is set a couple of lines above).

vkb::core::ImageBuilder::ImageBuilder (extent=..., this=0x7fffffffcd60) at Vulkan-Samples/framework/core/image.h:48
48 create_info.extent = extent;
(gdb) p create_info
$3 = (VkImageCreateInfo &) @0x7fffffffcd90: {sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO, pNext = 0x0, flags = 4294954320, imageType = 32767, format = 1486269152, extent = {width = 21845, height = 1450658992, depth = 21845}, mipLevels = 5, arrayLayers = 21845, samples = 1448230528, tiling = 21845, usage = 0, sharingMode = VK_SHARING_MODE_CONCURRENT, queueFamilyIndexCount = 1450614560, pQueueFamilyIndices = 0xffffffff00000000, initialLayout = VK_IMAGE_LAYOUT_UNDEFINED}
(gdb) p this->create_info
$4 = {static allowDuplicate = false, static structureType = vk::StructureType::eImageCreateInfo, sType = vk::StructureType::eImageCreateInfo, pNext = 0x0, flags = {m_mask = 4294954320}, imageType = (vk::ImageType::e2D | vk::ImageType::e3D | unknown: 0x7ffc), format = 1486269152, extent = {width = 21845, height = 1450658992, depth = 21845}, mipLevels = 5, arrayLayers = 21845, samples = (unknown: 0x56523e80), tiling = 21845, usage = {m_mask = 0}, sharingMode = vk::SharingMode::eConcurrent, queueFamilyIndexCount = 1450614560, pQueueFamilyIndices = 0xffffffff00000000, initialLayout = vk::ImageLayout::eUndefined}

I hacked in create_info = {VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO, nullptr}; on Vulkan-Samples/framework/core/image.h:48 after getting the create_info and the warning goes away.

It seems like there may be some undefined behaviour? If I had to guess, the various reinterpret_casts between types seem dubious to me.
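To illustrate the kind of pattern being suspected here: accessing an object through a reference to an unrelated class type obtained via reinterpret_cast is formally undefined behaviour under the C++ aliasing rules, even when both types have the same layout, and an optimising compiler is allowed to reorder or drop stores on that basis. The sketch below shows a copy-based conversion that sidesteps the problem. `CInfo`, `CppInfo`, and `to_c` are hypothetical stand-ins, not the framework's types, and this is not a claim about where exactly the framework goes wrong.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-ins for a C struct and a layout-identical C++ wrapper.
struct CInfo
{
    std::uint32_t s_type;
    std::uint32_t sharing_mode;
};

struct CppInfo
{
    std::uint32_t s_type       = 14;        // arbitrary tag value for this example
    std::uint32_t sharing_mode = 0;
};

static_assert(sizeof(CInfo) == sizeof(CppInfo), "layouts must match");
static_assert(std::is_trivially_copyable_v<CInfo> && std::is_trivially_copyable_v<CppInfo>,
              "memcpy is only valid for trivially copyable types");

// Risky: reinterpret_cast<CInfo &>(cpp) followed by reads or writes through it,
// because CInfo and CppInfo are unrelated types as far as the compiler is concerned.

// Safer: copy the object representation. Compilers typically optimise the memcpy away.
inline CInfo to_c(const CppInfo &cpp)
{
    CInfo c;
    std::memcpy(&c, &cpp, sizeof(c));
    return c;
}
```

With C++20, `std::bit_cast<CInfo>(cpp)` from `<bit>` expresses the same copy more directly.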

@asuessenbach have you had any luck trying to reproduce this on your end?

asuessenbach commented 2 days ago

have you had any luck trying to reproduce this on your end?

No, I can't reproduce this behaviour. If the BuilderBase constructor looks fine, that is the Parent's initialization on line 45 in image.h, but on line 47 it looks wrong, where the heck does that come from??

Would you please use -fno-strict-aliasing with your build? Does that change anything?

JamesRumble-IMG commented 2 days ago

Would you please use -fno-strict-aliasing with your build? Does that change anything?

The validation layer issue goes away for me after compiling with -fno-strict-aliasing. We've hit these kinds of issues before. They're never fun :(.

asuessenbach commented 1 day ago

And with MSVC, you don't see anything like that, because it's no-strict_aliasing by default and there's not even a switch for it.

Are you ok with this setting?

JamesRumble-IMG commented 1 day ago

Are you ok with this setting?

Sure, that's fine with me in that it "fixes" the issue. Are you suggesting using that by default in the samples? Using -fno-strict-aliasing does, however, mean forgoing some potential optimisations, and it's generally working around a fundamental issue.

asuessenbach commented 1 day ago

Yes, I think, at least for now, -fno-strict-aliasing should be used by default here.

SaschaWillems commented 1 day ago

I think we should enable it to be on the safe side. Android builds use clang, and I'm not sure what's the default there. Not sure about Apple platforms though.

zhangyiwei commented 1 day ago

Since the samples are C++ code, wouldn't it be safer to memset at the tail of the builder class to ensure it is zero-initialized? Correctness is more important than the tiny perf difference for the samples, I bet ;)
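A minimal sketch of that idea, assuming a plain C-style create-info struct. `ImageInfo`, `make_image_info`, and `reset_image_info` are hypothetical stand-ins for illustration, not framework code. It shows both an explicit memset and value-initialisation with `{}`, which zeroes every member of such an aggregate without a separate call.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical C-style struct standing in for VkImageCreateInfo.
struct ImageInfo
{
    std::uint32_t s_type;
    const void   *p_next;
    std::uint32_t sharing_mode;
    std::uint32_t queue_family_index_count;
};

// Value-initialisation: every member starts at zero / nullptr.
inline ImageInfo make_image_info(std::uint32_t s_type)
{
    ImageInfo info{};
    info.s_type = s_type;        // then set only the fields that must be non-zero
    return info;
}

// memset variant: clears an existing object, then restores the type tag.
// Fine here because ImageInfo is a trivially copyable C-style struct.
inline void reset_image_info(ImageInfo &info, std::uint32_t s_type)
{
    std::memset(&info, 0, sizeof(info));
    info.s_type = s_type;
}
```

Either variant guarantees that a field such as sharing_mode starts at 0 rather than at whatever happened to be on the stack.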