KhronosGroup / Vulkan-ValidationLayers

Vulkan Validation Layers (VVL)
https://vulkan.lunarg.com/doc/sdk/latest/linux/khronos_validation_layer.html
Other
747 stars 400 forks source link

SIGSEGV in debug_printf::CommandBuffer::PostProcess when output is truncated #8411

Closed Smjert closed 3 days ago

Smjert commented 3 weeks ago

Environment:

Describe the Issue

I'm using debugPrintfEXT("abcdef"); in a compute shader and executing my application with the env var VK_LAYER_PRINTF_BUFFER_SIZE set to 1000000. My application receives some instances of the string and then it crashes when it tells me

Validation Warning: [ WARNING-DEBUG-PRINTF ] | MessageID = 0x76589099 | vkQueueSubmit():  WARNING - Debug Printf message was truncated, likely due to a buffer size that was too small for the message

Expected behavior

No crash after warning of message truncation

Additional context

Compute shader I'm using + GDB stack trace + ASAN stack trace Compute shader code: ```sh #version 450 #extension GL_EXT_debug_printf : enable layout (local_size_x = 16, local_size_y = 16) in; layout(rgba8,set = 0, binding = 0) uniform image2D image; // License Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. //push constants block layout( push_constant ) uniform constants { vec4 data1; vec4 data2; vec4 data3; vec4 data4; } PushConstants; // Return random noise in the range [0.0, 1.0], as a function of x. float Noise2d( in vec2 x ) { float xhash = cos( x.x * 37.0 ); float yhash = cos( x.y * 57.0 ); return fract( 415.92653 * ( xhash + yhash ) ); } // Convert Noise2d() into a "star field" by stomping everthing below fThreshhold to zero. float NoisyStarField( in vec2 vSamplePos, float fThreshhold ) { float StarVal = Noise2d( vSamplePos ); if ( StarVal >= fThreshhold ) StarVal = pow( (StarVal - fThreshhold)/(1.0 - fThreshhold), 6.0 ); else StarVal = 0.0; debugPrintfEXT("abcdef"); return StarVal; } // Stabilize NoisyStarField() by only sampling at integer values. float StableStarField( in vec2 vSamplePos, float fThreshhold ) { // Linear interpolation between four samples. // Note: This approach has some visual artifacts. // There must be a better way to "anti alias" the star field. float fractX = fract( vSamplePos.x ); float fractY = fract( vSamplePos.y ); vec2 floorSample = floor( vSamplePos ); float v1 = NoisyStarField( floorSample, fThreshhold ); float v2 = NoisyStarField( floorSample + vec2( 0.0, 1.0 ), fThreshhold ); float v3 = NoisyStarField( floorSample + vec2( 1.0, 0.0 ), fThreshhold ); float v4 = NoisyStarField( floorSample + vec2( 1.0, 1.0 ), fThreshhold ); float StarVal = v1 * ( 1.0 - fractX ) * ( 1.0 - fractY ) + v2 * ( 1.0 - fractX ) * fractY + v3 * fractX * ( 1.0 - fractY ) + v4 * fractX * fractY; return StarVal; } void mainImage( out vec4 fragColor, in vec2 fragCoord ) { vec2 iResolution = imageSize(image); // Sky Background Color //vec3 vColor = vec3( 0.1, 0.2, 0.4 ) * fragCoord.y / iResolution.y; vec3 vColor = PushConstants.data1.xyz * fragCoord.y / iResolution.y; // Note: Choose fThreshhold in the range [0.99, 0.9999]. // Higher values (i.e., closer to one) yield a sparser starfield. float StarFieldThreshhold = PushConstants.data1.w;//0.97; // Stars with a slow crawl. float xRate = 0.2; float yRate = -0.06; vec2 vSamplePos = fragCoord.xy + vec2( xRate * float( 1 ), yRate * float( 1 ) ); float StarVal = StableStarField( vSamplePos, StarFieldThreshhold ); vColor += vec3( StarVal ); fragColor = vec4(vColor, 1.0); } void main() { vec4 value = vec4(0.0, 0.0, 0.0, 1.0); ivec2 texelCoord = ivec2(gl_GlobalInvocationID.xy); ivec2 size = imageSize(image); if(texelCoord.x < size.x && texelCoord.y < size.y) { vec4 color; mainImage(color,texelCoord); imageStore(image, texelCoord, color); } } ``` GDB stack trace ``` #0 0x00007aeff15b5080 in ?? () from /usr/lib/libc.so.6 #1 0x00007aefeae37850 in debug_printf::CommandBuffer::PostProcess (this=0x5766618037d0, queue=0x57665ef253e0, loc=...) at /usr/src/debug/vulkan-validation-layers/Vulkan-ValidationLayers-vulkan-sdk-1.3.275/layers/gpu_validation/debug_printf.cpp:444 #2 0x00007aefeae672a3 in gpu_tracker::Queue::Retire (this=0x57666165fc70, submission=...) at /usr/src/debug/vulkan-validation-layers/Vulkan-ValidationLayers-vulkan-sdk-1.3.275/layers/gpu_validation/gpu_state_tracker.cpp:603 #3 0x00007aefeaf12ce3 in vvl::Queue::ThreadFunc (this=0x57666165fc70) at /usr/src/debug/vulkan-validation-layers/Vulkan-ValidationLayers-vulkan-sdk-1.3.275/layers/state_tracker/queue_state.cpp:228 #4 0x00007aefe90e1cf4 in std::execute_native_thread_routine (__p=0x57666186c040) at /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:104 #5 0x00007aeff14dc39d in ?? () from /usr/lib/libc.so.6 #6 0x00007aeff156149c in ?? () from /usr/lib/libc.so.6 ``` ASAN stack trace: ``` ================================================================= ==147075==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7384b5c40000 at pc 0x5bc91110c0dc bp 0x7384b7adeb10 sp 0x7384b7ade2d0 WRITE of size 50331656 at 0x7384b5c40000 thread T2 #0 0x5bc91110c0db in memset (/home/smjert/Development/games/svk-engine/build/svk-engine+0x1430db) (BuildId: 94854b000f2ba757ac9f44388197857638600933) #1 0x7384bf03784f in debug_printf::CommandBuffer::PostProcess(VkQueue_T*, Location const&) /usr/src/debug/vulkan-validation-layers/Vulkan-ValidationLayers-vulkan-sdk-1.3.275/layers/gpu_validation/debug_printf.cpp:444:57 #2 0x7384bf0672a2 in gpu_tracker::Queue::Retire(vvl::QueueSubmission&) /usr/src/debug/vulkan-validation-layers/Vulkan-ValidationLayers-vulkan-sdk-1.3.275/layers/gpu_validation/gpu_state_tracker.cpp:603:36 #3 0x7384bf112ce2 in vvl::Queue::ThreadFunc() /usr/src/debug/vulkan-validation-layers/Vulkan-ValidationLayers-vulkan-sdk-1.3.275/layers/state_tracker/queue_state.cpp:228:15 #4 0x7384ca4e1cf3 in execute_native_thread_routine /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:104:18 #5 0x5bc91104b46d in asan_thread_start(void*) asan_interceptors.cpp.o #6 0x7384d8ab139c in start_thread /usr/src/debug/glibc/glibc/nptl/pthread_create.c:447:8 #7 0x7384d8b3649b in __GI___clone3 /usr/src/debug/glibc/glibc/misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:78 0x7384b5c40000 is located 6144 bytes before 454656-byte region [0x7384b5c41800,0x7384b5cb0800) allocated by thread T0 here: #0 0x5bc91110f2f9 in calloc (/home/smjert/Development/games/svk-engine/build/svk-engine+0x1462f9) (BuildId: 94854b000f2ba757ac9f44388197857638600933) #1 0x7384d3be8abd (/usr/lib/libnvidia-glcore.so.550.90.07+0x9e8abd) (BuildId: 9d30aa4c6f2a8da11f1ae2fe6d322d76dbc669f3) Thread T2 created by T0 here: #0 0x5bc911106b93 in pthread_create (/home/smjert/Development/games/svk-engine/build/svk-engine+0x13db93) (BuildId: 94854b000f2ba757ac9f44388197857638600933) #1 0x7384ca4e1df1 in __gthread_create /usr/src/debug/gcc/gcc-build/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:676:35 #2 0x7384ca4e1df1 in std::thread::_M_start_thread(std::unique_ptr>, void (*)()) /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:172:37 #3 0x7384bf112b64 in std::thread::thread(void (vvl::Queue::*&&)(), vvl::Queue*&&) /usr/include/c++/13.2.1/bits/std_thread.h:164:17 #4 0x7384bf112b64 in std::__detail::_MakeUniq::__single_object std::make_unique(void (vvl::Queue::*&&)(), vvl::Queue*&&) /usr/include/c++/13.2.1/bits/unique_ptr.h:1070:30 #5 0x7384bf112b64 in vvl::Queue::PreSubmit(std::vector>&&) /usr/src/debug/vulkan-validation-layers/Vulkan-ValidationLayers-vulkan-sdk-1.3.275/layers/state_tracker/queue_state.cpp:89:56 SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/smjert/Development/games/svk-engine/build/svk-engine+0x1430db) (BuildId: 94854b000f2ba757ac9f44388197857638600933) in memset Shadow bytes around the buggy address: 0x7384b5c3fd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x7384b5c3fe00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x7384b5c3fe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x7384b5c3ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0x7384b5c3ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 =>0x7384b5c40000:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x7384b5c40080: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x7384b5c40100: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x7384b5c40180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x7384b5c40200: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x7384b5c40280: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb ==147075==ABORTING ```
Smjert commented 3 weeks ago

I did some additional debugging, compiling the validation layers by myself in debug mode. The crash happens on this memset: https://github.com/KhronosGroup/Vulkan-ValidationLayers/blob/780c65337e111c7385109c7b720d757a778e4fe2/layers/gpu_validation/debug_printf.cpp#L413

I've stepped through the loop that processes the output records and I've done the following observations:

  1. Since the start the expected record count is very high, 0xC00000 precisely
  2. The record index (the index variable) goes up to 254 before getting the truncation error
  3. The record size in OutputRecord is always 4 (words)

But most of all, it seems to me that the memset assumes whatever the GPU says it sent, as a record count, is correct, so uses that to clear out the buffer, even if the allocated buffer is actually smaller, or the parsed count of records is less than what's reported.

Shouldn't it at least be clamped to the maximum allocated buffer size? And then maybe use the parsed count, instead of the expected count, to calculate the memset size?

spencer-lunarg commented 3 weeks ago

so the DebugPrintf code was reworked in the last SDK to not need SPIRV-Tools as a dependency, and looking at this, this seems like it wouldn't be an issue anymore

Can you please grab the latest 1.3.290 SDK version of Validation Layers and confirm if this is still an issue?

Smjert commented 3 weeks ago

so the DebugPrintf code was reworked in the last SDK to not need SPIRV-Tools as a dependency, and looking at this, this seems like it wouldn't be an issue anymore

Can you please grab the latest 1.3.290 SDK version of Validation Layers and confirm if this is still an issue?

Tested now with that version and there's still the same problem. I had to decrease a bit the print buffer size set via VK_LAYER_PRINTF_BUFFER_SIZE to get the truncation message again though. The reported record count, as read from the debug_output_buffer is still 0xC00000.

Overall I see that 1.3.290 and main both still use the same logic for that memset, trusting what the buffer says.

spencer-lunarg commented 3 weeks ago

ok, thanks for confirming it is still an issue with the latest code, I should have time this week to look more into it when I am touching other DebugPrintf stuff

spencer-lunarg commented 6 days ago

@Smjert thanks again for raising this (and doing most of the debugging for me :laughing:)

I wrote a test to reproduce and clamped the memset as it was a clear bug there