Open ChristianReinbold opened 1 year ago
The "Conditional jump or move depends on uninitialised value(s)" report at 0x69931D0 in clang::TypeLoc::getBeginLoc() const (TypeLoc.cpp:195) is also present when compiling to DXIL, so this isn't specific to the SPIR-V backend. I'm not yet sure how to debug it, but I'll move it to general triage for now.
Description
When compiling a simple SPIR-V raygen shader that samples a texture with a Linux build of DXC, Valgrind's memcheck reports several conditional branches that depend on uninitialized memory. Recently, we have regularly hit sporadic dxc crashes on Windows after applying seemingly harmless modifications to our production codebase (e.g. repeating a redundant assignment), and these crashes cannot be reproduced reliably across dev machines. We suspect memory corruption to be the culprit, which is why we started investigating with memcheck. The memcheck issues found in our production codebase match those in the minimal working example described below.
Steps to Reproduce
Assuming you have a valid dxc & valgrind in your path, run
where shader.hlsl is placed in the current working directory and contains
Actual Behavior
Valgrind outputs
My own investigation boils down to two issues:
An uninitialized ID in the source location accessed by TypeLoc::getLocalSourceRange().getBegin(). Unfortunately, the ID seems to come out of some opaque data block accessed via a void pointer. It looks strange to me that, when debugging, the class of the TypeLoc is clang::TypeLoc::Record, but the memory block in which the uninitialized ID resides is allocated when constructing a VarDecl. However, I do not know how clang's custom memory management works, so maybe this mismatch can be explained. If I had to guess, some cast or pointer arithmetic is going wrong. At least I was not able to find any obviously missing member initializations when constructing a VarDecl. I feel blocked on resolving this any further. It would be great if someone with more experience of the dxc and clang codebases could take over. Note that I have not looked into the memory leaks reported by Valgrind. They seem to be related to one-time setup and teardown of global state, and thus should not impact the stability of dxc.
While processing the arguments of the SampleLevel intrinsic, an out-of-bounds access happens at SemaHLSL.cpp:6465:
if (Template[pIntrinsic->pArgs[0].uTemplateId] == AR_TOBJ_OBJECT)
It turns out this has already been fixed on the main branch. It would be great to see this change in the next release.
Environment
DXC debug build with v1.7.2308 sources on RedHat 8.6 (reproduces both issues)
DXC debug build with main (hash ceff9b804) sources on RedHat 8.6 (reproduces only the first issue, not the second)
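For readers unfamiliar with the second issue, here is a minimal sketch of the kind of guard that avoids an out-of-bounds table lookup like the one at SemaHLSL.cpp:6465. Everything in it is an assumption made for illustration: the enum values other than AR_TOBJ_OBJECT, the IntrinsicArg struct, the helper isObjectTemplate, and the explicit table size are hypothetical stand-ins, not the real DXC definitions, and I have not checked how the fix on main is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins; the real types in SemaHLSL.cpp differ.
enum ArTypeObjectKind { AR_TOBJ_UNKNOWN, AR_TOBJ_BASIC, AR_TOBJ_OBJECT };

struct IntrinsicArg {
  std::uint8_t uTemplateId; // index into a template table (assumed layout)
};

// Returns true only if `id` is a valid index into `table` and the entry
// there is AR_TOBJ_OBJECT. The range check runs first, so `table[id]` is
// never evaluated for an index past the end of the table.
inline bool isObjectTemplate(const ArTypeObjectKind *table,
                             std::size_t tableSize, std::size_t id) {
  return id < tableSize && table[id] == AR_TOBJ_OBJECT;
}

// Small self-check covering an in-range hit, an in-range miss,
// and an index that lies outside the table.
inline void checkExamples() {
  const ArTypeObjectKind table[] = {AR_TOBJ_OBJECT, AR_TOBJ_BASIC};
  const std::size_t size = sizeof(table) / sizeof(table[0]);
  const IntrinsicArg inRange{0};
  const IntrinsicArg outOfRange{0xFF};
  assert(isObjectTemplate(table, size, inRange.uTemplateId));
  assert(!isObjectTemplate(table, size, 1));
  assert(!isObjectTemplate(table, size, outOfRange.uTemplateId));
}
```

An unchecked `table[id]` with the out-of-range index would read past the end of the array, which is the class of error memcheck flags for this line in the v1.7.2308 build.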