Closed bwlodarcz closed 6 months ago
@asudarsa This is a fix for memory issue which arises from major architectural flaw. The source of bug basically looks like that: I can create DebugInfo by hand in whatever standard and as Scope operation point to something unrelated. That is invalid SPIR-V from spec PoV. DebugInfo doesn't perform any validating checks, nor Parser. In addition to that, LIT testing infrastructure (regex with some anemic macro capabilities) isn't really created to performing such validation checks - it's an backend testing infrastructure (step before asm). It doesn't have enough capabilities. There are LIT tests in Translator which covers DebugTypeInheritance functionality as is. Without massive reworking DebugInfo parts in Translator covering that kind of bugs is impossible.
@asudarsa This is a fix for memory issue which arises from major architectural flaw. The source of bug basically looks like that: I can create DebugInfo by hand in whatever standard and as Scope operation point to something unrelated. That is invalid SPIR-V from spec PoV. DebugInfo doesn't perform any validating checks, nor Parser. In addition to that, LIT testing infrastructure (regex with some anemic macro capabilities) isn't really created to performing such validation checks - it's an backend testing infrastructure (step before asm). It doesn't have enough capabilities. There are LIT tests in Translator which covers DebugTypeInheritance functionality as is. Without massive reworking DebugInfo parts in Translator covering that kind of bugs is impossible.
I thought this was fixing a bug that was reported by user. May be not. My two cents: If a known issue is being resolved , it is a good idea to have an associated LIT test so that we can catch any regressions. If the test creation is going to be time consuming, I am ok with merging this, but adding the test at a later stage. Thanks
@asudarsa This is a fix for memory issue which arises from major architectural flaw. The source of bug basically looks like that: I can create DebugInfo by hand in whatever standard and as Scope operation point to something unrelated. That is invalid SPIR-V from spec PoV. DebugInfo doesn't perform any validating checks, nor Parser. In addition to that, LIT testing infrastructure (regex with some anemic macro capabilities) isn't really created to performing such validation checks - it's an backend testing infrastructure (step before asm). It doesn't have enough capabilities. There are LIT tests in Translator which covers DebugTypeInheritance functionality as is. Without massive reworking DebugInfo parts in Translator covering that kind of bugs is impossible.
I thought this was fixing a bug that was reported by user. May be not. My two cents: If a known issue is being resolved , it is a good idea to have an associated LIT test so that we can catch any regressions. If the test creation is going to be time consuming, I am ok with merging this, but adding the test at a later stage. Thanks
@asudarsa Yes, it's a bug that was reported by user. The underlying issue is not resolved. The origin of the bug is flawed and insecure architecture of the DebugInfo translation. The concrete instance of bug reported by user was solved in this PR. Origin of that concrete instance of bug was missed ParentIdx for DebugTypeInheritance but the main source of the bug is in architecture. In DebugInfo most of the fields have Scope/ParentIdx due to equivalence to DWARF format. There aren't any validating routines or passes DebugInfo. That means that any debug instruction can assign as a parent/scope any other (even completely unrelated instruction from execution) as it's parent with every consequence of that fact (including memory issues) - and translator isn't aware of that. In addition to that even if I prepare test for that concrete bug how do we catch failing memory issue when sanitizers aren't enabled in CI by default? In that light I don't see any sensile way to test (because what will we be testing and how?).
I think we can revisit the testing issue when we have sanitizer testing in place. Approving this. Thanks @bwlodarcz for your detailed answers.
Sincerely
Fixed ParentIdx was mismatched for DebugTypeInheritance type in context of NonSemantic.Shader.DebugInfo.100. That lead to heap buffer overflow in SPIRVExtInst::getExtOp getter when instruction was incorrectly casted to SPIRVExtInst as parent of the culprit instruction in SPIRVToLLVMDbgTran::getDIBuilder. The missmatch happen because in previous used standard OpenCL.DebugInfo.100 DebugTypeInheritance had Child field as zero indexed argument. In newer standard that field is removed.