KhronosGroup / glslang

Khronos-reference front end for GLSL/ESSL, partial front end for HLSL, and a SPIR-V generator.
Other
2.9k stars 815 forks source link

error in combination with preprocessor directives can obfuscate error location. #3544

Open phr34k opened 3 months ago

phr34k commented 3 months ago

I'd like to address an issue with error handeling specifically in combination with common preprocessor directives like #include #line. I'll give you a short summary of the problem and how these system interact, and the situation of the problem.

layout( location = 0 ) out vec4 color1
#line 500
layout( location = 0 ) out vec4 color2:

Results in an "ERROR: frag.glsl:500: unexpected LAYOUT, expecting COMMA or SEMICOLON.". Strictly spoken the error message is correct. However it highlight an important issue, preprocessor directives can cause unterminated statements easily transfer accross natural file boundaries (e.g. #include directives) which makes these source of these statements near impossible to locate, especially in larger code bases where reusabiliity is imminent this class of errors are easilly obfuscated.

layout( location = 0 ) out vec4 color1
layout( location = 0 ) out vec4 color2:
#include "a"
#include "b"

In the above scenario, the origional statement occurs in "a" but the factual error is only detected in "b". I'd propose that unterminated statements could report the location where the statement begins, and the specific token or (sub-)statement location reported as detail(s).

I have some past expierence in compilers and language(s) generation but haven't used yacc before so I'd be happy just get a few pointers. I believe this error is presently automatically generated from the grammer (glslang.y) and isn't explictly generated/handled inline. I suspect the location (simple_statement) needs to propegate so that they can be used during error handling.

arcady-lunarg commented 3 months ago

Yeah, I think ultimately the way to deal with this is to keep track of the start of the unterminated statement. Part of the issue is that #include processing happens in the tokenizer, at a level below that of the glslang.y bison parser. Error reporting is definitely an area where glslang could use considerable improvement, patches are definitely welcome.

phr34k commented 3 months ago

@arcady-lunarg good to hear you concur with my line of thought. Do you have any highlevel pointers in terms of 'glslang.y', it's hard to understand thirdparty code at face value, so I'm wondering what generally would work best.

What I had in mind:

Would this be a solid approach, keeping in mind, I'd like to refrain from seriously affecting other errors. Do you have an alternative/better suggestion?

arcady-lunarg commented 3 months ago

I recommend reading up on how bison generated grammars work, but the key glslang-specific pieces you want to look at here is the TParseContext class, and the interm struct in glslang.y, and in particular its loc member of type TSourceLoc.

phr34k commented 3 months ago

@arcady-lunarg I have prepared a pull request of the initial changes I drafted, it'd be great if you could have a look to see if you had any comments on it: https://github.com/KhronosGroup/glslang/pull/3559