**Closed** — automaticp closed this 9 months ago
Hold on, I'll try to play with `.gitattributes` to not ruin the language stats for the repo.
Hmm, this is not really what I had in mind. If we are going to copy all the samples verbatim, we should really just link to the glslang repo as a git submodule.
Edit: and have some file which describes which samples in the submodule are the ones we wish to run.
I edited some files to comment out lines that are expected to not compile so that all the code is well-formed. There are some torture tests like numeral.frag that are extremely valuable but won't work without commenting out the failing cases as we don't really have a rigorous way to "expect failure on line 32" but parse the rest anyway.
I see. Are all lines which contain errors marked with a `// ERROR` comment? If so, maybe we could just filter out those lines before passing them to glsl_analyzer?
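If they are, a pre-filter could be as small as this (a sketch, not anything from the PR; blanking the marked lines instead of deleting them keeps the remaining diagnostics' line numbers stable):

```sh
# Blank out every line carrying a "// ERROR" marker, so the file keeps
# its line count but the known-bad constructs are gone.
filter_error_lines() {
    sed '/\/\/ ERROR/s/.*//' "$1"
}

# Usage sketch (--parse-file takes a path, so write a filtered copy first):
#   filter_error_lines samples/numeral.frag > /tmp/filtered.frag
#   glsl_analyzer --parse-file /tmp/filtered.frag
```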
There are also some files that have OpenGL ES specific keywords guarded with a macro like so:

```glsl
#ifdef GL_ES
precision mediump float;
#endif
```
And some files use non-standard literal suffixes without marking them as errors, as in that `lF` case. These would have to be filtered as well.
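For the guarded blocks, a filter could look like this (again just a sketch, assuming the guard starts a line and is not nested):

```sh
# Delete entire "#ifdef GL_ES ... #endif" ranges before parsing.
strip_gles_guards() {
    sed '/^#ifdef GL_ES/,/^#endif/d' "$1"
}
```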
Plus not every glslang error is a parser error, although whether that should be tested is debatable.
Just to clarify: you don't want to copy files verbatim because of repo pollution, or is there some other reason? Because to me, just copying the relevant files and editing/authoring them to suit the needs of glsl_analyzer's parser is much simpler than trying to do the same with scripts.
What I'm thinking is that, ideally, the glsl_analyzer parser should be able to handle whatever GLSL you throw at it (malformed or not), and possibly emit diagnostics if there are errors on a specific line. So, if there is a sample marked with `// ERROR`, the test suite should ignore any errors emitted on that line (false negatives are okay).
This could require some tweaking with the line numbers emitted in diagnostics, as I believe they use the position of the next token.
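That suppression could be sketched roughly like this, assuming diagnostics can be coaxed into simple `line:message` pairs (a made-up format; glsl_analyzer's real output may differ):

```sh
# Drop diagnostics that land on a source line marked "// ERROR"
# (expected failures); report everything else as unexpected.
allow_marked_errors() {
    src="$1"; diags="$2"
    # space-delimited list of 1-based line numbers carrying the marker
    marked=" $(grep -n '// ERROR' "$src" | cut -d: -f1 | tr '\n' ' ') "
    while IFS=: read -r line msg; do
        case "$marked" in
            *" $line "*) ;;                       # expected failure: ignore
            *) printf '%s:%s\n' "$line" "$msg" ;; # unexpected: report
        esac
    done < "$diags"
}
```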
Regarding the `lF` case, do you know if glslang has any mechanisms other than `// ERROR` to determine which lines should fail? Otherwise, maybe we should just accept `lF` too, just so that we match the behaviour of glslang.
> Regarding the `lF` case, do you know if glslang has any mechanisms other than `// ERROR` to determine which lines should fail?
I don't think `// ERROR` is anything more than a comment. They validate their tests by comparing against known output in `baseResults/`.
(Sorry, gotta take a break, I'll respond to the rest a bit later)
> What I'm thinking is that, ideally, the glsl_analyzer parser should be able to handle whatever GLSL you throw at it (malformed or not), and possibly emit diagnostics if there are errors on a specific line.
I've given it some thought and I totally agree with you on that part. The way I see it, the parser testing can be split into two steps:
Step 1:
This is what I'm trying to do here using a shotgun-like approach, where I feed the parser a ton of known-to-be-well-formed code and expect it to not fail/find errors.
Step 2:
Testing how the parser handles malformed code and recovers from errors. The `discard` and `&&` bugs would fail this test, since the parser would incorrectly handle scope after these failures, emitting a bunch of errors in the following code for no reason. Testing this would probably require better feedback from the parser about the error, with token, line, column, and error message information emitted in some structured format (JSON, etc.), so that we could test against expected errors, sort of like glslang does its diff testing against `baseResults/`. This step, I think, is what you meant there by "handling malformed code".
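For what it's worth, a structured diagnostic record of the kind described could look something like this (purely illustrative; every field name here is made up):

```json
{
  "file": "samples/glsl/numeral.frag",
  "line": 32,
  "column": 17,
  "token": "lF",
  "severity": "error",
  "message": "unsupported literal suffix"
}
```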
I do not think that a `// ERROR` comment is a good mechanism for this though, since a comment (or the line it indicates) is not a token or a parse-tree node itself. This is also not where I would use all these glslang samples; I would instead construct samples by hand.
What I'm trying to address here relates only to Step 1, since that's what seems to be asked for in #11:
> There have been a few bad parser bugs recently, which could have been caught if we had more/larger test cases which exercised more parts of the code.
With 8k lines of GLSL to exercise the parser against. And the bugs that were reported were all related to either crashes or the parser giving errors in correct code. From personal experience, I would also say that these kinds of bugs are more critical for UX, since before the fixes the tool just refused to work in files that had either `discard` or `&&` anywhere in them, and crashes are just a big no-no by default.
That said, Step 2 is definitely something that should be done, but only once the parser can give better feedback. Thankfully, these steps can be done independently of each other.
> Otherwise, maybe we should just accept `lF` too, just so that we match the behaviour of glslang.
They seem to be okay with `lF`: no error expected on line 79, which is where the token is. I'd be lenient here too, as well as with `Lf`. Checked myself, and glslang is okay with both.
Yes, I agree on both steps, it just feels iffy to copy such a large amount of code when we aren't sure what the licensing is, hence why I would prefer a submodule. However, I can see why we cannot add glslang as-is due to the use of diffing, as you describe. What if we instead created our own repo containing all the test samples and used that as a submodule? That way other projects could benefit from your work as well!
On a similar note, I have been meaning to do the same with the `spec/` directory.
Regarding finding crashes and similar, here we could probably use some kind of fuzzing.
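Even without proper tooling like AFL or libFuzzer, a crude byte-flipping loop already exercises the "crash on garbage input" class. A sketch (the `glsl_analyzer --parse-file` invocation is taken from this PR's description; everything else is improvised):

```sh
# Mutate one random byte of a seed file per iteration and feed it to the
# parser command; flag any run that dies on a signal (exit status > 128).
#   fuzz seed.frag "glsl_analyzer --parse-file" 1000
fuzz() {
    seed="$1"; parser="$2"; iters="${3:-100}"
    size=$(wc -c < "$seed")
    i=0
    while [ "$i" -lt "$iters" ]; do
        cp "$seed" /tmp/fuzz-case
        # pick a random offset and overwrite one byte there
        off=$(( $(od -An -N2 -tu2 /dev/urandom) % size ))
        dd if=/dev/urandom of=/tmp/fuzz-case bs=1 seek="$off" count=1 \
            conv=notrunc 2>/dev/null
        $parser /tmp/fuzz-case >/dev/null 2>&1
        status=$?
        if [ "$status" -gt 128 ]; then
            echo "crash (signal $((status - 128))) at iteration $i, case kept in /tmp/fuzz-case"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no crashes in $iters iterations"
}
```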
> What if we instead created our own repo containing all the test samples and used that as a submodule?
That sounds pretty good. I wanted to offer this option if it's just copy-pasting that was the issue. Go ahead, make a repo, I'll do a PR.
Here you go: https://github.com/nolanderc/glsl-samples
You can keep the script to run the glsl_analyzer tests in this repo, and then replace the `samples/glsl` folder with the submodule in this PR :slightly_smiling_face:
Also, in that repo, we might want to have one directory for the valid samples, and one for the invalid samples. Or what do you think?
> Here you go: https://github.com/nolanderc/glsl-samples
> You can keep the script to run the glsl_analyzer tests in this repo, and then replace the `samples/glsl` folder with the submodule in this PR 🙂
Thanks, I'll do that.
> Also, in that repo, we might want to have one directory for the valid samples, and one for the invalid samples. Or what do you think?
Total agreement :+1:, was thinking the same.
Should I keep it in `samples/`?

```
samples/
├── glsl-samples/
├── run-parser-tests.sh
└── lsp/
```

Or make a separate `tests/` directory?

```
tests/
├── glsl-samples/
└── run-parser-tests.sh
samples/
└── lsp/
```
I'm not sure if we are going to use the samples for anything but tests, so maybe just create a `tests/` directory. (You can remove the `samples/` directory entirely; `lsp/` is never used.)
@nolanderc, see if you can accept nolanderc/glsl-samples#1, so that I can include the submodule version with the samples already present in this one.
I'm not super comfortable with submodules, so please correct me if I'm doing something wrong. It's a bit of a headscratcher.
I’ll try it out when I get back home!
btw, the script seems to be missing from this PR
> btw, the script seems to be missing from this PR
Yet to add it back after a reset, will do some checks and then plug it back.
Alright, that seems to be all, will wait for a review.
Just tried it out, seems to work really well! This should help a lot going forward, thank you very much :smile:
Adds a bunch of samples for testing the parser from KhronosGroup/glslang and a small bash script for running the tests on a directory.
Depends on the `--parse-file` CLI option. The script returns a non-zero exit code if any of the tests fail, so it should be possible to automate it further.
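The core of such a runner could be as simple as the following (a sketch under assumed file extensions and layout; the actual script in the PR may differ):

```sh
# Run the parser over every GLSL sample under a directory and fail if any
# file fails to parse. PARSER defaults to the CLI described above.
#   run_parser_tests tests/glsl-samples
run_parser_tests() {
    dir="$1"; parser="${2:-glsl_analyzer --parse-file}"
    fail=0
    for f in $(find "$dir" -name '*.frag' -o -name '*.vert' -o -name '*.comp'); do
        if $parser "$f" >/dev/null 2>&1; then
            echo "OK:   $f"
        else
            echo "FAIL: $f"
            fail=1
        fi
    done
    return "$fail"   # non-zero if anything failed, so CI can gate on it
}
```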
Should help resolve #11.