nolanderc / glsl_analyzer

Language server for GLSL (autocomplete, goto-definition, formatter, and more)
GNU General Public License v3.0

Add hover and completion tests #45

Closed · automaticp closed this 8 months ago

automaticp commented 8 months ago

This PR implements a mini testing framework for hover and autocomplete for glsl_analyzer. It is based on the pytest and pytest-lsp packages and should help resolve #26.

@nolanderc, please review this one. Some points about the design:

Start with lsp_testing_input.py; it shows how you define tests in terms of expectations. Take note of how ExpectFail and TokenKind are used to control the flow of the tests. Then move on to testing_utils.py to see the primitives used for wrapping data and communicating testing conditions. Lastly, test_lsp.py contains the testing logic itself, arguably the ugliest part of this whole thing, but it only has to be written once per request type/token combination.
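To give a flavour of what "defining tests in terms of expectations" means, here is a purely illustrative sketch; the real entries in lsp_testing_input.py are shaped differently, and everything below apart from the names ExpectFail and TokenKind is a stand-in:

from dataclasses import dataclass
from enum import Enum, auto

class TokenKind(Enum):          # stand-in; the PR's TokenKind is richer
    Identifier = auto()
    FunctionIdentifier = auto()

@dataclass
class ExpectFail:               # wraps an expectation that is known to fail
    expected: str

# One entry per cursor position: (line, column), the kind of token there,
# and the text the hover/completion result is expected to contain.
hover_expectations = {
    "some/sample.frag": [
        ((12, 5), TokenKind.Identifier, "vec3 position"),
        ((20, 9), TokenKind.FunctionIdentifier, ExpectFail("int overloaded(float)")),
    ],
}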


I also added the GLSL samples that I wrote for these tests into this repo as part of the PR. I realized that submodules are probably not the best idea for the kind of workflow these tests are aiming at. Updating the tests through a submodule happens in the following steps:

  1. Update tests on a local fork/branch of glsl-samples
  2. Submit a PR for glsl-samples
  3. Accept the submitted PR
  4. Pull the glsl-samples submodule in the local branch of the glsl_analyzer
  5. Commit the submodule update and push
  6. Submit a PR for glsl_analyzer with the new tests.

Now imagine that in the process of this glsl_analyzer PR a bug was found; we fixed it, and now we want to add a test for that bug as part of the same PR. We have to do steps 1-5 all over again. This is cumbersome, and it will come up every time we want to update the tests. Also, if CI is set up to run the tests, you are always forced to update them: even for a bugfix, you'd have to remove the corresponding ExpectFail.

I suggest that we keep most of the tests we write in this repo and reserve glsl-samples for tests that we dump with no intention to modify/extend them frequently.

nolanderc commented 8 months ago

I suggest that we keep most of the tests we write in this repo and reserve glsl-samples for tests that we dump with no intention to modify/extend them frequently.

That makes sense. Agreed.

automaticp commented 8 months ago

Thanks for checking out the code!

I had to rewrite TokenKind into something that describes not just a token, but a token in context, so that we can tell identifier declarations from identifier expressions, for example.
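Roughly, the idea is something like the following sketch (illustrative only, not the code from the PR; the names are hypothetical):

from dataclasses import dataclass
from enum import Enum, auto

class Kind(Enum):
    Identifier = auto()

class Context(Enum):
    Declaration = auto()   # the 'x' in 'float x = 1.0;'
    Expression = auto()    # the 'x' in 'return x + 1.0;'

# A bare token kind can't tell the two apart; pairing it with the
# syntactic context it appears in can.
@dataclass(frozen=True)
class TokenInContext:
    kind: Kind
    context: Context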

Sorry, I'm a bit busy this week, so I haven't had time to write a README yet; I'll write something up explaining this whole mess soon, promise. If you want, you can try running the tests by installing requirements.txt and just running pytest tests/; it's super simple.

automaticp commented 8 months ago

Alright, it might be a little overkill, but there you go: the README.md is done!

nolanderc commented 8 months ago

Haha, that’s awesome work! Should be a huge help! I will make sure to try it out later!

nolanderc commented 8 months ago

I can't seem to see which tests are failing:

➜  tests git:(lsp-tests) ✗ pytest
============================================================================================ test session starts =============================================================================================
platform linux -- Python 3.11.6, pytest-7.4.2, pluggy-1.3.0
rootdir: /home/christofer/dev/glsl_analyzer/tests
plugins: typeguard-3.0.2, subtests-0.11.0, asyncio-0.21.1, lsp-0.3.1
asyncio: mode=Mode.STRICT
collected 12 items

test_lsp.py ,,,,.,,,,,,,,,,,,,,,,,,,,.,,,,,,,,,,,,.,,,,,,,,,,,,,,,,,,.,,x,x.,,,,,,,,,,,,,...,,,,,.,,,,x.,,xxx.,,,,,,,.                                                                                 [100%]

============================================================================= 12 passed, 6 xfailed, 88 subtests passed in 0.82s ==============================================================================

If I add the --cli-log-level=error flag, I can see the tests, but also the successful ones:

test_lsp.py::test_completion[opened_file3] PASSED                                                                                                                                                      [ 83%]
test_lsp.py::test_completion[opened_file4] [hover-and-completion/functions.frag:30:8] SUBPASS                                                                                                          [ 91%]
test_lsp.py::test_completion[opened_file4] [hover-and-completion/functions.frag:33:8] SUBPASS                                                                                                          [ 91%]
test_lsp.py::test_completion[opened_file4] XFAIL (hover-and-completion/functions.frag:36:8)                                                                                                            [ 91%]
test_lsp.py::test_completion[opened_file4] XFAIL (hover-and-completion/functions.frag:39:8)                                                                                                            [ 91%]
test_lsp.py::test_completion[opened_file4] XFAIL (hover-and-completion/functions.frag:42:8)                                                                                                            [ 91%]
test_lsp.py::test_completion[opened_file4] PASSED                                                                                                                                                      [ 91%]
test_lsp.py::test_completion[opened_file5] [hover-and-completion/comments.frag:3:7] SUBPASS                                                                                                            [100%]
test_lsp.py::test_completion[opened_file5] [hover-and-completion/comments.frag:4:7] SUBPASS                                                                                                            [100%]
test_lsp.py::test_completion[opened_file5] [hover-and-completion/comments.frag:5:5] SUBPASS                                                                                                            [100%]
test_lsp.py::test_completion[opened_file5] [hover-and-completion/comments.frag:6:6] SUBPASS                                                                                                            [100%]
test_lsp.py::test_completion[opened_file5] [hover-and-completion/comments.frag:7:4] SUBPASS                                                                                                            [100%]
test_lsp.py::test_completion[opened_file5] [hover-and-completion/comments.frag:8:7] SUBPASS                                                                                                            [100%]
test_lsp.py::test_completion[opened_file5] [hover-and-completion/comments.frag:29:8] SUBPASS                                                                                                           [100%]
test_lsp.py::test_completion[opened_file5] PASSED                                                                                                                                                      [100%]

============================================================================= 12 passed, 6 xfailed, 88 subtests passed in 0.83s ==============================================================================

But that doesn't really explain why the test is failing (what did the test expect and what was found?).

What is the proper way to invoke pytest to only see failing tests? Could we maybe add a pytest.ini file to make that the default?

automaticp commented 8 months ago

xfail/xfailed tests are the ones that are "expected to fail", so they don't trigger a failure report, since that's the expected behavior. I don't think there's a native way to capture the expected/result values from an xfail, but we can embed our own string alongside the file:line:column information. An xfail is communicated by throwing an exception, so whatever we send along with it ends up in the log. You can show xfails with:

$ pytest -rx
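To make that the default, as asked above, a minimal pytest.ini could look something like this (a suggestion, not something included in the PR):

[pytest]
# -rx: include the reasons for xfailed tests in the summary
# -l:  dump local variables in failure tracebacks
addopts = -rx -l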

You can also always find them marked with ExpectFail in lsp_tests_input.py.

You should be able to see expected vs. result in most tests if they actually fail, but I'm going to think about how to communicate that better in cases where we don't just compare values, but instead test for a property like "expected should be a subset of result". It's actually pretty dumb of me that I didn't think about it, since I'd just get that info by dumping locals with -l, but that's not very user-friendly.
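For reference, the mechanism boils down to something like this hypothetical helper (not the actual code in test_lsp.py):

import pytest

# When a position is marked ExpectFail, raise the xfail ourselves with an
# embedded message; the reason string is what shows up next to XFAIL in the
# -rx summary, so file:line:column and expected/result can ride along.
def report_expected_failure(position: str, expected: object, result: object) -> None:
    pytest.xfail(f"{position}: expected {expected!r}, got {result!r}")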

nolanderc commented 8 months ago

xfail/xfailed tests are the ones that are "expected to fail",

I see, that makes sense. I guess it's fine in that case :smile:

I'm going to think about how to communicate that better in cases where we don't just compare values, but instead test for a property like "expected should be a subset of result".

Yes, that's reasonable to add at some point! However, I'm willing to merge this in its current state, but feel free to open a new PR if you have any extensions! :cat:

automaticp commented 8 months ago

Yes, that's reasonable to add at some point! However, I'm willing to merge this in its current state, but feel free to open a new PR if you have any extensions! 🐱

Thanks, yeah, I have a few ideas, but I've been too busy to think them through. Sorry.

It's not as bad right now, though. Currently, if we test these properties of expected/result and the test fails, the traceback will report something like:

assert has_no_duplicates and is_expected_overload_set
assert False and True

Thanks to the aptly named boolean values, you can tell which condition failed the assert. Unfortunately, the values of expected/result will not be printed by default, but I think for now it's good enough to run pytest with -l to dump all locals in the traceback if you're planning to run this on CI and want to know the context from the logs.
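One low-effort option would be to attach the values to the assertion message itself, so the failure is self-describing even without -l. A sketch (hypothetical names, not code from the PR):

def assert_expected_overload_subset(expected: set[str], result: list[str]) -> None:
    # Same two conditions as in the traceback above, but with the actual
    # values embedded in the assertion message so they appear by default.
    has_no_duplicates = len(result) == len(set(result))
    is_expected_overload_set = expected <= set(result)
    assert has_no_duplicates and is_expected_overload_set, (
        f"expected (subset): {sorted(expected)!r}, result: {sorted(result)!r}"
    )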