Add comprehensive testing for hover and autocomplete

automaticp commented 1 year ago

These features are hard to get right, and we have seen some edge cases pop up over the last week.

As far as I understood, the current way of supplying the hover info by walking the parse-tree up is never going to be perfect, so the tests have to be permissive and be more of a warning of where the current approach is not up to the task. Still, seeing the overall picture should be highly beneficial, and should help both make compromises with the current design, as well as assist in integrating the (foreshadowed) type/semantic analysis.

So yeah we need tests.

automaticp commented 1 year ago

I'm thinking of cooking up some kind of ubershader that exercises as many glsl features as possible, and then just "hovering" all over it in tests for a start. Genius, I know.

@nolanderc, thoughts, ideas or even just a "go ahead" are welcome!

nolanderc commented 1 year ago

One issue with larger test samples is that errors further up in the file may cause more errors further down, so while an uber-shader may be good to find bugs, breaking it down into smaller minimal examples which isolate the bug is still a good idea. Other than that, go right ahead 😄

automaticp commented 1 year ago

Alright, I've played around for a bit with pytest-lsp and it seems quite decent for testing lsp-servers. Please, take a look at it and tell me if that's okay with you. I'm thinking of making a small wrapper around it to provide a more-or-less straightforward interface similar to the following:

with client.open_file(path) as file:
    file.expect_hover((24, 11), "const vec4")
    file.expect_hover((102, 13), "layout(location = 1) in vec2")
    file.expect_completion_contains((42, 10), ["gl_FragCoord", "gl_FragDepth"])
    # etc...

then stamping out a bunch of tests like this and running them with pytest as part of run-all-tests.sh.

One issue with larger test samples is that errors further up in the file may cause more errors further down, so while an uber-shader may be good to find bugs, breaking it down into smaller minimal examples which isolate the bug is still a good idea.

That's a good point. I'll probably start from smaller samples that are well-formed anyways. Then we will see how it behaves under errors.

nolanderc commented 1 year ago

Oh that’s cool! That would be great! Thinking about it, maybe the *.sh scripts should be converted to python as well, so that they are easliy run under Windows?

Regarding uber-shaders: I would like the parser to be resilient enough to handle errors better, so we could have a few tests like that.

automaticp commented 1 year ago

Well..., I just spent 4 hours trying to run something similar to subcases in pytest until I realized that you functionally just cannot do that without an extension. My head hurts.

In the end, I couldn't get it too look anything close to the declarative style I wanted (pytest is truly an "Inversion of Control Testing Framework"), but at least with subtests we can specify input/expected_output as data upfront and let the ugly parts do the work. You can take a look at how it is right now, I'll probably undo that commit later to rework it though.

Thinking about it, maybe the *.sh scripts should be converted to python as well

Should not be a problem, yeah.

Regarding uber-shaders: I would like the parser to be resilient enough to handle errors better, so we could have a few tests like that.

I realized one thing: if we specify the hover/completion cursor as absolute line/column values, then the ubershader becomes problematic for maintainig it if it ever comes to adding more code to it, since that breaks positions for the tested lines that follow the new ones. It's kinda has... O(n) complexity maintenance cost :weary:.

nolanderc commented 1 year ago

You can take a look at how it is right now, I'll probably undo that commit later to rework it though.

It took some time to grok, but it doesn't look too bad actually!

I wonder, could you programatically pass in new paths/hovers? For example, the ideal would be something like the following:

expect_responses(
    paths={ "basic.vert": ".../basic.vert" },
    hovers={ "basic.vert": [((12, 34), "const vec4"), ...] },
)
expect_responses(
    paths={ "complicated.vert": ".../complicated.vert" },
    hovers={ "complicated.vert": [((42, 17), "const WhateverStruct[2]"), ...] },
)

I realized one thing: if we specify the hover/completion cursor as absolute line/column values, then the ubershader becomes problematic for maintainig it if it ever comes to adding more code to it, since that breaks positions for the tested lines that follow the new ones. It's kinda has... O(n) complexity maintenance cost 😩.

Yeah that could be a problem... However, we could do something similar to what I did here. Essentially, we add some "marker" comments to point to a specific location, and then check that the error is reasonably close (that is, the error should be "connected" to the marker comment by whitespace, if that makes sense). You can then match the errors to the contents of the markers, and ensure that the correct errors end up with the expected markers, or something like that.

nolanderc commented 1 year ago

Usage with markers may look something like:

expect_responses(
    paths={ "basic.vert": ".../basic.vert" },
    hovers={ "basic.vert": [("/* here */", "const vec4"), ...] },
)

with a basic.vert:

const vec4 color = vec4(1,0,0,1);

void main() {
    /* here */color;
}

Although I now realize that this limits where we can hover, as we cannot put a comment in the middle of an identifier.

automaticp commented 1 year ago

I wonder, could you programatically pass in new paths/hovers?

The problem is that the files have to be passed as fixture parameters, and as the framework entrypoint is reflection-based and not execution-based, these parameters have to be at module scope (resolved when the module itself is parsed/imported/executed). There may be a way to do that by hooking into the test generation with pytest_generate_tests, but I don't want to touch that yet without a good reason, since this execution->reflection->execution hoop is big yikes and makes me dizzy trying to understand what pytest is even doing behind the scenes.

Although I now realize that this limits where we can hover, as we cannot put a comment in the middle of an identifier.

Sometimes I wonder if text files could be 3-dimensional... :weary:

Something like /* #hover-next(marker-id, column-offset-from-start-of-next-token) */, but sounds fragile, needs a mini-parser and is maybe overkill, also have to make sure that each marker-id is unique or just specify expected hover results inline in file, which are also formatted differently depending on the type of token.

That's gonna be one ugly ubershader if each line will look similar to this:

const vec3 /* #hover-next(0, "const vec3") #hover-next(2, "const vec3") */ param /* #hover-prev(0, "const vec3") */ = vec3(1.0);

Or with split lines:


const vec3 
/* #hover-next(0, "const vec3") #hover-next(2, "const vec3") */ 
    param 
/* #hover-prev(0, "const vec3") */ 
    = vec3(1.0);

nolanderc commented 1 year ago

Hmm, yeah... In an ideal world, the parser shouldn't propagate syntax errors too far down the file, so maybe ubershaders are the wrong approach after all. What we really should be testing for is that parse errors are kept close to the actual syntax error:

const int foo = /* oops, missing value.. */
// <no errors past this line>
const int bar = 124;

automaticp commented 1 year ago

I'm wondering, what is the expected behavior when hovered at the end of an identifier? Right now it does not show anything, is that intentional and should we go forward with that assumption?

For reference, clangd (C/C++ language server) shows the identifier info if the identifier is alone:

(couldn't capture the caret, but it's at the end)

but switches to showing the info of a field if it's present:

I'm hovering by placing the caret at the position and pressing the hover button, so I expect the closest identifier to be highlighted, but clangd's approach seems to be more intuitive if you're actually hovering with the mouse cursor, since you look at characters and not spaces between them.

I'm asking mostly so that I would know what to expect in tests, should it stay None when the caret is past the end?

automaticp commented 1 year ago

In an ideal world, the parser shouldn't propagate syntax errors too far down the file, so maybe ubershaders are the wrong approach after all. What we really should be testing for is that parse errors are kept close to the actual syntax error:

Hopefully, we'll tackle that in "step 2" of parser testing.

nolanderc commented 1 year ago

I'm asking mostly so that I would know what to expect in tests, should it stay None when the caret is past the end?

I'm not quite sure. Ideally we would have different behaviour based on if the hover was initiated by the mouse (must be inside the identifier) or caret (may be one past the end), but LSP does not provide this information.

If we always accept one past the end of the identifier you could hover the ; with your mouse and still get a hover for aaa (your first picture). But that might be worth the tradeoff if it makes the experience of using the keyboard better.

EDIT: personally I use neovim, and there hover in normal mode behaves as a mouse cursor, so I don't run into this personally. So I leave it to you to make the final call.

nolanderc / glsl_analyzer

Add comprehensive testing for hover and autocomplete #26