elm-tooling / elm-language-server

Language server implementation for Elm
https://www.npmjs.com/package/@elm-tooling/elm-language-server
MIT License
419 stars 65 forks source link

Yet More Erratic / Broken Behaviour #878

Closed FraserLee closed 9 months ago

FraserLee commented 1 year ago

When using elm-language-server version 2.6.0 in neovim, both with COC and with native LSP support, the project will randomly be covered in errors. Other times (more frequently), nearly every top level definition (including main) is flagged with an Unused pattern variable warning. Typing in or saving the file seems to randomly both induce and relieve this issue, and its occurrence seems to be more frequent on larger projects.

I've attached two screenshots, the first with COC and the second with the native LSP. In addition, the actual installation of elm-language-server varied between the two (the first done globally on my system through npm, the second handled by mason), so it's unlikely that was the root cause.

This is in a project that compiles and runs without error:

Screen Shot 2023-01-17 at 1 08 42 AM Screen Shot 2023-01-17 at 1 18 56 AM

As this seemed similar to #503 and #598 I tried all proposed solutions there, none seeming to work. Sorry for this being a bit vague, if there's more specific information that would be helpful please let me know how to provide it!

Cheers

razzeee commented 1 year ago

This does not happen right after init does it?

FraserLee commented 1 year ago

@razzeee It does seem to happen when opening, but it also recurs when editing the file.

Mehgugs commented 1 year ago

This happens for me in vscode quite often, and it seems to disappear upon saving changes (maybe).

maca commented 1 year ago

I am trying helix to see if there is better lsp experience than with emacs, but I also constantly get random whole page errors.

https://user-images.githubusercontent.com/29857/223286437-2a2a8761-f7f2-4587-9a52-4bc70218856f.mov

bburdette commented 1 year ago

I get this in helix. Also in helix the language server has some kind of problem where it can take 10 or 20 seconds to save a file. :lsp-restart sometimes fixes it. I have to :lsp-restart every 10 mins or so to get definition lookup and etc.

maca commented 1 year ago

@bburdette The issue with saving hanging for a few seconds, seems to be because of LS response times for formatting. I've got it sorted by setting the format command to use elm-format instead of waiting for the LS.

# ~/.config/helix/languages.toml
[[language]]
name = "elm"
formatter = { command = "elm-format", args = ["--stdin"] }

My LS experience in helix is significantly better than with emacs, but definitely far from ideal. I too have to restart the LS constantly.

razzeee commented 1 year ago

Doing formatting like that should lead to your cursor getting displaced in some situations.

Let me try to do a release this weekend, there were some interesting sounding fixes upstream

bburdette commented 1 year ago

Just upgraded elm-language-server to 2.7.0, and so far its quite a bit better. The pervasive parsing problems are gone, and so are the problems with the e.l.s. hanging. πŸ‘πŸ‘

pmiddend commented 1 year ago

I still get these "whole file is covered with error(s)" with 2.7.0 and in emacs.

peret commented 10 months ago

Hi,

I'm also running into this issue quite often and would like to offer my help investigating it, if I can. I would say I can reproduce this very consistently in my project(s).

I'm using Helix, elm-language-server v2.7.0, and elm-format v0.8.7 - both packages installed from nixpkgs, in case that matters.

What I've observed so far is that it definitely seems related to formatting, at least for me:

:lsp-restart in Helix consistently fixes the issue, until I make another change and trigger it by formatting again - so it doesn't happen for me on init.

My impression is that this happens more often when there is an actual error present, e.g. I try calling a function that doesn't exist. However, in my testing I have seen it happen when all my changes were valid as well.

@razzeee As I said, I can reproduce this pretty consistently. Is there anything I can do to help figure this issue out? Any hints you can give me for a next step?


Side note

The problem seems to occur even when I configure Helix to use elm-format directly. E.g. my language configuration for Helix contains:

[[language]]
auto-format = false
name = "elm"

[[language.language-servers]]
except-features = ["format"]
name = "elm-language-server"

[language.formatter]
args = ["--stdin"]
command = "elm-format"

What this accomplishes is that Helix will call elm-format directly to auto-format the document, instead of sending a format-request to the LSP. At least I think it does... Not sure whether this is relevant, though.

razzeee commented 10 months ago

I don't know how helix works, but there should be a way to get a log with an edit history. I would think that that history shows a bug.

peret commented 10 months ago

TL;DR: I think there might be a bug in how the language server updates the AST when the client sends multiple contentChanges at once.


I think I might have found something... First of, here's a recording of the change I'm making to reproduce the behavior: https://asciinema.org/a/BQKxoJq52G3NIxH4YOW7Yxavw Basically, I'm inserting the let..in snippet, adding a new assignment and then saving the file. Afterwards, the diagnostics show a lot of bogus parsing errors.

Now, in the helix logs I can see the messages being sent to the LSP and the replies the editor gets. As you can (hopefully) see in the recording, there are two changes being made on format - the assignment being broken up into two lines and the newline between let-expression and function body being removed.

So, the response to the formatting request looks like this:

[
    {
        "newText": "\n           ",
        "range": {
            "end": { "character": 11, "line": 41 },
            "start": { "character": 11, "line": 41 }
        }
    },
    {
        "newText": "",
        "range": {
            "end": { "character": 4, "line": 43 },
            "start": { "character": 6, "line": 42 }
        }
    }
]

In turn, the editor applies theses changes and reports back with a textDocument/didChange message with the following body:

{
    "jsonrpc": "2.0",
    "method": "textDocument/didChange",
    "params": {
        "contentChanges": [
            {
                "range": {
                    "end": { "character": 11, "line": 41 },
                    "start": { "character": 11, "line": 41 }
                },
                "text": "\n           "
            },
            {
                "range": {
                    "end": { "character": 4, "line": 44 },
                    "start": { "character": 6, "line": 43 }
                },
                "text": ""
            }
        ],
        "textDocument": {...}
    }
}

I noticed that for the second change, the line number differs between the formatting response and the next didChange request, so I started wondering about that. This is in line with the spec, however:

The actual content changes. The content changes describe single state changes to the document. So if there are two content changes c1 (at array index 0) and c2 (at array index 1) for a document in state S then c1 moves the document from S to S' and c2 from S' to S''. So c1 is computed on the state S and c2 is computed on the state S'.

Is it possible, that the elm language server does not handle this scenario correctly? I.e. to me it looks like the language server might apply both changes to the same initial state S, and therefore assuming the second edit at a wrong line. This would make a lot of sense to me. It would, for example, explain why it only happens on formatting the document. During normal editing of the document, Helix (and I assume other editors as well) sends each individual key stroke as a didChange notification, so there is never more than one contentChange while editing normally.

I suspect the issue is with this loop, however, I don't know enough about the language server code and tree-sitter to say for sure. What exactly does tree.edit() do for example, and would it behave as intended when called multiple times in this loop?

Could one of the maintainers please confirm, whether my analysis sounds correct? :pray:

peret commented 10 months ago

It would also explain why:

peret commented 10 months ago

Ok, after some further investigation I'm fairly sure I understand what is happening. The issue, I think, is indeed related to the client sending multiple contentChanges at once.

This call to getEditFromChange() gets called with the rootNode's text, which is the same for all the change records. Since the information in later change objects assumes that earlier changes have already been applied to the document, the calculation of the startIndex and endIndex in that function probably isn't correct. If I'm correct, this can lead to the syntax tree getting out-of-sync with the actual document. It's especially easy for me to reproduce when newline changes are involved. At least I think that this is the issue...

I've attempted a (very hacky) solution and it solved the issue for some specific change sets (e.g. the one I was demonstrating earlier), but I could still get the language server into a bad state, so it's not ready yet. It also basically amounted to trying to count the number of newlines being added and removed in previous changes and accounting for that in the calculation in getEditFromChanges(), which is probably not the best solution. I'm also not sure whether only newlines are an issue, or if one has to account for shifts in the character position as well? Ideally maybe we could call getEditFromChanges() with the updated text, but I'm not sure whether that's possible. Might be interesting to find out how other language server implementations handle this case?

I might have some more time to look into this on the weekend, but if anyone who's more knowledgeable about the code base than me wants to take over, that'd be great as well :smiley:

razzeee commented 10 months ago

So I guess the question is if tree.rootNode.text get's updated/flushed in between those loops

In https://github.com/elm-tooling/elm-language-server/blob/main/src/common/providers/astProvider.ts#L112 and https://github.com/elm-tooling/elm-language-server/blob/main/src/common/providers/astProvider.ts#L114

Should be "easy" to print it and figure it out.

I'm however still confused, why this doesn't seem to happen with vscode.

FYI @jmbockhorst

peret commented 10 months ago

Should be "easy" to print it and figure it out.

Yes, sorry for not being clearer, that is exactly what I did during my investigation and it does indeed pass the same text for all invocations of getEditFromChanges() in the loop. So it seems like rootNode.text is not updated by calling edit on the tree (which kinda makes sense, I guess).

As to why it doesn't happen in VSCode, I don't have an answer yet. First of, I thought it was also happening in VSCode? :sweat_smile: My wildest guess would be that VSCode does not group contentChanges in one notification, even if there are multiple changes from the formatter. That would be pretty weird though, and I don't have the means to check that right now.

peret commented 10 months ago

Hey!

I linked a PR that I think will fix the erratic behavior and parsing errors. The code is pretty much lifted from the typescript-language-server.

The idea is basically just to keep track of the changed text for each "intermediate" state ourselves, since rootNode.text , to my knowledge, won't be upaded until parse is called again.

I have tested this only a little and could not reproduce any of the issues that I could before. I'll try to put in a real Elm coding session soon, to test-drive my PR under more realistic conditions :)

Edit: I've been using elm-language-server with my changes for the past two days and haven't had any more issues with bogus parsing errors :D

razzeee commented 9 months ago

Please try the new builds

bburdette commented 9 months ago

Giving this a try today. After about 30 mins of usage no problems, which is definitely out of the ordinary. Did seem to get a bit tweaked on :lsp-restart. But after restart of helix, back to normal and seems to stay that way. Will update if I run in to issues.