DavyLandman commented 5 months ago

During debugging of #354 I've discovered we implemented LSP incorrectly. For keeping it simple, I'll focus on the Parametric implementation, the Rascal implementation has roughly the same problems, but will require a different solution I suspect.

Specification vs current implementation

I'll be quoting form the LSP specification 3.17 a bit.

Text Document Synchronization ... The document open notification is sent from the client to the server to signal newly opened text documents. The document’s content is now managed by the client and the server must not try to read the document’s content using the document’s Uri. Open in this sense means it is managed by the client. It doesn’t necessarily mean that its content is presented in an editor.

Currently our implementation starts to keep track of the document contents after an open event. Request that target this current document will use that state, and it's parse tree to provide callbacks to rascal lsp. So for the document in question, we respect the rules, we are not reading from disk.

However:

exists & lastModified and other functions in IO will go to disk, and potentially correspond to the wrong "state" of the documents. As a change in LSP does not have to mean a change visible on disk.
Other files that are read (references to other modules) are always read from disk, even though LSP clearly state that if those documents are also tracked by LSP, we are not supposed to read them from disk ourselves. We've seen this cause issues since we cannot take into account unsaved changes.
If not careful, usage of the current rascal-lsp can cause strange caching issues (the mismatch between lastModified and the unsaved state in the editor is a known example).

DidOpenTextDocument .. Note that a server’s ability to fulfill requests is independent of whether a text document is open or closed.

Currently if VS Code asks us information about a document that is closed (or never opened before) we throw an exception. Even if we've seen the document before. The correct behavior is that is non-opened, we are allowed to read the file from disk, and answer the question that is asked.

VersionedTextDocumentIdentifier An identifier to denote a specific version of a text document. This information usually flows from the client to the server.

We are currently not tracking the version number of a text document. If done well, we can relate diagnostics to certain version of a text document, so VS Code knows when to clean them, or update their location.

Solution proposed

I think we have to:

Stop making a difference between saved and unsaved changes (since we do not know)
Either:
1. Provide the LSP contributions with an interface to request parse trees and modification timestamps that reads it from the internal state of rascal-lsp
2. Rewrite IO request that target files currently tracked by rascal-lsp (aka, claimed by the lsp client) and answer them based on the state of rascal-lsp
Support requests for not-opened URIs
Track version of documents, and report diagnostics relative to these versions

This is not a small change, but it does solve some of the subtle issues we've been seeing.

Also, for a rascal-core based typechecker, this might become a bit more messy ("disconnecting" it's IO operations), so let's discuss that in a different issue.

jurgenvinju commented 5 months ago

We can also register a (set of) logical URI providers that catch all files that are handled by the LSP server and default to the existing handlers when not handled by us. Open and close can make sure we register and de-register the handlers.

By doing so the rascal code that uses the library builtins can remain oblivious to this strange situation. Btw the eclipse IResource mechanism also worked in such a manner; a file system abstraction with in memory caching.

DavyLandman commented 5 months ago

I like the beauty of that idea, but that would mean we have to do #129 first, to make sure the VS Code itself is not seeing those emulated files.

jurgenvinju commented 5 months ago

Ok let's think this through in a meeting. I think we need to clarify some things first.

Probably if we want to have this facade and make the difference when we need to, we have to use something else than the logical resolvers. The reason is that logical resolvers always map to another URI. That's necessary but it's not sufficient. Here we also need to implement different streams. So at least there will be a real URI resolver for files in the LSP space. Like LSP+project:// and a rewriter/logical resolver to forward to that new scheme. And finally we need to be able you selectively rewrite and not always. Perhaps #129 is the answer to that, as you suggested.

129 can only work if we get rid of the lib scheme. That's because the lib scheme URI are not stable between JVM instances.

DavyLandman commented 5 months ago

the lib schema is already broken over that bridge. since you would get the lib schema is interpreteed by the rascal lsp jvm, not the REPL (which has some chance of working).

But agreed, let's have a meeting and hash it out.

DavyLandman commented 4 months ago

Part of this is solved in #357, but the whole IO operations shouldn't look on disk is still wide open.

usethesource / rascal-language-servers

Fix rascal-lsp incorrect implementation of the LSP Document synchronisation for DSLs #355

Specification vs current implementation

Solution proposed

129 can only work if we get rid of the lib scheme. That's because the lib scheme URI are not stable between JVM instances.