microsoft / language-server-protocol

Defines a common protocol for language servers.
https://microsoft.github.io/language-server-protocol/
Creative Commons Attribution 4.0 International
11.27k stars 799 forks source link

Document expectations around filename casing #1263

Open DanTup opened 3 years ago

DanTup commented 3 years ago

There have been many bugs relating to casing of filenames/paths in VS Code and language extensions in the past, so I think it's important that LSP defines expectations to ensure clients and servers have the same expectations so they work together correctly.

VS Code no longer uses the filesystem casing of a file in its APIs if the file is renamed, which means servers must be prepared to get different (the old) casing from VS Code after a file is renamed (see https://github.com/microsoft/vscode/issues/121106).

For example:

// File is open and named DANNY.dart
==> {"jsonrpc":"2.0","id":3,"method":"textDocument/documentSymbol","params":{"textDocument":{"uri":"file:///Users/danny/Desktop/casting_testing/bin/DANNY.dart"}}}

// File is renamed by user to danny.dart
==> {"jsonrpc":"2.0","id":4,"method":"workspace/willRenameFiles","params":{"files":[{"oldUri":"file:///Users/danny/Desktop/casting_testing/bin/DANNY.dart","newUri":"file:///Users/danny/Desktop/casting_testing/bin/danny.dart"}]}}

// VS Code/LSP continues to use the original casing
==> {"jsonrpc":"2.0","id":5,"method":"textDocument/documentSymbol","params":{"textDocument":{"uri":"file:///Users/danny/Desktop/casting_testing/bin/DANNY.dart"}}}

Many tools treat file paths case-sensitively even when the underlying filesystem is case-insensitive. In some cases (for example virtual workspaces) it might not even be possible to tell whether the underlying filesystem is case-sensitive, so it seems like servers might be forced to treat things all as case-insensitive.

It seems like the spec should have some guidelines/notes on this since it might not otherwise be obvious to implementers what they should do here (and depending on the environments they developer/test in, they may not see the same behaviour as other users).

DanTup commented 3 years ago

Another example that I think should be clear from the spec:

Related: If a server sends diagnostics with lowercase, then again with uppercase, should the client consider them to replace the original ones, or must a server always stick to a single casing?

I feel these are important questions that are currently not well answered by VS Code or LSP, and that results in servers (and clients) having different expectations, and it will be difficult to resolve properly without some guidelines.

dbaeumer commented 3 years ago

@DanTup have you filed an issue against VS Code to understand why VS Code itself still refers to the file using the old casing after a rename?

DanTup commented 3 years ago

@dbaeumer yep, that's the issue linked above (https://github.com/microsoft/vscode/issues/121106). It's apparently by design and there are no plans to change (though I'm not entirely convinced it's a good idea - it feels like VS Code implementation details leaking into extensions and language servers, which may also mean other editors may have to copy its behaviour).

dbaeumer commented 3 years ago

OK. Thanks overlooked the reference.

dbaeumer commented 3 years ago

In general servers need to be aware of whether a file system they are working on is case sensitive or case insensitive and need to be prepared for case insensitive URIs to handle casing correctly (e.g. a.txt is the same than A.txt). Forcing clients to always use real names as on disk is IMO not the right direction either. However I do agree that clients should have access to names like they are presented in the UI. I will add a comment to the VS Code issue.

dbaeumer commented 3 years ago

In ESLint I had cases for this as well. In cases where it was necessary I query for the real name before calling into ESLint libraries.

DanTup commented 3 years ago

But won't servers sometimes not have access to the underlying file system (for example with virtual workspaces)?

Still, if it's the expectation, I think it should be explicitly noted in the docs (even if it's just some general guidelines). For a long time, VS Code would get confused if a server sent it a URI that wasn't cased the same as VS Code expected (for example it might open multiple copies of the same document) - although that seems to be fixed (albeit at the expensive of just pushing some of the issues to the server), I'm sure other editors don't all behave the same. It shouldn't be necessary unpick VS Code's behaviour in order to implement a compliant editor or server.

DanTup commented 3 years ago

@dbaeumer would you accept a PR with some guidance along these lines?

URI Case-Sensitivity

Both clients and servers are expected to consider filesystem casing-sensitivity when handling requests. For case-insensitive file systems, both client and server may use URIs that are different in casing to both the underlying resource on the filesystem and the casing used by the other party. If a resource is renamed, clients or servers may continue to use the original casing (and may continue to do so even after restarting the application).

If the underlying filesystem is not available (for example the resource is a non-file:// URI), URIs should always be treated case-sensitively.

DanTup commented 3 years ago

While doing some testing in VS Code, I found that it doesn't support case-sensitivity on macOS (https://github.com/microsoft/vscode/issues/123660) and it doesn't seem it's planned.

This means (assuming LSP will go by VS Codes rules) the above text is not accurate. I don't know what the actual rules are (do we use the filesystem sensitivity except on macOS where we always assume insensitive? does Linux behave as expected, or does that also always behave one way?) and really think they need to be written down somewhere.

Edit: Clarification in https://github.com/microsoft/vscode/issues/123660. The rules are simple - Windows and Mac are always treated as insensitive, and Linux always as sensitive. I don't know whether it makes sense for LSP to copy that or not 😬

DanTup commented 3 years ago

I found another wrinkle in VS Code's implementation (related to https://github.com/microsoft/language-server-protocol/issues/1203). It fires the rename event when a file is renamed by only casing, but does not fire didClose/didOpen. https://github.com/microsoft/language-server-protocol/issues/1203 suggests that those two things should go together, so it's not clear if VS code's behaviour is incorrect, or if this is an exception to that.

DavyLandman commented 2 months ago

Edit: Clarification in microsoft/vscode#123660. The rules are simple - Windows and Mac are always treated as insensitive, and Linux always as sensitive. I don't know whether it makes sense for LSP to copy that or not 😬

To add to this, even on windows an extension can register a virtual file system, and mark it as case-sensitive (or vice-versa on linux). So I do not know how to deal with this correctly, especially since we're not supposed to touch URIs that are "open", only use LSP to work with those files. Not to mention virtual file systems you cannot even try an detect it with some heuristic.

(for example, a hacky way to detect case-insensitivity by doing a stat on the same uri after fully uppercassing the path and lowercasing the path)