microsoft / language-server-protocol

Defines a common protocol for language servers.
https://microsoft.github.io/language-server-protocol/
Creative Commons Attribution 4.0 International

Language Server Index Format: use a non-UTF-16 encoding as the default for `positionEncoding` #872

Open MaskRay opened 4 years ago

MaskRay commented 4 years ago

Judging from the lengthy discussion in https://github.com/microsoft/language-server-protocol/issues/376, many language servers and non-JavaScript clients have been inconvenienced by UTF-16 position counting. The Language Server Index Format does not carry a backward-compatibility burden, so we should probably move away from UTF-16, which a majority of developers consider inferior. Either UTF-32 code points or UTF-8 bytes would work as the recommended default, but code points seem more convenient for most implementations.

When opening a file that is not UTF-8 encoded, an implementation will likely represent it as a sequence of code points. I believe most implementations have convenient APIs for counting code points, whereas counting UTF-8 bytes may require an extra transcoding step.
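
To make the difference between the three counting methods concrete, here is a minimal sketch that measures the same string in UTF-16 code units, code points, and UTF-8 bytes. The names `PositionEncoding` and `lengthIn` are hypothetical helpers for illustration, not part of LSP or LSIF.

```typescript
// Hypothetical type for illustration; LSIF currently only allows 'utf-16'.
type PositionEncoding = 'utf-8' | 'utf-16' | 'utf-32';

// Length of `text` measured in the units of the given encoding.
function lengthIn(text: string, encoding: PositionEncoding): number {
  if (encoding === 'utf-16') {
    // JavaScript strings are indexed by UTF-16 code unit.
    return text.length;
  }
  let codePoints = 0;
  let utf8Bytes = 0;
  for (let i = 0; i < text.length; i++) {
    const unit = text.charCodeAt(i);
    const next = i + 1 < text.length ? text.charCodeAt(i + 1) : 0;
    const isPair =
      unit >= 0xd800 && unit <= 0xdbff && next >= 0xdc00 && next <= 0xdfff;
    if (isPair) {
      utf8Bytes += 4; // code points above U+FFFF take 4 bytes in UTF-8
      i++;            // skip the low surrogate
    } else if (unit <= 0x7f) {
      utf8Bytes += 1;
    } else if (unit <= 0x7ff) {
      utf8Bytes += 2;
    } else {
      utf8Bytes += 3; // rest of the BMP; a lone surrogate is counted as 3 bytes
    }
    codePoints++;
  }
  return encoding === 'utf-32' ? codePoints : utf8Bytes;
}
```

For the string `'a\uD83D\uDE00b'` (an `a`, one emoji outside the Basic Multilingual Plane, and a `b`), this gives 4 UTF-16 code units, 3 code points, and 6 UTF-8 bytes, so the column of `b` differs in all three encodings.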

  /**
   * The string encoding used to compute line and character values in
   * positions and ranges. Currently only 'utf-16' is supported due to the
   * limitations in LSP.
   */
  positionEncoding: 'utf-16',
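
As long as `positionEncoding` is fixed to `'utf-16'`, a consumer that stores text as code points has to translate every `character` offset. The following is a rough sketch of that translation under the assumption that the consumer has the line's text available; `utf16ToCodePointOffset` is a hypothetical helper name, not an LSIF or LSP API.

```typescript
// Convert a UTF-16 `character` offset within one line to a code point offset.
// Assumption: an offset that falls inside a surrogate pair is treated as if
// it pointed just past the pair.
function utf16ToCodePointOffset(lineText: string, utf16Offset: number): number {
  const end = Math.min(utf16Offset, lineText.length);
  let codePoints = 0;
  for (let i = 0; i < end; i++) {
    const unit = lineText.charCodeAt(i);
    const isHigh = unit >= 0xd800 && unit <= 0xdbff;
    if (isHigh && i + 1 < lineText.length) {
      const next = lineText.charCodeAt(i + 1);
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // a surrogate pair is two code units but one code point
      }
    }
    codePoints++;
  }
  return codePoints;
}
```

With a default of code points (or a negotiated encoding), a consumer like this could skip the per-line scan entirely.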

A survey of the `Position.character` offset encodings that clients support:

| Language client | Counting method of `Position.character` offsets | Notes |
| --- | --- | --- |
| eglot | UTF-32 codepoints | UTF-32 by default, optionally UTF-16 |
| LanguageServer-neovim | UTF-32 codepoints | |
| lsp-mode | UTF-32 codepoints | |
| vim-lsp | UTF-32 codepoints | |
| ycm | UTF-16 code units | Would prefer UTF-8, but only if standardized |
| VSCode | UTF-16 code units | |

dbaeumer commented 4 years ago

As the `positionEncoding` field indicates, this will be customizable in LSIF (supporting more encodings than just utf-16).

dbaeumer commented 3 years ago

Since this has nothing to do with the upcoming LSP 3.16 spec, I am moving it to on deck.