rust-lang / rls

Repository for the Rust Language Server (aka RLS)

Make sure we pass UTF-16 code unit offsets in all the LSP types #1113

Open Xanewok opened 5 years ago

Xanewok commented 5 years ago

cc https://github.com/rust-lang-nursery/rls/pull/1112 cc https://github.com/Microsoft/language-server-protocol/issues/376

This causes problems with displaying correct diagnostic spans and code suggestion spans (here).

Xanewok commented 5 years ago

Currently the LSP specifies that all text offsets are in UTF-16 code units (see the "Text Documents" section of the LSP specification), so that's what the Range type is expected to carry.
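To illustrate why the unit matters, here is a minimal sketch that measures the same string in the three units that come up in this thread. The helper name `unit_lengths` is made up for illustration and is not part of RLS or any LSP crate.

```rust
/// Hypothetical helper (not an RLS API): returns the length of `s` as
/// (UTF-8 bytes, Unicode scalar values, UTF-16 code units).
pub fn unit_lengths(s: &str) -> (usize, usize, usize) {
    (
        s.len(),                  // UTF-8 bytes
        s.chars().count(),        // Unicode scalar values (Rust `char`s)
        s.encode_utf16().count(), // UTF-16 code units, the unit LSP expects
    )
}
```

For ASCII-only text all three numbers agree, which is probably why the mismatch goes unnoticed until a line contains something outside the Basic Multilingual Plane, where one `char` takes two UTF-16 code units.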

However, RLS uses its own rls_span::Range (from the rls-span crate, used by both rustc and RLS), whose column offsets are counted in Unicode scalar values (think Rust char and chars()), and which we naively transform to the LSP Range using rls_to_range: (bad!) https://github.com/rust-lang/rls/blob/816017b91b7bb36343f50cbf9d803b8d7970f43c/src/lsp_data.rs#L130

For lines it doesn't matter, but converting a column between UTF-16 code units and Unicode scalar values is only possible given the source line that the range refers to.
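A rough sketch of what such a per-line conversion could look like, using only the standard library. The function names are hypothetical and this is not the code from the linked PRs; both functions assume the caller has already fetched the text of the line in question.

```rust
/// Hypothetical: converts a column counted in Unicode scalar values into a
/// column counted in UTF-16 code units, for the given source line.
/// Returns `None` if `char_col` is past the end of the line.
pub fn char_col_to_utf16_col(line: &str, char_col: usize) -> Option<usize> {
    let mut utf16_col = 0;
    let mut chars = line.chars();
    for _ in 0..char_col {
        // Each `char` occupies 1 or 2 UTF-16 code units.
        utf16_col += chars.next()?.len_utf16();
    }
    Some(utf16_col)
}

/// Hypothetical: the inverse conversion. Returns `None` if `utf16_col` is
/// past the end of the line or points into the middle of a surrogate pair.
pub fn utf16_col_to_char_col(line: &str, utf16_col: usize) -> Option<usize> {
    let mut units = 0;
    let mut char_col = 0;
    for c in line.chars() {
        if units == utf16_col {
            return Some(char_col);
        }
        if units > utf16_col {
            // The requested offset fell between the two halves of a pair.
            return None;
        }
        units += c.len_utf16();
        char_col += 1;
    }
    if units == utf16_col {
        Some(char_col)
    } else {
        None
    }
}
```

Both directions are linear in the length of the line. Returning `Option` leaves it to the caller to decide what to do with out-of-range columns, since the right fallback (clamp, error, ignore) likely depends on the request being served.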

It might make sense to create a method on the VFS (https://github.com/rust-dev-tools/rls-vfs) to convert between given spans or columns.

See https://github.com/rust-lang/rls/pull/1112 and https://github.com/rust-dev-tools/rls-vfs/pull/24 for related changes

lijinpei commented 5 years ago

Maybe we should ignore this problem, and wait for (or make?) M$ to change that to UTF-8?

Xanewok commented 5 years ago

The earliest they could do it is in LSP 4.0 and I’m not sure they even plan on doing so, so I’d say we should still do it ourselves.

mawww commented 5 years ago

If enough clients/servers disregard the spec and unify on a sane alternative (byte or codepoint count), VSCode and the spec will eventually adapt. I suspect most tools use byte or codepoint counts until an issue gets opened due to a strange interaction with another LSP tool, at which point somebody reads the spec, re-reads it, and goes through the various stages of grief...

Microsoft has control of the spec, but we, as tool writers, have no obligation to follow it to the letter, provided we unify on alternative behaviours and make it known.

soc commented 5 years ago

@mawww This is exactly what I intend to do for the (non-Rust) LSP implementation I'm planning to write in the coming months.