Fix conversion of char positions to UTF-16 code units

lexical-lsp / lexical

Lexical is a next-generation Elixir language server

We were previously using CodeUnit.to_utf16, which converts a UTF-8 code unit offset in a string into the corresponding UTF-16 code unit offset in the same string. However, we were passing it a character position rather than a code unit offset, so the converted position was wrong whenever a multi-byte character appeared before it.
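To illustrate the mismatch, here is a minimal sketch. It is not the code in this PR: the PositionSketch module and its function names are hypothetical, and it assumes a zero-based character position counted in graphemes.

```elixir
defmodule PositionSketch do
  # Hypothetical helper for illustration only; not part of Lexical's API.
  # Assumes `char_position` is the zero-based number of characters
  # (graphemes) that precede the position on the line.

  # Number of UTF-8 code units (bytes) before the character position.
  def utf8_offset(line, char_position) do
    line
    |> String.slice(0, char_position)
    |> byte_size()
  end

  # Number of UTF-16 code units before the character position.
  def utf16_offset(line, char_position) do
    line
    |> String.slice(0, char_position)
    # Re-encode the prefix as UTF-16; each code unit is two bytes.
    |> :unicode.characters_to_binary(:utf8, :utf16)
    |> byte_size()
    |> div(2)
  end
end

line = "a😀b"

# The character position of "b" is 2, but the three units disagree:
IO.inspect(PositionSketch.utf8_offset(line, 2))   # 5 (1 byte + 4 bytes)
IO.inspect(PositionSketch.utf16_offset(line, 2))  # 3 (1 unit + a surrogate pair)
```

Treating the character position 2 as a UTF-8 code unit offset would point into the middle of the emoji's four bytes, which is the kind of incorrect conversion described above. For a line containing only ASCII all three values are equal, so the problem only shows up once a multi-byte character precedes the position.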

Two existing tests covered this code path, but they appear to have been asserting the incorrect behavior.

I'm marking this PR as draft for now because I'm also going to delete some CodeUnit conversion code that we should never be using (like CodeUnit.to_utf16) in a separate commit.

Fix conversion of char positions to UTF-16 code units #719