suhr closed 5 years ago
In rust-analyzer, we maintain a separate index to translate UTF-8 offsets into (invalid) UTF-16 line/column positions, as per LSP:
https://github.com/rust-analyzer/rust-analyzer/blob/fcdb387f0d7e76f325a858e4463efd5d7ed3efc3/crates/ra_ide_api/src/line_index.rs
https://github.com/rust-analyzer/rust-analyzer/blob/fcdb387f0d7e76f325a858e4463efd5d7ed3efc3/crates/ra_ide_api/src/db.rs#L61-L68
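For readers who don't want to follow the links, here is a rough sketch of what such an index could look like. It is not the actual `line_index.rs` code: the names `LineIndex`, `LineColUtf16` and `line_col` are made up for illustration, and it assumes the original text is still available at lookup time. The idea is only that byte offsets of line starts are stored once, and the UTF-16 column is computed on demand.

```rust
/// Hypothetical, simplified line index (not rust-analyzer's real type).
pub struct LineIndex {
    /// Byte offset at which each line starts; line 0 always starts at 0.
    line_starts: Vec<usize>,
}

/// Zero-based position whose column is counted in UTF-16 code units.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct LineColUtf16 {
    pub line: usize,
    pub col: usize,
}

impl LineIndex {
    /// Scans the text once and records where every line begins.
    pub fn new(text: &str) -> LineIndex {
        let mut line_starts = vec![0];
        for (i, b) in text.bytes().enumerate() {
            if b == b'\n' {
                line_starts.push(i + 1);
            }
        }
        LineIndex { line_starts }
    }

    /// Translates a UTF-8 byte offset into a line and UTF-16 column.
    /// Assumption: `text` is the same string passed to `new`, and
    /// `offset` lies on a char boundary (otherwise slicing panics).
    pub fn line_col(&self, text: &str, offset: usize) -> LineColUtf16 {
        // Number of line starts <= offset, minus one, is the line number.
        let line = self.line_starts.partition_point(|&start| start <= offset) - 1;
        let line_start = self.line_starts[line];
        // Sum the UTF-16 length of every char between line start and offset.
        let col: usize = text[line_start..offset]
            .chars()
            .map(char::len_utf16)
            .sum();
        LineColUtf16 { line, col }
    }
}
```

With this sketch the syntax tree keeps working purely in absolute offsets, and the conversion to line/column happens only at the boundary where a position is reported to the editor.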
A separate index sounds somewhat inconvenient. By the way, why UTF-16?
> By the way, why UTF-16?

LSP requires UTF-16.
This is actually one of the main reasons why a separate index makes sense: there's no universal definition of line/column. For some editors a column is counted in UTF-16 code units (VS Code), for some in Unicode characters (Emacs), and I bet for others it could be grapheme clusters as well.
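To make the difference concrete, here is a small sketch that computes the column of the same byte offset under three definitions. The `Columns` struct and `columns` function are hypothetical names used only for this example. Grapheme clusters are left out because the standard library has no segmentation support for them.

```rust
/// Column of one position within a line, under three definitions.
#[derive(Debug, PartialEq, Eq)]
pub struct Columns {
    /// Raw UTF-8 byte offset from the start of the line.
    pub utf8_bytes: usize,
    /// UTF-16 code units, which is what LSP asks for.
    pub utf16_units: usize,
    /// Unicode scalar values ("characters" in the loose sense).
    pub code_points: usize,
}

/// Assumption: `offset` is a byte offset into `line` that lies on a
/// char boundary; otherwise the slice below panics.
pub fn columns(line: &str, offset: usize) -> Columns {
    let prefix = &line[..offset];
    Columns {
        utf8_bytes: prefix.len(),
        utf16_units: prefix.encode_utf16().count(),
        code_points: prefix.chars().count(),
    }
}
```

For a line such as `"a𝄞b"`, the letter `b` sits at byte offset 5, which is UTF-16 column 3 and code-point column 2, so three consumers can legitimately disagree about where it is.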
It seems like `SyntaxNode` only has `TextRange`, which contains only information about absolute offsets. But how do you handle line/column ranges (necessary for printing errors)?