jackguo380 / vim-lsp-cxx-highlight

Vim plugin for C/C++/ObjC semantic highlighting using cquery, ccls, or clangd
MIT License

Chinese characters will cause incorrect highlighting #51

Open chrisniael opened 3 years ago

chrisniael commented 3 years ago

Describe the bug

If a line of code contains a string with Chinese characters, the highlighting of that line is incorrect.

See line 7 of the screenshot for details

To Reproduce

#include <iostream>

class A {
 public:
  void dump() const {
    std::cout << "value=" << this->n_ << std::endl;
    std::cout << "数值=" << this->n_ << std::endl;
  }

 private:
  int n_;
};

int main() { A a; }

Expected behavior

Line 6 of the screenshot is what I expected.

Screenshots

image

Configuration (Fill this out):

Log File:

Wed 13 Jan 2021 02:55:10 PM CST: lsp_cxx_hl beginning initialization...
Wed 13 Jan 2021 02:55:10 PM CST: vim-lsp not detected
Wed 13 Jan 2021 02:55:10 PM CST: LanguageClient-neovim not detected
Wed 13 Jan 2021 02:55:10 PM CST: coc.nvim successfully registered
Wed 13 Jan 2021 02:55:10 PM CST: nvim-lsp not detected
Wed 13 Jan 2021 02:55:14 PM CST: textprop nvim notify symbols for main.cpp
Wed 13 Jan 2021 02:55:14 PM CST: hl_symbols (textprop nvim) highlighted 16 symbols in file main.cpp
Wed 13 Jan 2021 02:55:14 PM CST: operation hl_symbols (textprop nvim) main.cpp took   0.004213s to complete

jackguo380 commented 3 years ago

Hi, I have reproduced the bug and unfortunately I'm not really sure I can solve it in vim-lsp-cxx-highlight.

The problem is that the column positions sent to the plugin are character offsets, which do not match vim's column numbers. If you hover over the first Chinese character you'll see that it starts at column 19, the second character starts at column 21, and the = starts at column 23. As a result, the highlighting for everything after those characters on the same line is shifted.

This looks to be caused by multi-byte support (:h mbyte.txt) allowing wide characters to take up 2 columns. It may have something to do with vim trying to stay consistent with how the terminal renders characters. There may be a setting to change that, but after digging through the mbyte help page I can't figure out how to change it.

Both the vim and nvim highlight APIs use byte-based positions, which makes this hard to fix in code, since there's no efficient way of converting a character position to a byte position. The only thing I could think of is scanning the line and figuring it out from that, but it would be very slow and complicated to do in vimscript.
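To make the "scanning the line" idea concrete, here is a minimal sketch in C++ (not the plugin's vimscript) of converting a 0-based character index into a byte offset. It assumes the line is valid UTF-8 and that the incoming positions count whole characters (code points); the function name `char_to_byte_offset` is made up for illustration and is not part of the plugin.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the byte offset of the character at the
// 0-based index `char_idx` in the UTF-8 encoded string `line`.
// If `char_idx` is at or past the end of the line, returns line.size().
// Assumes `line` is valid UTF-8 and that one "character" is one code point.
std::size_t char_to_byte_offset(const std::string& line, std::size_t char_idx) {
    std::size_t byte = 0;
    std::size_t chars_seen = 0;
    while (byte < line.size() && chars_seen < char_idx) {
        ++byte;  // step over the lead byte of the current character
        // Skip any continuation bytes, which have the bit pattern 10xxxxxx.
        while (byte < line.size() &&
               (static_cast<unsigned char>(line[byte]) & 0xC0) == 0x80) {
            ++byte;
        }
        ++chars_seen;
    }
    return byte;
}
```

For the reproduction line above, the `=` after the two Chinese characters is at character index 20 but at byte offset 24, because each of the two preceding characters occupies 3 bytes in UTF-8. The scan is linear in the length of the line, so the cost is paid once per converted position.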

I think the best option is to try to figure out the multi-byte settings in nvim. Maybe there are channels where people know more about this, or you can open an issue on vim/nvim to ask. Sorry that I'm not able to help with this.

As a side note, I can also reproduce a similar problem with LanguageClient's error highlighting: image

Kamilcuk commented 3 years ago

> The only thing I could think of is scanning the line and figuring it out from that, but it would be very slow and complicated to do in vimscript.

Ok, so I did some research, read this https://github.com/neovim/neovim/issues/6161, and then I just tried this:

--- i/autoload/lsp_cxx_hl/textprop_nvim.vim
+++ w/autoload/lsp_cxx_hl/textprop_nvim.vim
@@ -19,8 +19,12 @@ function! s:buf_add_hl(buf, ns_id, hl_group,
     " single line symbol
     if a:s_line == a:e_line
         if a:e_char - a:s_char > 0
+            let line = getbufline(a:buf, a:s_line + 1)[0]
             call nvim_buf_add_highlight(a:buf, a:ns_id, a:hl_group,
-                        \ a:s_line, a:s_char, a:e_char)
+                        \ a:s_line,
+                        \ byteidx(line, a:s_char),
+                        \ byteidx(line, a:e_char))
             return
         else
             return

And it works. Works just fine. @jackguo380, could you please advise where in the code base I should put this transformation, so that the vim part and the other nvim_buf_add_highlight calls are affected too? My instinct is to put it close to the source, where the offsets are received, but maybe that's not the best place. If I put something along the lines of map(a:symbols, transform_offsets) inside lsp_cxx_hl#hl#notify_symbols, will that be fine?

jackguo380 commented 3 years ago

Hey @Kamilcuk

That's great that you found a potential solution. Could you open a PR with the code so I can review it?

Just some suggestions:

Thanks for looking into this.