Closed suliveevil closed 4 years ago
I'm not sure what you mean. Can you provide an example and say what you expect to be highlighted based on your cursor position?
Treat each character in the range [\u4e00-\u9fa5] as a word. Then we can highlight every occurrence of the same single character in the range [\u4e00-\u9fa5].
Chinese differs from English in that it has no delimiters between words. So we can treat the smallest textobj in Chinese, i.e. a single character in the range [\u4e00-\u9fa5], as a word. Then we can highlight the same pseudo-word as usual.
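To make the request concrete, here is a minimal Python sketch of the behaviour described above: if the character under the cursor falls in [\u4e00-\u9fa5], every occurrence of that same character is reported. The function name `same_char_spans` and the 0-based `cursor` index are illustrative assumptions, not part of vim-illuminate.

```python
from __future__ import annotations

import re

# Assumption: the range quoted in this thread, U+4E00..U+9FA5.
CJK_CHAR = re.compile(r"[\u4e00-\u9fa5]")


def same_char_spans(text: str, cursor: int) -> list[tuple[int, int]]:
    """Return (start, end) spans of every occurrence of the character under
    the cursor, or [] if the cursor is out of range or not on a character
    in the CJK range above."""
    if not 0 <= cursor < len(text):
        return []
    ch = text[cursor]
    if not CJK_CHAR.fullmatch(ch):
        return []
    # Each single character is treated as its own "pseudo-word".
    return [(m.start(), m.end()) for m in re.finditer(re.escape(ch), text)]


if __name__ == "__main__":
    # Cursor on the first "文": both occurrences are reported.
    print(same_char_spans("中文和英文不同", 1))
```

With the cursor on index 1, the two spans covering "文" come back; with the cursor on an ASCII character the result is empty.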
If we have this match pattern, we can even do more with it: use text segmentation to highlight other textobj.
"use text segmentation to highlight other textobj."
What does this mean?
As for matching specific Unicode characters as whole words: vim-illuminate uses \k, which is controlled by 'iskeyword' (see :h 'iskeyword'), and AFAIK that option doesn't support Unicode characters. vim-illuminate loosely highlights the same thing matched by the motion iw (try viw on your text and you will see what it highlights). To support matching specific Unicode characters, I would likely add an option to match against a regex pattern along the lines of [a-z], but with Unicode characters. However, I'm not sure about adding this; I'm going to think about it for a couple of days.
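As a rough illustration of what such a regex option could look like, the Python sketch below finds the token covering the cursor for a configurable character class. This is only an assumption about the shape of the idea: `token_at` and `char_class` are hypothetical names, and the ASCII default class merely stands in for Vim's \k, which behaves differently.

```python
from __future__ import annotations

import re


def token_at(text: str, cursor: int, char_class: str = r"[A-Za-z0-9_]") -> str | None:
    """Return the maximal run of characters matching char_class that covers
    the cursor, or None if the cursor is not on such a character.

    char_class is a hypothetical user option holding a regex character class.
    """
    for m in re.finditer(f"{char_class}+", text):
        if m.start() <= cursor < m.end():
            return m.group()
    return None


if __name__ == "__main__":
    text = "foo 中文 bar"
    print(token_at(text, 1))                       # default ASCII class
    print(token_at(text, 4))                       # CJK not covered by default
    print(token_at(text, 4, r"[\u4e00-\u9fa5]"))   # class from this thread
```

Note that a class like this matches a whole run of adjacent characters, so highlighting one character at a time would additionally need the pattern applied without the `+` quantifier.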
Text segmentation should be done by tools like jieba or other natural language grammar checkers.
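To show what segmentation-based highlighting might involve, here is a small Python sketch. It does not use jieba; a toy greedy forward-maximum-matching segmenter with a hand-supplied vocabulary stands in for a real segmenter. The names `segment`, `same_token_spans`, and `vocab` are assumptions made up for this illustration.

```python
from __future__ import annotations


def segment(text: str, vocab: set[str]) -> list[tuple[str, int, int]]:
    """Greedy forward maximum matching: return (token, start, end) triples.

    A toy stand-in for a real segmenter; unknown text falls back to
    single-character tokens.
    """
    max_len = max((len(w) for w in vocab), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append((piece, i, i + size))
                i += size
                break
    return tokens


def same_token_spans(text: str, cursor: int, vocab: set[str]) -> list[tuple[int, int]]:
    """Return spans of every token equal to the token under the cursor."""
    tokens = segment(text, vocab)
    current = next((t for t, s, e in tokens if s <= cursor < e), None)
    if current is None:
        return []
    return [(s, e) for t, s, e in tokens if t == current]


if __name__ == "__main__":
    vocab = {"中文", "英文"}  # hypothetical dictionary
    print(same_token_spans("中文和英文和中文", 1, vocab))
```

The point of the sketch is only that highlighting then works on segmenter tokens rather than on single characters; the quality of the result depends entirely on the segmenter used.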
"Text segmentation should be done by tools like jieba or other natural language grammar checkers."
For the record, I was confused by "textobj" (not by the first part of your previous sentence), because a text object means something very different from simply the tokens in a document. For example, vim-illuminate to an extent highlights the text object iw, whereas there are no text objects for single Unicode characters.
You are right. Thank you very much. I realized that I should write a custom textobj that cooperates with vim-illuminate; that's the Vim way. I will learn to write plugins.
Would you please add multibyte character support? Highlighting every occurrence of the same single character in the range [\u4e00-\u9fa5] is all I need. Thank you very much.