Support non-latin characters

Saghen / blink.cmp

Performant, batteries-included completion plugin for Neovim

MIT License

I found that changing REGEX in lua/blink/cmp/fuzz/lib.rs to r"\w{2,32}" seems to fix completions for non-ASCII words in the buffer. Originally, the get_words function would split aaaaaa aaaaáaaa into {'aaaaaa', 'aaaa', 'aaa'} (skipping over á); now it splits it into {'aaaaaa', 'aaaaáaaa'}.

Is there a reason why this approach would not work? (Is it too expensive to use Unicode-aware regexes?)

EDIT: r"[\w-]{2,32}" works better if you also want to complete words that contain hyphens (it turns out I am working on a Spanish-language Typst document in which many terms include hyphens, so I needed that).
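
To make the difference between the three behaviours concrete, here is a minimal std-only sketch. It does not use the regex crate and is not the blink.cmp code: split_words and the three predicates are hypothetical names, and the ASCII-only predicate is an assumption about the old behaviour, since the original REGEX is not quoted in this issue. Note also that char::is_alphanumeric only approximates the regex crate's Unicode-aware \w, which additionally covers combining marks and connector punctuation.

```rust
// Std-only approximation of the word splitting discussed in this issue.
// All names below are hypothetical; none of this is blink.cmp's actual code.

/// Mimics a greedy `X{2,32}` match: every maximal run of word characters is
/// cut into pieces of at most 32 chars, and pieces shorter than 2 are dropped.
fn split_words(text: &str, is_word_char: fn(char) -> bool) -> Vec<String> {
    let mut words = Vec::new();
    // Splitting on non-word characters yields the maximal runs (possibly empty).
    for run in text.split(|c: char| !is_word_char(c)) {
        let chars: Vec<char> = run.chars().collect();
        for piece in chars.chunks(32) {
            if piece.len() >= 2 {
                words.push(piece.iter().collect::<String>());
            }
        }
    }
    words
}

/// Assumed stand-in for the old ASCII-only behaviour.
fn ascii_word(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// Rough stand-in for the Unicode-aware `\w`.
fn unicode_word(c: char) -> bool {
    c.is_alphanumeric() || c == '_'
}

/// Rough stand-in for `[\w-]`.
fn unicode_word_or_hyphen(c: char) -> bool {
    unicode_word(c) || c == '-'
}
```

With ascii_word, the input aaaaaa aaaaáaaa breaks at the á; with unicode_word it stays as two words; with unicode_word_or_hyphen a hyphenated term is also kept in one piece rather than split at the hyphen.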

Support non-latin characters #130