kuanyui closed this issue 10 years ago
Same issue with öüäß and other diacritics. I don't know exactly what the general fix should look like (e.g. should there be a single space between every non-Chinese and Chinese character?), but using (category latin) instead of (in "[a-zA-Z0-9]") in the regex definitions seems reasonable if you're using the Latin alphabet, see https://github.com/Ferada/pangu-spacing/commit/4a140aa23a6b056acbcfe967c458f66412dea45a. That commit also uses (category chinese-two-byte), because at least á is included in the Chinese character class, but I'm assuming this isn't particularly helpful for this mode.
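For anyone who wants to compare the two character classes directly, here is a minimal sketch. The demo-* names are made up for illustration and are not part of pangu-spacing. Note that the ASCII class is written without the enclosing brackets, because rx's (in ...) form takes the set contents directly; that is my assumption about what the original definition intends.

```elisp
(require 'rx)

;; ASCII-only class, as discussed in the thread (assumed intent).
(defconst demo-ascii-class (rx (in "a-zA-Z0-9")))

;; Category-based class; `latin' is one of the category names
;; that `rx' accepts in a (category ...) form.
(defconst demo-latin-class (rx (category latin)))

(defun demo-class-matches-p (regexp char)
  "Return t if REGEXP matches the single character CHAR, else nil."
  (and (string-match-p regexp (char-to-string char)) t))

;; Evaluate these one by one (e.g. with C-x C-e) to compare the classes.
(demo-class-matches-p demo-ascii-class ?e)   ; plain ASCII letter
(demo-class-matches-p demo-ascii-class ?é)   ; accented letter, ASCII class
(demo-class-matches-p demo-latin-class ?é)   ; accented letter, category class
```

The last form is the interesting one: its result depends on which categories your Emacs assigns to é.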
I think the general fix is to put a single space between every non-Chinese and Chinese character. Using (category latin) is reasonable. However, after I replaced (in "[a-zA-Z0-9]") with (category latin) in the regex, it seems that if there is already a space between a Chinese and a non-Chinese character, pangu-spacing-mode still adds a duplicate space between them.
I'll find a solution and fix this issue. Thanks :)
Using (category latin) may also make pangu-spacing-mode take more time to generate the virtual spaces. I'll fix all of this and then update the mode.
In the end, I only used (category chinese-two-byte) to prevent this issue, since using (category latin) makes this mode take more time to parse the buffer, which is not acceptable.
Some accented letters are handled incorrectly. For example,
Frédéric Chopin
will be converted to
Fr é d é ric Chopin
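To see why a character such as é gets picked up, it can help to list the categories Emacs assigns to it. A small sketch using the built-in char-category-set and category-set-mnemonics; the helper name demo-char-categories is made up for illustration.

```elisp
(defun demo-char-categories (char)
  "Return the category mnemonics of CHAR as a string, one letter per category."
  (category-set-mnemonics (char-category-set char)))

;; Compare an accented letter, a plain ASCII letter and a Chinese character.
(demo-char-categories ?é)
(demo-char-categories ?e)
(demo-char-categories ?中)
```

M-x describe-categories shows what each mnemonic letter stands for, and C-u C-x = on a character in a buffer shows the same information interactively.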