Problem
I noticed while testing #72 that this piece of code:
was not actually tested anywhere.
Solution
This is my attempt at adding a couple of tests for this section of code.
Apologies and Disclaimer
I am not a native Chinese, Japanese, Korean or Vietnamese speaker, and this was a best guess based on the official Unicode table and the Cambridge English <-> Chinese (Simplified) dictionary. If anything here is not right, please feel free to edit the PR and/or leave feedback here! The aim is to keep any regression from being introduced in the future.
Question
I also noticed that the Unicode tables go all the way to \u9fff for CJKV characters/ideographs. Should we expand the scope of the chineseRegex to match this?
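To make the question concrete, here is a rough sketch of what a widened pattern could look like. It assumes the CJK Unified Ideographs block runs from \u4e00 to \u9fff; the names cjkUnifiedRegex and containsCjkIdeograph are hypothetical and only for illustration, and the project's actual chineseRegex may use a different range or flags.

```typescript
// Hypothetical pattern, not the project's chineseRegex: it covers the whole
// CJK Unified Ideographs block (U+4E00 to U+9FFF).
const cjkUnifiedRegex: RegExp = /[\u4e00-\u9fff]/;

// Hypothetical helper: true if the text contains at least one code point
// from that block. No global flag is used, so repeated calls are stateless.
function containsCjkIdeograph(text: string): boolean {
  return cjkUnifiedRegex.test(text);
}

console.log(containsCjkIdeograph("你好")); // true
console.log(containsCjkIdeograph("hello")); // false
console.log(containsCjkIdeograph("\u9fff")); // true: last code point in the block
console.log(containsCjkIdeograph("こんにちは")); // false: hiragana lies outside this block
```

Note that this range only covers ideographs. Kana and Hangul live in other blocks, so whether they should match depends on what the regex is meant to detect.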