chinese-words-separator / chinese-words-separator.github.io


Wrong conversion (Simplified to Traditional characters) #4

Closed Drazduf closed 1 year ago

Drazduf commented 1 year ago

CWS misconverted 「法」 to 「灋」 when it is treated as a standalone character.

chinese-words-separator commented 1 year ago

Hmm.. I put guards in place to prevent over-conversion. For example, when converting from simplified to traditional, 后 should not automatically be converted to 後 (back): the traditional 后 (queen) is still 后 in simplified, while the traditional 後 (back) was mapped to 后 when China simplified it. So a simplified 后 can stand for either queen or back.

On websites with simplified content, 后 is ambiguous when it is a stand-alone character (especially on websites that show stand-alone characters, such as SRS websites) or when there is no surrounding context. Hence the stand-alone 后 should not be converted to 後 when converting from simplified to traditional, which CWS implements correctly. Sadly, Google Translate converts 后 to 後 when converting from simplified to traditional. It should not do that, since a standalone 后 is ambiguous on a website that uses simplified characters. But I digress.
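To make the guard described above concrete, here is a minimal sketch, not CWS's actual code. The tables `UNAMBIGUOUS`, `PHRASES`, and `AMBIGUOUS` and the function `s2t` are hypothetical names with tiny sample data, assumed only for illustration. An ambiguous character is converted only when a known phrase supplies context; otherwise it is left untouched.

```python
# Hypothetical sample tables; real conversion data would be far larger.
UNAMBIGUOUS = {"国": "國", "马": "馬"}          # one-to-one simplified -> traditional
PHRASES = {"以后": "以後", "后面": "後面", "皇后": "皇后"}  # context decides 后 vs 後
AMBIGUOUS = {"后"}                               # never converted without context
MAX_PHRASE = max(len(p) for p in PHRASES)


def s2t(text: str) -> str:
    """Convert simplified to traditional, leaving context-free ambiguous characters as-is."""
    out = []
    i = 0
    while i < len(text):
        # Try the longest known phrase starting at position i (length >= 2).
        for length in range(min(MAX_PHRASE, len(text) - i), 1, -1):
            chunk = text[i:i + length]
            if chunk in PHRASES:
                out.append(PHRASES[chunk])
                i += length
                break
        else:
            ch = text[i]
            if ch in AMBIGUOUS:
                out.append(ch)                   # guard: no context, no conversion
            else:
                out.append(UNAMBIGUOUS.get(ch, ch))
            i += 1
    return "".join(out)


print(s2t("后"))    # stand-alone character stays unchanged
print(s2t("以后"))  # phrase context allows the conversion
```

With this approach a stand-alone 后 on an SRS-style page is preserved, while 后 inside a recognized phrase is still converted.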


As for 法 being converted to 灋, it looks like both you and Google Translate are correct. A character should not be converted if the target of the conversion is just a variant of it.
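One way such a rule could be enforced is to prune the mapping table before use. This is only a sketch under assumptions: `RAW_S2T`, `VARIANTS_OF`, `prune_variant_targets`, and `convert_char` are hypothetical names, and the two-entry table merely imitates the reported behavior.

```python
# Hypothetical raw table; the 法 -> 灋 entry imitates the reported misconversion.
RAW_S2T = {"国": "國", "法": "灋"}

# Hypothetical variant table: variant form -> standard form.
VARIANTS_OF = {"灋": "法"}


def prune_variant_targets(mapping: dict, variants_of: dict) -> dict:
    """Drop entries whose target is merely a variant of the source character."""
    return {
        src: dst
        for src, dst in mapping.items()
        if variants_of.get(dst) != src
    }


SAFE_S2T = prune_variant_targets(RAW_S2T, VARIANTS_OF)


def convert_char(ch: str) -> str:
    # Characters without a safe mapping pass through unchanged.
    return SAFE_S2T.get(ch, ch)


print(convert_char("法"))  # left as-is because 灋 is only a variant
print(convert_char("国"))  # genuine simplified -> traditional mapping still applies
```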

Thanks for reporting this

chinese-words-separator commented 1 year ago

Corrected now. The fix will be released in version 8.24.84.701.

Google needs to process the pending version (8.24.84.700) before it will accept another one for publishing.
