Open DonaldTsang opened 5 years ago
This repository is used by the IRG (Ideographic Rapporteur Group) to reduce the possibility of encoding existing variants. The main purpose of this dataset is fuzzy matching. The dataset covers all encoded CJK Ideographs, i.e. the URO through Extension F (80,000+ characters).
The aim and coverage are different from those of chaizi, and the principles and targets for decomposing characters are different. A cross-check will probably not yield substantial benefit to the processes of the IRG.
@hfhchan in that case, maybe add a footnote about other "Chinese decomposition libraries" and how they differ from CJKVI?
Is it possible to do a comparison with https://github.com/kfcd/chaizi? Or add a note in the README?