unicode-org / unicodetools

home of unicodetools and https://util.unicode.org JSPs

SegmenterDefault.txt: more remapping, less renaming #970

Closed eggrobin closed 2 days ago

eggrobin commented 1 week ago

Follow-up to https://github.com/unicode-org/unicodetools/pull/949: this uses remap rules wherever the UAXes do, and drops names that are now useless, such as ZWJ_O and CM1.

This technically removes all of the examples cited in UTC-155-A89 (as worded in SD2, « Document extra classes used for testing characters in the segmentation test HTML files for 11.0. [E.g. ZWJ_FE, CM1_CM, etc.] (Retargeted for 13.0, 14.0, 15.0.) », not as recorded in the minutes). However, that action item should remain open until all nontrivial variable definitions are shown in the generated HTML files.

The MeowBreakTest files are hard to diff, but I have tried them in ICU: they are still correct.
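
For reviewers who want a second check besides ICU, here is a minimal sketch of a comment-insensitive comparison. It assumes (this is not stated above) that each test line carries its data before a `#` and an explanatory comment after it, and that the diff noise comes mostly from class names in those comments. The function names and the sample lines are hypothetical.

```python
def strip_comment(line: str) -> str:
    """Drop everything from '#' onward and collapse runs of whitespace."""
    return " ".join(line.split("#", 1)[0].split())


def break_data(lines):
    """Return the non-empty data part of each line, in file order."""
    return [data for data in map(strip_comment, lines) if data]


def changed_tests(old_lines, new_lines):
    """Return (removed, added): data lines found in only one version."""
    old = set(break_data(old_lines))
    new = set(break_data(new_lines))
    return sorted(old - new), sorted(new - old)


if __name__ == "__main__":
    # Hypothetical sample: same test data, different class name in the comment.
    old = ["÷ 0020 × 0308 ÷\t# SPACE (Other) × COMBINING DIAERESIS (CM1_CM)"]
    new = ["÷ 0020 × 0308 ÷\t# SPACE (Other) × COMBINING DIAERESIS (CM)"]
    removed, added = changed_tests(old, new)
    print("removed:", removed)
    print("added:", added)
```

If both lists come back empty, the two versions contain the same test cases and differ only in comments or ordering. This does not replace running the tests in an implementation such as ICU.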