syvb closed this 11 months ago
I originally described a categorization issue with - it turns out the Unicode data files are correct; I was just using outdated ones. Oops. I kept the tests that verify (and the Syriac abbreviation mark) are categorized correctly.
Adds Unicode 15.1 support.
Updating tests
It turns out scripts/unicode_gen_breaktests.py was last run for Unicode 11; every subsequent updater forgot to run it. I updated the GitHub Action that checks that scripts/unicode.py was run so that it also checks that scripts/unicode_gen_breaktests.py was run.
Devanagari mis-segmentation
There are a few cases where Devanagari grapheme segmentation fails after updating the test data from Unicode 11 to Unicode 15. I just skipped those failing tests for now.
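For anyone triaging the skipped cases, here is a minimal sketch of how one line of the Unicode break test data can be read. It assumes the standard GraphemeBreakTest.txt layout, where ÷ marks a boundary, × marks "no boundary", and everything after # is a comment. The function name parse_break_test_line is hypothetical and is not taken from scripts/unicode_gen_breaktests.py.

```python
from typing import List


def parse_break_test_line(line: str) -> List[List[int]]:
    """Split one break-test line into the grapheme clusters it expects.

    Hypothetical helper for illustration only; assumes the
    '÷ XXXX × XXXX ÷ # comment' layout of GraphemeBreakTest.txt.
    """
    # Drop the trailing comment, if any.
    data = line.split("#", 1)[0].strip()
    if not data:
        return []

    clusters: List[List[int]] = []
    current: List[int] = []
    for token in data.split():
        if token == "÷":
            # Boundary: close the cluster collected so far.
            if current:
                clusters.append(current)
                current = []
        elif token == "×":
            # No boundary: the next code point joins the current cluster.
            continue
        else:
            # Code points are written as hexadecimal scalar values.
            current.append(int(token, 16))
    if current:
        clusters.append(current)
    return clusters


if __name__ == "__main__":
    # Illustrative line: SPACE + COMBINING DIAERESIS, then SPACE.
    print(parse_break_test_line("÷ 0020 × 0308 ÷ 0020 ÷\t# illustrative"))
```

Comparing the clusters returned here with what the segmenter actually produces for the same code points should show which boundary each skipped Devanagari test disagrees on.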