unicode-rs / unicode-normalization

Unicode Normalization forms according to UAX#15 rules
https://unicode-rs.github.io/unicode-normalization

Fix `is_public_assigned` to include Hangul Syllable and other ranges. #80

Closed sunfishcode closed 3 years ago

sunfishcode commented 3 years ago

Hangul Syllables and several other ranges are defined in UnicodeData.txt as just their first and last values:

AC00;<Hangul Syllable, First>;Lo;0;L;;;;;N;;;;;
D7A3;<Hangul Syllable, Last>;Lo;0;L;;;;;N;;;;;

Teach the unicode.py script to recognize these First/Last pairs, so that it classifies every code point between the two endpoints as assigned for the is_public_assigned predicate.
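
For illustration, here is a minimal sketch of how such First/Last pairs could be expanded while parsing. This is not the actual code in unicode.py: the function names `assigned_ranges` and `is_assigned` are hypothetical, and the sketch assumes the input lines are in ascending code point order, as they are in UnicodeData.txt. It also ignores general categories, so any filtering of private-use or surrogate ranges that a "public" predicate would need is left out.

```python
def assigned_ranges(lines):
    """Return merged, inclusive (first, last) ranges of assigned code points.

    Hypothetical helper, not the real unicode.py code. Assumes `lines` are
    UnicodeData.txt records in ascending code point order.
    """
    ranges = []
    range_start = None  # set while between a "First" and its "Last" record
    for line in lines:
        line = line.strip()
        if not line:
            continue
        fields = line.split(";")
        code = int(fields[0], 16)
        name = fields[1]
        if name.startswith("<") and name.endswith(", First>"):
            # Remember where the range begins; the next record closes it.
            range_start = code
            continue
        if name.startswith("<") and name.endswith(", Last>"):
            if range_start is None:
                raise ValueError("'Last' record without a matching 'First'")
            first, range_start = range_start, None
        else:
            # Ordinary record: a range of exactly one code point.
            first = code
        if ranges and ranges[-1][1] + 1 == first:
            # Adjacent to the previous range, so extend it.
            ranges[-1] = (ranges[-1][0], code)
        else:
            ranges.append((first, code))
    return ranges


def is_assigned(ranges, cp):
    """Check whether code point `cp` falls inside any of `ranges`."""
    return any(first <= cp <= last for first, last in ranges)


sample = [
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "0042;LATIN CAPITAL LETTER B;Lu;0;L;;;;;N;;;;0062;",
    "AC00;<Hangul Syllable, First>;Lo;0;L;;;;;N;;;;;",
    "D7A3;<Hangul Syllable, Last>;Lo;0;L;;;;;N;;;;;",
]
hangul_ranges = assigned_ranges(sample)
```

With the two Hangul records above, every code point from U+AC00 through U+D7A3 ends up inside a single range, rather than only the two endpoints being treated as assigned.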