Use icu.UnicodeSet instead of custom code in tools/readers.py parse_unicode_set

rosettatype / hyperglot

Hyperglot: a database and tools for detecting language support in fonts

http://hyperglot.rosettatype.com

GNU General Public License v3.0


Use icu.UnicodeSet instead of custom code in tools/readers.py parse_unicode_set #54

Status: Closed (twardoch closed 5 months ago)

twardoch commented 3 years ago

If you add PyICU to requirements, then you can rewrite parse_unicode_set from tools/readers.py trivially:

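import icu  # provided by the PyICU package

def parse_unicode_set(s):
    # Iterating a UnicodeSet yields its members as strings, so list() is not needed.
    return sorted(icu.UnicodeSet(s))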

kontur commented 3 years ago

Super handy, thanks! A lot of the tools/ scripts were for the initial data aggregation, but should we re-use some of that parsing, I'll update it. The signatures of the two are a bit different, and I think paths have changed since, so... yeah :)

But we're already using icu in the comparison code for CLDR/SLDR, so thanks for the pointer 👍

Leaving this open as a reminder.
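For readers wondering what a UnicodeSet-based comparison like the CLDR/SLDR one mentioned above could look like, here is a minimal sketch. It is not Hyperglot's actual comparison code: the helper names `unicode_set_members` and `compare_unicode_sets` are hypothetical, the sample patterns are made up, and treating PyICU as an optional import is an assumption made only so the snippet fails gracefully when the package is missing.

```python
try:
    import icu  # provided by the PyICU package
except ImportError:  # treated as optional in this sketch (assumption)
    icu = None


def unicode_set_members(pattern):
    """Expand a UnicodeSet pattern such as "[a-c]" into a Python set of strings."""
    if icu is None:
        raise RuntimeError("PyICU is required to expand UnicodeSet patterns")
    # Iterating an icu.UnicodeSet yields its members as strings.
    return set(icu.UnicodeSet(pattern))


def compare_unicode_sets(ours, reference):
    """Return (missing, extra): members only in `reference`, members only in `ours`."""
    a = unicode_set_members(ours)
    b = unicode_set_members(reference)
    return sorted(b - a), sorted(a - b)


if __name__ == "__main__" and icu is not None:
    # Hypothetical sample patterns, not taken from CLDR or the Hyperglot data.
    missing, extra = compare_unicode_sets("[a-c]", "[b-d]")
    print("missing:", missing, "extra:", extra)
```

Converting to plain Python sets keeps the difference logic independent of ICU; `icu.UnicodeSet` also has its own set operations, which may be preferable for very large sets.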