Closed wezm closed 2 years ago
I'm not very comfortable with moving the specifics of the generation logic out of tree. It would make sense to have a shared library that can parse Unicode files and generate various kinds of tables (indeed, the various Python scripts used by the Unicode crates are largely the same code with some differences), which each crate then uses with some tailoring, but I'm not sure it makes sense to have the entire generation out of tree.
There are also a lot of Unicode crates using generation scripts like this, and we should come up with a unified plan for them. Some do a bunch more work than others; for example, unicode-script has a lot of custom logic it uses for generating the scriptextensions mappings.
That said, I'm kinda okay with it in this specific case (though we should use upstream), since the bidi tables are quite straightforward.
> though we should use upstream
My earlier PR to ucd-generate has remained open without comment for more than a month, so I decided to move forward with our forked version for now.
For now perhaps we can just land an update to 12.1 while we figure out the right strategy for all the other crates to use?
:umbrella: The latest upstream changes (presumably #56) made this pull request unmergeable. Please resolve the merge conflicts.
This PR replaces unicode.py generated tables with tables generated by the Rust tool ucd-generate and updates the generated tables to Unicode 12.1. Specifically, it uses our (YesLogic) fork of ucd-generate, which we've extended with support for generating bidi-class tables.
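
For readers unfamiliar with what "generating bidi-class tables" involves, here is a minimal sketch in Rust. It is not the code from unicode.py or ucd-generate; the function names `parse_bidi_ranges` and `bidi_class` are hypothetical. It assumes input lines in the UnicodeData.txt layout (semicolon-separated fields, field 0 the hex code point, field 4 the Bidi_Class) and ignores the `First`/`Last` range entries and the default classes for unlisted code points, both of which a real generator has to handle.

```rust
use std::cmp::Ordering;

/// Hypothetical helper: collapse per-code-point Bidi_Class values into
/// inclusive (start, end, class) ranges. Assumes the input is sorted by
/// code point, as UnicodeData.txt is.
fn parse_bidi_ranges(data: &str) -> Vec<(u32, u32, String)> {
    let mut ranges: Vec<(u32, u32, String)> = Vec::new();
    for line in data.lines() {
        let fields: Vec<&str> = line.split(';').collect();
        if fields.len() < 5 {
            continue; // skip blank or malformed lines
        }
        let cp = match u32::from_str_radix(fields[0].trim(), 16) {
            Ok(cp) => cp,
            Err(_) => continue, // skip lines without a hex code point
        };
        let class = fields[4].trim();
        // Extend the previous range when this code point is adjacent
        // to it and has the same class; otherwise start a new range.
        let extend = match ranges.last() {
            Some((_, end, last)) => *end + 1 == cp && last.as_str() == class,
            None => false,
        };
        if extend {
            ranges.last_mut().unwrap().1 = cp;
        } else {
            ranges.push((cp, cp, class.to_string()));
        }
    }
    ranges
}

/// Hypothetical lookup: binary search over the sorted range table.
/// Returns None for code points that are not covered by any range.
fn bidi_class(table: &[(u32, u32, String)], c: char) -> Option<&str> {
    let cp = c as u32;
    table
        .binary_search_by(|(start, end, _)| {
            if *end < cp {
                Ordering::Less
            } else if *start > cp {
                Ordering::Greater
            } else {
                Ordering::Equal
            }
        })
        .ok()
        .map(|i| table[i].2.as_str())
}
```

A generator would typically write the resulting ranges out as a static Rust array instead of building them at run time; the range-plus-binary-search shape is the part this sketch is meant to show.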