servo / unicode-bidi

Implementation of the Unicode Bidirectional Algorithm in Rust

Update to Unicode 12.1 with ucd-generate #53

Closed: wezm closed this pull request 1 year ago

wezm commented 4 years ago

This PR replaces the tables generated by unicode.py with tables generated by the Rust tool ucd-generate, and updates them to Unicode 12.1.

Specifically it uses our (YesLogic) fork of ucd-generate which we've extended with support for generating bidi-class tables.
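For readers unfamiliar with this kind of generated table, here is a minimal sketch of what a range-based bidi-class table and its lookup might look like. It is not the actual output of ucd-generate or of unicode.py: the names `BidiClass`, `BIDI_CLASS_TABLE`, and `bidi_class` are hypothetical, the enum lists only a handful of classes, and the table is a tiny hand-picked excerpt.

```rust
use std::cmp::Ordering;

/// Illustrative subset of bidi classes (the real property has more values).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BidiClass {
    L,
    R,
    AL,
    EN,
    WS,
}

/// Hypothetical excerpt of a generated table: inclusive code point ranges,
/// sorted by start and non-overlapping, so binary search can be used.
pub const BIDI_CLASS_TABLE: &[(char, char, BidiClass)] = &[
    ('\u{0020}', '\u{0020}', BidiClass::WS), // SPACE
    ('\u{0030}', '\u{0039}', BidiClass::EN), // ASCII digits
    ('\u{0041}', '\u{005A}', BidiClass::L),  // A-Z
    ('\u{0061}', '\u{007A}', BidiClass::L),  // a-z
    ('\u{05D0}', '\u{05EA}', BidiClass::R),  // Hebrew letters
    ('\u{0627}', '\u{0627}', BidiClass::AL), // ARABIC LETTER ALEF
];

/// Look up the class of `c` by binary search over the range table.
/// Assumption for this sketch: code points missing from the excerpt fall
/// back to `L`; real data assigns other defaults to some ranges.
pub fn bidi_class(c: char) -> BidiClass {
    let found = BIDI_CLASS_TABLE.binary_search_by(|&(lo, hi, _)| {
        if hi < c {
            Ordering::Less
        } else if lo > c {
            Ordering::Greater
        } else {
            Ordering::Equal
        }
    });
    match found {
        Ok(i) => BIDI_CLASS_TABLE[i].2,
        Err(_) => BidiClass::L,
    }
}
```

Whichever tool emits the table, the consuming code only depends on the table's shape (sorted, non-overlapping ranges), which is why swapping generators can be done without touching the lookup.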

Manishearth commented 4 years ago

I'm not very comfortable with moving the specifics of the generation logic out of tree. It would make sense to have a shared library that can parse Unicode files and generate various kinds of tables, which each crate then uses with some tailoring (indeed, the various Python scripts used by the unicode crates are largely the same code with some differences), but I'm not sure it makes sense to have the entire generation out of tree.
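As a rough illustration of the shared parsing step described above, the sketch below reads lines in the `start..end ; VALUE # comment` layout used by Unicode Character Database property files and produces sorted ranges. The function names `parse_line` and `parse_table` are hypothetical; this is not code from ucd-generate or from any of the existing Python scripts, and a real shared library would need to handle more file layouts.

```rust
/// Parse one UCD-style property line into (start, end, value).
/// Returns `None` for blank lines, comment-only lines, and malformed input.
pub fn parse_line(line: &str) -> Option<(u32, u32, String)> {
    // Everything after '#' is a comment.
    let data = line.split('#').next()?.trim();
    if data.is_empty() {
        return None;
    }
    // Fields are separated by ';': code point (or range), then the value.
    let mut fields = data.split(';').map(str::trim);
    let range = fields.next()?;
    let value = fields.next()?;
    if value.is_empty() {
        return None;
    }
    let (start, end) = match range.split_once("..") {
        Some((a, b)) => (
            u32::from_str_radix(a, 16).ok()?,
            u32::from_str_radix(b, 16).ok()?,
        ),
        None => {
            // A single code point is treated as a one-element range.
            let v = u32::from_str_radix(range, 16).ok()?;
            (v, v)
        }
    };
    Some((start, end, value.to_string()))
}

/// Parse a whole file's contents and return the ranges sorted by start,
/// ready for a crate-specific step to turn into its own table format.
pub fn parse_table(input: &str) -> Vec<(u32, u32, String)> {
    let mut rows: Vec<_> = input.lines().filter_map(parse_line).collect();
    rows.sort_by_key(|r| r.0);
    rows
}
```

Under this split, the per-crate tailoring would live in whatever consumes the `Vec` returned by `parse_table`, while the parsing itself is shared.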

Manishearth commented 4 years ago

There are also a lot of unicode crates using generation scripts like this, and we should come up with a unified plan for them. Some do a lot more work than others; for example, unicode-script has a lot of custom logic for generating the ScriptExtensions mappings.

Manishearth commented 4 years ago

That said, I'm kinda okay with it in this specific case (though we should use upstream), since the bidi tables are quite straightforward.

wezm commented 4 years ago

> though we should use upstream

My earlier PR to ucd-generate has remained open without comment for more than a month so I decided to move forward with our forked version for now.

Manishearth commented 4 years ago

For now perhaps we can just land an update to 12.1 while we figure out the right strategy for all the other crates to use?

bors-servo commented 3 years ago

:umbrella: The latest upstream changes (presumably #56) made this pull request unmergeable. Please resolve the merge conflicts.