This PR adds Unicode script extension tables and functions alongside the script tables that we already have. This is one of the features still missing for full Unicode escape support.
In addition, I have refactored the script table generation to use the `ucd-parse` crate, so we no longer have to parse the Unicode data files ourselves. We take the output from `ucd-parse` and convert the code points into our representation. This also removes the static scripts list, which makes it easier to regenerate the script tables for new Unicode versions.
If this approach looks good, I would refactor the rest of the Unicode table generation to use `ucd-parse` as well, @ridiculousfish.
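
For reviewers who want a mental model of the conversion step, here is a minimal sketch of the idea, not the code in this PR. `Record` is a hypothetical stand-in for one parsed script extension entry (it is not the `ucd-parse` type), `build_tables` and `contains` are illustrative names, and the inclusive `(start, end)` range layout is an assumption about the table representation. A code point can carry several script extensions, so each record fans out into one range per script before the ranges are sorted and merged.

```rust
use std::cmp::Ordering;
use std::collections::BTreeMap;

/// Hypothetical stand-in for one parsed script extension record:
/// an inclusive code point range plus every script it extends to.
struct Record {
    start: u32,
    end: u32,
    scripts: Vec<&'static str>,
}

/// Group records by script and produce sorted, merged inclusive ranges.
fn build_tables(records: &[Record]) -> BTreeMap<&'static str, Vec<(u32, u32)>> {
    let mut tables: BTreeMap<&'static str, Vec<(u32, u32)>> = BTreeMap::new();

    // Fan out: one range per (record, script) pair.
    for rec in records {
        for &script in &rec.scripts {
            tables.entry(script).or_default().push((rec.start, rec.end));
        }
    }

    // Sort each script's ranges and merge overlapping or adjacent ones.
    for ranges in tables.values_mut() {
        ranges.sort_unstable();
        let mut merged: Vec<(u32, u32)> = Vec::with_capacity(ranges.len());
        for &(start, end) in ranges.iter() {
            if let Some(last) = merged.last_mut() {
                if start <= last.1.saturating_add(1) {
                    last.1 = last.1.max(end);
                    continue;
                }
            }
            merged.push((start, end));
        }
        *ranges = merged;
    }

    tables
}

/// Binary search a sorted, non-overlapping range table for a code point.
fn contains(ranges: &[(u32, u32)], cp: u32) -> bool {
    ranges
        .binary_search_by(|&(start, end)| {
            if end < cp {
                Ordering::Less
            } else if start > cp {
                Ordering::Greater
            } else {
                Ordering::Equal
            }
        })
        .is_ok()
}
```

Because the script names come from the parsed records rather than a hand-maintained list, a new script in a later Unicode version would show up as a new key without any code change, which is the benefit of dropping the static scripts list.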