ocaml-community / sedlex

An OCaml lexer generator for Unicode
MIT License

Add support for unicode 16.0.0. #157

Closed toots closed 2 weeks ago

toots commented 1 month ago

This PR adds support for Unicode 16.0.0.

Notes:

There is quite a bit of noise in the diff because some modules had to be renamed so that the new "old Unicode" ml file compiles in the regression tests.

The only other change is the addition, and use, of a `to_list` no-op API to convert private `Cset` values back into lists.
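
For readers unfamiliar with the pattern, here is a minimal sketch of what a no-op `to_list` on a private type can look like. It is not the code from this PR: the module below is a hypothetical stand-in that assumes the set is represented as a list of inclusive `(low, high)` code point intervals, and `of_list` and its merging logic are illustrative only.

```ocaml
(* Hypothetical stand-in for a character-set module; not sedlex's actual code. *)
module Cset : sig
  (* Private: clients can inspect values but cannot construct them directly. *)
  type t = private (int * int) list

  val of_list : (int * int) list -> t
  val to_list : t -> (int * int) list
end = struct
  type t = (int * int) list

  (* Sort the intervals, then merge overlapping or adjacent ones.
     Assumes every interval satisfies low <= high. *)
  let of_list l =
    let sorted = List.sort compare l in
    let rec merge = function
      | (a, b) :: (c, d) :: rest when c <= b + 1 ->
          merge ((a, max b d) :: rest)
      | x :: rest -> x :: merge rest
      | [] -> []
    in
    merge sorted

  (* No-op at runtime: inside the module, t and the list type are equal. *)
  let to_list l = l
end

(* Usage: build a set, then get the plain list back. *)
let digits_and_lower = Cset.of_list [ (97, 122); (48, 57); (50, 60) ]
let as_list = Cset.to_list digits_and_lower

(* A private type abbreviation can also be coerced explicitly. *)
let coerced = (digits_and_lower :> (int * int) list)
```

Under these assumptions, `to_list` costs nothing at runtime. It gives callers a named function to use instead of writing the `:>` coercion at every call site.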

Otherwise, this is a fairly straightforward update.

pmetzger commented 1 month ago

I haven't looked at Unicode 16; what are the non-binary properties for?

toots commented 1 month ago

> I haven't looked at Unicode 16; what are the non-binary properties for?

I think that this documents it: https://www.gnu.org/software/libunistring/manual/html_node/Indic-conjunct-break.html

toots commented 2 weeks ago

OK, I have cleaned up this PR and minimized the diff. I think it is ready to merge now.

pmetzger commented 2 weeks ago

There seem to be conflicts that would prevent a merge?

toots commented 2 weeks ago

> There seem to be conflicts that would prevent a merge?

You mean git conflicts?

toots commented 2 weeks ago

Thanks, just merged!

I think we should consider a release soon.