Hejsil / mecha

A parser combinator library for Zig
MIT License

WIP & PoC: Unicode matchers #17

Closed data-man closed 3 years ago

data-man commented 3 years ago

The data is taken from the Unicode 14.0.0 beta files. I prefer to use the latest of everything. :smile:

ToDo:

data-man commented 3 years ago

@Hejsil I'm waiting for some feedback from you. :(

data-man commented 3 years ago

:sob:

Hejsil commented 3 years ago

Ooh, sorry. I think I saw the PR when it was first opened and concluded it wasn't far enough along to give it a real look. I didn't see your comment from 3 days ago. I'm very sorry.

It is quite a lot of code to maintain, and it would need to be updated to newer Unicode standards when they come out. That is not really something I want to take on. Also, what is stopping us from having the Unicode stuff in a separate library, so that people can hook it up to mecha themselves?

I think it is worthwhile to have utf8.umatcher as an API, though (maybe under a different name). This function allows an external library to easily hook up to mecha:

const basic_latin_parser = mecha.utf8.matches(fn (c: u21) bool { return unicode_lib.block(c) == .BlockLatin; });
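To make the idea concrete, here is a minimal, self-contained sketch of how a predicate-based hook like this could work. It is not mecha's actual implementation: `matches`, `Result`, and `isBasicLatin` are hypothetical names made up for illustration, and `isBasicLatin` stands in for a lookup from an external Unicode library. Zig has no anonymous function literals, so the predicate is written as a named function. The `std.unicode` calls may need adjusting depending on the Zig version.

```zig
const std = @import("std");

// Hypothetical stand-in for an external Unicode library's block lookup.
// The Basic Latin block covers U+0000 through U+007F.
fn isBasicLatin(c: u21) bool {
    return c <= 0x7F;
}

// Hypothetical result type: the matched code point and the unconsumed input.
const Result = struct {
    value: u21,
    rest: []const u8,
};

// Sketch of a `matches`-style combinator: it builds a parser that decodes one
// UTF-8 code point and succeeds only if `pred` accepts it.
fn matches(comptime pred: fn (u21) bool) fn ([]const u8) ?Result {
    return struct {
        fn parse(input: []const u8) ?Result {
            if (input.len == 0) return null;
            // Length of the UTF-8 sequence, derived from the first byte.
            const len = std.unicode.utf8ByteSequenceLength(input[0]) catch return null;
            if (input.len < len) return null;
            const cp = std.unicode.utf8Decode(input[0..len]) catch return null;
            if (!pred(cp)) return null;
            return Result{ .value = cp, .rest = input[len..] };
        }
    }.parse;
}

// The external predicate is the only Unicode-specific part.
const basic_latin = matches(isBasicLatin);
```

With this shape, the parser library only needs to know how to decode a code point, and all Unicode table data stays in the external library that supplies the predicate.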