ianh / owl

A parser generator for visibly pushdown languages.
MIT License

Feature request: Unicode properties #35

Open data-man opened 1 year ago

data-man commented 1 year ago

Owl is awesome, thank you!

My proposals:

What do you think?

ianh commented 1 year ago

Something like this would be possible, but at the moment, any two adjacent tokens can be separated by whitespace. For example, if you had a rule like ident = property(ID_Start) property(ID_Continue)*, each property(...) match would be its own token, so identifiers would include things like abc but also a b c d. The best way to make custom identifiers right now is via user-defined tokens, which involves writing a bit of code in a C function and passing it to the generated parser to use during tokenization.
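
For illustration, here is a rough sketch of the matching logic such a C function might contain. It is not Owl's API: match_identifier, decode_utf8, is_id_start and is_id_continue are hypothetical names, and the two predicates are tiny stand-ins (ASCII plus a few Latin-1 letters) for real ID_Start / ID_Continue tables. How the result gets handed to the generated parser is deliberately left out; check the user-defined tokens section of the Owl README for the actual hookup.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one UTF-8 code point from s into *cp. Returns the number of bytes
   consumed, or 0 at end of string or on malformed input. Minimal sketch:
   it does not reject overlong encodings or surrogates. */
static size_t decode_utf8(const char *s, uint32_t *cp) {
    const unsigned char *p = (const unsigned char *)s;
    size_t n;
    uint32_t c;
    if (p[0] == 0) return 0;
    if (p[0] < 0x80) { *cp = p[0]; return 1; }
    if ((p[0] & 0xE0) == 0xC0) { n = 2; c = p[0] & 0x1F; }
    else if ((p[0] & 0xF0) == 0xE0) { n = 3; c = p[0] & 0x0F; }
    else if ((p[0] & 0xF8) == 0xF0) { n = 4; c = p[0] & 0x07; }
    else return 0;
    for (size_t i = 1; i < n; i++) {
        /* A continuation byte must look like 10xxxxxx (this also stops at NUL). */
        if ((p[i] & 0xC0) != 0x80) return 0;
        c = (c << 6) | (uint32_t)(p[i] & 0x3F);
    }
    *cp = c;
    return n;
}

/* ASSUMPTION: placeholder for the Unicode ID_Start property. Only covers
   ASCII letters and the Latin-1 letters U+00C0..U+00FF (minus the
   multiplication and division signs). */
static int is_id_start(uint32_t c) {
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
           (c >= 0xC0 && c <= 0xFF && c != 0xD7 && c != 0xF7);
}

/* ASSUMPTION: placeholder for ID_Continue (the start set plus ASCII digits
   and underscore). */
static int is_id_continue(uint32_t c) {
    return is_id_start(c) || (c >= '0' && c <= '9') || c == '_';
}

/* Returns the length in bytes of the identifier at the start of s,
   or 0 if s does not begin with one. */
size_t match_identifier(const char *s) {
    uint32_t cp;
    size_t n = decode_utf8(s, &cp);
    size_t len;
    if (n == 0 || !is_id_start(cp)) return 0;
    len = n;
    while ((n = decode_utf8(s + len, &cp)) != 0 && is_id_continue(cp))
        len += n;
    return len;
}
```

Because the whole identifier is matched as a single token here, the input a b c d yields four one-character identifiers rather than one identifier with spaces inside it.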