soasme / PeppaPEG

PEG Parser in ANSI C
https://soasme.com/PeppaPEG
MIT License

[Feature Request]: Unicode Range #65

Closed soasme closed 3 years ago

soasme commented 3 years ago

Is your feature request related to a problem? Please describe.

Some programming languages, such as Go, allow Unicode letters in identifiers. Spelling out the complete set of Unicode letters in a Peppa PEG grammar is tedious and performs poorly.

Describe the solution you'd like

Extend the range syntax to support [\p{L}] in addition to the forms the current implementation accepts: [a-f] / [\u{1}-\u{10ffff}]. Since \p already carries the meaning of "range", there is no need to add a dash specifying the lower and upper bounds.
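
As a rough illustration of why a property-based range can be cheaper than listing every letter in the grammar, here is a minimal sketch of how a matcher might test a code point against [\p{L}]: a binary search over a sorted table of inclusive code point ranges. This is not Peppa PEG's implementation. The names CodepointRange, LETTER_RANGES_EXCERPT, in_ranges and is_letter_excerpt are hypothetical, and the table holds only the first few ranges of the Unicode Letter category; a real table would be generated from the Unicode Character Database.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: an inclusive range of Unicode code points. */
typedef struct {
    unsigned long lo;
    unsigned long hi;
} CodepointRange;

/*
 * Illustrative excerpt only: the first few ranges of General_Category L,
 * sorted and non-overlapping. A complete table would be generated from
 * the Unicode Character Database rather than written by hand.
 */
static const CodepointRange LETTER_RANGES_EXCERPT[] = {
    {0x0041UL, 0x005AUL}, /* A-Z */
    {0x0061UL, 0x007AUL}, /* a-z */
    {0x00AAUL, 0x00AAUL}, /* feminine ordinal indicator */
    {0x00B5UL, 0x00B5UL}, /* micro sign */
    {0x00BAUL, 0x00BAUL}, /* masculine ordinal indicator */
    {0x00C0UL, 0x00D6UL}, /* Latin-1 letters before the multiplication sign */
    {0x00D8UL, 0x00F6UL}, /* Latin-1 letters before the division sign */
    {0x00F8UL, 0x02C1UL}  /* rest of Latin-1 through early modifier letters */
};

/* Binary search: returns 1 if cp falls inside any range of the table. */
static int in_ranges(const CodepointRange *table, size_t len, unsigned long cp)
{
    size_t lo = 0;
    size_t hi = len;

    assert(table != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (cp < table[mid].lo) {
            hi = mid;          /* cp lies before this range */
        } else if (cp > table[mid].hi) {
            lo = mid + 1;      /* cp lies after this range */
        } else {
            return 1;          /* table[mid].lo <= cp <= table[mid].hi */
        }
    }
    return 0;
}

/* Checks a code point against the excerpt table above (not all of \p{L}). */
static int is_letter_excerpt(unsigned long cp)
{
    size_t len = sizeof(LETTER_RANGES_EXCERPT) / sizeof(LETTER_RANGES_EXCERPT[0]);
    return in_ranges(LETTER_RANGES_EXCERPT, len, cp);
}
```

With a table like this, a lookup costs O(log n) comparisons in the number of ranges, whereas an ordered choice over hand-written ranges in the grammar would try each alternative in turn.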

More "\p" examples can be found in regex references: https://www.regular-expressions.info/unicode.html, https://www.compart.com/en/unicode/category.

Describe alternatives you've considered

An alternative is to provide built-in character rules, similar to pest, such as UNICODE_LETTERS.

Additional context

N/A