yhirose / cpp-peglib

A single file C++ header-only PEG (Parsing Expression Grammars) library
MIT License

"\w" and "\d" character class shortcuts #249

Closed: kfsone closed this issue 1 year ago

kfsone commented 1 year ago

Would it be possible to add the regex escapes '\w' and '\d' to the character sequences, and optionally their inverses?

\w <- [a-zA-Z0-9]
\W <- ![a-zA-Z0-9]
\d <- [0-9]
\D <- ![0-9]

https://stackoverflow.com/questions/1576789/in-regex-what-does-w-mean
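
A note on the proposed inverses: in PEG, `!e` is a not-predicate that succeeds without consuming any input, whereas regex `\D` and `\W` each consume one character. The sketch below illustrates that difference with small hand-written matchers. It is only an illustration under stated assumptions: the helper names are hypothetical and are not part of cpp-peglib, and `\w` follows the definition proposed above (most regex flavours also include `_`).

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical helpers for illustration; none of these names come from cpp-peglib.

// \d as proposed in the issue: [0-9]
constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }

// \w as proposed in the issue: [a-zA-Z0-9] (no underscore, unlike most regex flavours)
constexpr bool is_word(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || is_digit(c);
}

// Each matcher returns the number of characters consumed at `pos`,
// or std::nullopt when the match fails.

// [0-9]: consumes exactly one digit.
std::optional<std::size_t> match_digit(std::string_view s, std::size_t pos) {
    if (pos < s.size() && is_digit(s[pos])) return std::size_t{1};
    return std::nullopt;
}

// ![0-9]: PEG not-predicate. Succeeds when [0-9] fails, and consumes nothing.
std::optional<std::size_t> match_not_digit_predicate(std::string_view s, std::size_t pos) {
    if (match_digit(s, pos).has_value()) return std::nullopt;
    return std::size_t{0};
}

// [^0-9]: negated class, closer to regex \D. Consumes exactly one non-digit.
std::optional<std::size_t> match_non_digit(std::string_view s, std::size_t pos) {
    if (pos < s.size() && !is_digit(s[pos])) return std::size_t{1};
    return std::nullopt;
}

// Small usage example contrasting the two forms on the same input.
void demo() {
    std::string_view input = "x1";
    assert(match_not_digit_predicate(input, 0).value() == 0);  // succeeds, position stays at 'x'
    assert(match_non_digit(input, 0).value() == 1);            // succeeds, advances past 'x'
    assert(!match_non_digit(input, 1).has_value());            // '1' is a digit, so this fails
}
```

Under this reading, an inverse that behaves like regex `\D` would be a consuming negated class rather than a bare `!` predicate. The predicate form also succeeds at end of input, while the consuming form does not.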

yhirose commented 1 year ago

Good suggestion!

Andos commented 1 year ago

This would be a really nice feature.

There are also the Unicode classes to consider:

\p{L}: any Unicode letter
\p{Z}: any kind of whitespace or invisible separator
\p{N}: any kind of numeric character in any script
\p{P}: any kind of punctuation character

See more here: https://www.regular-expressions.info/unicode.html#category

yhirose commented 1 year ago

Merged to #87.