yhirose / go-peg

Yet another PEG (Parsing Expression Grammars) parser generator for Go
MIT License
63 stars 8 forks

Unicode Regular Expression support #9

Open yhirose opened 5 years ago

yhirose commented 5 years ago

https://github.com/yhirose/go-peg/issues/6#issuecomment-466736815

Plus another question: is there a way to define a Unicode string in rules? For example:

https://stackoverflow.com/questions/30482793/golang-regexp-with-non-latin-characters

STRING_LIT = < [\\p{L}\\d_]+>

It would be convenient to define a query DSL like the following:

a = '世界' and b = 1
yhirose commented 5 years ago

@cch123, it's possible, though the Unicode regex spec is massive: http://unicode.org/reports/tr18/ https://www.regular-expressions.info/unicode.html

It would be easy to start with the Unicode categories first, such as \p{L}.
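
For reference, here is a minimal sketch of what the \p{L} category matches, using only Go's standard `regexp` package (RE2 syntax) rather than go-peg itself. The helper name `matchIdent` and the sample inputs are made up for illustration; how go-peg would expose this inside a grammar rule is not shown here.

```go
package main

import (
	"fmt"
	"regexp"
)

// identRe matches a run of Unicode letters, ASCII digits, and underscores
// anchored at the start of the input. In Go's RE2 syntax \p{L} covers every
// Unicode letter, while \d stays ASCII-only ([0-9]).
var identRe = regexp.MustCompile(`^[\p{L}\d_]+`)

// matchIdent is a hypothetical helper (not part of go-peg): it returns the
// leading identifier-like token in s, or "" when s does not start with one.
func matchIdent(s string) string {
	return identRe.FindString(s)
}

func main() {
	// Sample inputs loosely based on the query DSL example in this thread.
	for _, in := range []string{"世界 and b = 1", "b = 1", "'quoted'"} {
		fmt.Printf("%q -> %q\n", in, matchIdent(in))
	}
}
```

With this pattern, `matchIdent("世界 and b = 1")` returns `世界`, and an input that starts with a quote returns the empty string. Note that the backslashes are single here because the pattern is in a Go raw string literal; inside an interpreted (double-quoted) Go string they would need to be doubled.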