Closed cage2 closed 3 years ago
character-ranges has a slightly subtle syntax that allows ranges as well as individual characters to be specified:
(esrap:character-ranges #\UA0 #\UD7FF)
denotes the set of characters consisting of exactly #\UA0 and #\UD7FF.
(esrap:character-ranges (#\UA0 #\UD7FF))
denotes the set of characters consisting of the range starting at #\UA0 and ending at #\UD7FF.
This allows specifying multiple ranges as well as individual characters not contained in any range at the same time:
(esrap:character-ranges (START₁ END₁) (START₂ END₂) … INDIVIDUAL-CHARACTER₁ INDIVIDUAL-CHARACTER₂ …)
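To make the difference concrete, here is a minimal sketch of the two forms side by side. The rule names (two-chars, one-range, mixed) are made up for illustration and are not taken from the original code, and the sketch assumes esrap is already loaded (for example via Quicklisp) in an implementation whose reader accepts the #\UA0 character syntax used above.

```lisp
;; Assumption: esrap is already loaded, e.g. with (ql:quickload "esrap").
;; All rule names below are hypothetical.

;; Two individual characters: matches only #\UA0 or #\UD7FF.
(esrap:defrule two-chars
    (esrap:character-ranges #\UA0 #\UD7FF))

;; One range (note the extra parentheses): matches any character
;; from #\UA0 through #\UD7FF.
(esrap:defrule one-range
    (esrap:character-ranges (#\UA0 #\UD7FF)))

;; Ranges and individual characters mixed in a single expression.
;; (:text t) concatenates the matched characters into a string.
(esrap:defrule mixed
    (+ (esrap:character-ranges (#\a #\z) (#\UA0 #\UD7FF) #\- #\_))
  (:text t))

;; "ì" is U+00EC, which lies inside the range, so this should parse.
(esrap:parse 'one-range "ì")

;; Letters, a dash, and non-ASCII characters all belong to the mixed set.
(esrap:parse 'mixed "caffè-ì")

;; By contrast, (esrap:parse 'two-chars "ì") should signal a parse error,
;; because "ì" is neither #\UA0 nor #\UD7FF.
```

The only difference between the first two rules is the extra pair of parentheses, which is easy to overlook when a rule that was meant to cover a whole range rejects a character.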
I hope this solves your concrete problem and also explains the rationale behind the syntax.
This totally makes sense to me, and the modified parser (following your suggestions) now works like a charm!
Moreover, I was actually using the second syntax in other parts of my code, so this issue was (as I suspected) totally a mistake on my part.
Sorry if I wasted your time with this trivial mistake; sometimes I cannot find errors without talking about the issue with other people.
Thank you for your kind reply! Bye! C.
Hi!
First I want to say that this library is wonderful; I do not want to use any other parser generator now. ;-)
I am trying to parse a sequence that can contain Unicode characters (for an IRI parser) and I wrote a rule like this:
but:
fails with:
I even tried a rule like:
but this fails too when trying to parse "ì".
Maybe I am using the library in the wrong way?
Can you, please, help me?
Thank you very much. C.