Open fangly opened 7 years ago
Rex makes no attempt to verify that a given regular expression is actually valid, and I am pretty confident there are plenty of ways to construct an invalid regular expression with it.
I agree it would be nice if constructs threw an error, but I am not sure it is worth complicating the implementation to support it.
You are right on all fronts, Jim. And in fact, I got errors from PCRE in some instances (malformed regular expressions). What I find problematic is a wrong result without any warning or error!
Based on my non-exhaustive knowledge of the rex package, I would suggest that only a single argument should be allowed for character class functions like one_of(...), any_of(...) or none_of(...), since correctness cannot be ensured (and is too difficult to reliably implement for all cases). Surely, users could still manually construct complex character classes using character_class(), or manipulate existing character classes with wildcards and boolean operations like maybe(), zero_or_more(), or() and not().
Hi,
I have issues when creating a new character class that combines several existing character classes, one or more of which are negated (in rex 1.1.1).
Creating a character class that contains negated and non-negated classes
As far as I know this is not possible at all. The caret "^" must be directly after the opening bracket "[" for it to trigger a negation. I think combining negated and non-negated character classes should error, with an error message suggesting an alternative. In the example above, an alternative would be:
Creating a character class combining only negated classes
But the resulting regular expression should be "[^[:digit:][:lower:]]". Though the regular expression seems to work as intended, it would be safer to correct it.
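To illustrate the two bracket-expression rules at play here, this is a minimal sketch in Python rather than rex. It rests on some assumptions: Python's re module has no [[:digit:]] or [[:lower:]] syntax, so the explicit ranges 0-9 and a-z stand in for those classes, and the helper name matches() is made up for this example. It only shows how the regex engine reads the caret, not how rex builds its patterns.

```python
import re

# Stand-ins for the POSIX classes, since Python's `re` has no [[:digit:]] syntax.
DIGIT = "0-9"   # plays the role of [:digit:]
LOWER = "a-z"   # plays the role of [:lower:]

# Caret directly after "[": the whole bracket expression is negated.
negated = re.compile(f"[^{DIGIT}{LOWER}]")

# Caret anywhere else: it is only a literal "^" member of the set.
caret_inside = re.compile(f"[{DIGIT}^{LOWER}]")


def matches(pattern, chars):
    """Return the single characters in `chars` that the pattern matches."""
    return [c for c in chars if pattern.fullmatch(c)]


sample = ["5", "a", "A", "^"]
print(matches(negated, sample))       # ['A', '^']
print(matches(caret_inside, sample))  # ['5', 'a', '^']
```

The second pattern does not negate anything: it accepts digits, lowercase letters and a literal caret, which is why mixing negated and non-negated classes inside one bracket expression cannot give the intended result.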
Cheers, Florent