Open/negated charclass predicates like `\W` and `\S` were combined incorrectly when they appeared together in one character class.
For example, `[\S\D]` was interpreted as not (whitespace or digits), but the correct interpretation is (not whitespace) or (not digits).
This fixes the misinterpretation and the test. It also includes a couple of minor nits that made it easier to rebase the fix.
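To make the difference between the two readings concrete, here is a small sketch that uses only Python's built-in `re` module, not greenery. The helper name `matched_chars` and the sample string are illustrative assumptions, not part of this PR.

```python
import re

# Sample characters: two digits, a letter, a space and an underscore.
SAMPLES = "12x _"


def matched_chars(pattern, chars=SAMPLES):
    """Return the set of single characters in `chars` that `pattern` fully matches."""
    rx = re.compile(pattern)
    return {c for c in chars if rx.fullmatch(c)}


# Correct reading of [\S\D]: (not whitespace) OR (not digit).
union = matched_chars(r"\S") | matched_chars(r"\D")

# Incorrect reading: NOT (whitespace OR digit).
wrong = set(SAMPLES) - (matched_chars(r"\s") | matched_chars(r"\d"))

# The re module agrees with the union reading: every sample character matches,
# because a digit is non-whitespace and a space is a non-digit.
assert matched_chars(r"[\S\D]") == union == set(SAMPLES)

# The incorrect reading would wrongly reject the digits and the space.
assert wrong == {"x", "_"}

# Negating the whole class leaves nothing: no character is both whitespace and a digit.
assert matched_chars(r"[^\S\D]") == set()
```

Under the incorrect reading, `[\S\D]` would reject `1`, `2` and the space, even though each of them satisfies at least one of the two predicates.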
I didn't include this as a test in this PR, because there may be good reasons why the test suite doesn't already compare against `re`. I did, however, verify the behaviour against the built-in `re` module with:
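
    import re

    from greenery import parse


    def test_expr(expr, value):
        # Parse the same expression with greenery and with the built-in re module.
        pat = parse(expr)
        rx = re.compile(expr)
        # Both engines must agree on whether the single character matches.
        assert bool(rx.match(value)) == bool(pat.matches(value))


    # Try each charclass body both plain and negated, against a few sample characters.
    for expr in (r"\S\D", r"1\D", r"1\D\S"):
        for negation in (True, False):
            for value in "12x ":
                test_expr(f'[{"^" if negation else ""}{expr}]', value)
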
I also confirmed that this is consistent with JavaScript's behaviour through some manual tinkering.