In a negated character class that has overlapping content, such as [^\n\s], the normalisation code violates a precondition of IntCharSet.sub() and leaves the class content in an inconsistent state. This either triggers an exception at generation time if another set operation interacts with the inconsistent part, or may lead to matching the wrong input at runtime if nothing else interacts with the set.
This PR fixes the problem by first computing the union of the class content \n\s, which joins the overlapping parts into a single set, and then computing the complement of that set.
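
The union-then-complement order can be illustrated with a minimal, self-contained sketch. It does not use JFlex's actual IntCharSet or Interval classes: the class name NegatedClassSketch, the methods union and complement, and the int[]{start, end} interval representation are all hypothetical. As a further assumption, \s is approximated here as code points 9-13 plus 32, which may differ from JFlex's real definition.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not JFlex code: intervals are closed {start, end} pairs.
public class NegatedClassSketch {

    /** Merges possibly overlapping intervals into a sorted list of disjoint ones. */
    public static List<int[]> union(List<int[]> parts) {
        List<int[]> sorted = new ArrayList<>(parts);
        sorted.sort(Comparator.comparingInt((int[] iv) -> iv[0]));
        List<int[]> merged = new ArrayList<>();
        for (int[] iv : sorted) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last != null && iv[0] <= last[1] + 1) {
                // Overlapping or adjacent: extend the previous interval.
                last[1] = Math.max(last[1], iv[1]);
            } else {
                // Copy so the caller's arrays are never modified.
                merged.add(new int[] {iv[0], iv[1]});
            }
        }
        return merged;
    }

    /** Complements a sorted, disjoint interval list within [0, maxCodePoint]. */
    public static List<int[]> complement(List<int[]> disjoint, int maxCodePoint) {
        List<int[]> result = new ArrayList<>();
        int next = 0; // first code point not yet covered by the input
        for (int[] iv : disjoint) {
            if (iv[0] > next) {
                result.add(new int[] {next, iv[0] - 1});
            }
            next = iv[1] + 1;
        }
        if (next <= maxCodePoint) {
            result.add(new int[] {next, maxCodePoint});
        }
        return result;
    }

    public static void main(String[] args) {
        // Content of [^\n\s]: \n is 10; \s is assumed to be 9-13 and 32.
        List<int[]> content = List.of(
            new int[] {10, 10}, new int[] {9, 13}, new int[] {32, 32});
        List<int[]> negated = complement(union(content), 0x10FFFF);
        for (int[] iv : negated) {
            System.out.println(iv[0] + "-" + iv[1]);
        }
        // prints 0-8, 14-31 and 33-1114111, one per line
    }
}
```

Because complement only ever sees the merged list, the overlap between \n and \s is resolved before any interval is removed, so no subtraction is attempted on content that is already gone.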
[x] enforce invariant in Interval class
[x] avoid violating sub precondition
[x] add regression test case for negating overlapping char class content
Fixes #1065