This PR fixes an issue where `lineTerminatorMatcher(excludeCRLF = true)` returns an empty Regex AST, which could lead to incorrect behavior when that AST is passed unchecked to `RegexRepetition`.
This update also improves handling of end-of-line characters that follow a line anchor (e.g., `$\r`, `$\u2028`): these patterns fall back to the CPU because cuDF does not support negative lookahead. In practice, `checkUnsupported` already catches these cases before this code path is reached.
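
To illustrate the first point, here is a minimal sketch of guarding against an empty AST before wrapping it in a repetition. The AST classes and the `isEmpty`/`optional` helpers below are simplified, hypothetical stand-ins written for this example; they are not the plugin's actual definitions, and the real fix may be structured differently.

```scala
// Simplified, hypothetical stand-ins for the transpiler's AST (not the real classes).
sealed trait RegexAST
case class RegexChar(ch: Char) extends RegexAST
case class RegexSequence(parts: List[RegexAST]) extends RegexAST
case class RegexChoice(options: List[RegexAST]) extends RegexAST
case class RegexRepetition(term: RegexAST, quantifier: String) extends RegexAST

object EmptyAstGuard {
  // True when the node is a container with no children, i.e. there is nothing to repeat.
  def isEmpty(ast: RegexAST): Boolean = ast match {
    case RegexSequence(parts) => parts.isEmpty
    case RegexChoice(options) => options.isEmpty
    case _ => false
  }

  // Wrap the term in an optional repetition only when it is non-empty;
  // an empty term yields None so the caller can skip it.
  def optional(ast: RegexAST): Option[RegexAST] =
    if (isEmpty(ast)) None else Some(RegexRepetition(ast, "?"))

  def main(args: Array[String]): Unit = {
    println(optional(RegexSequence(Nil)).isDefined)  // empty term: no repetition built
    println(optional(RegexChar('\r')).isDefined)     // non-empty term: repetition built
  }
}
```

Returning an `Option` here is only one possible design; it forces the caller to decide what to do when there is no term to repeat instead of emitting a quantifier with nothing in front of it.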