Closed Minty-Meeo closed 1 year ago
Hi,
1) In multiline mode, CTRE currently considers only `\n` a newline. Hence `.+` will eat the `\r`, the optional `\r?` will have nothing left to match, and then `\n` will match. That's the current state of things; properly looking at `\r\n` would fix that (changes would be needed in `evaluation.hpp`).
2) You can avoid this issue by using `[^\r\n]+` instead of `.+`. This will trigger an optimization that turns the loop into a possessive one, which will also give you a much faster regex.
3) (bonus) If you are parsing some sort of document, look at `ctre::tokenize` or `ctre::range`:
for (auto match: ctre::tokenize<regex>(subject)) {
    // each match
}
`ctre::tokenize` is the equivalent of repeated calls to `ctre::starts_with`; `ctre::range` is the equivalent for `ctre::search`.
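The difference between the two iteration styles can be sketched with `std::regex`, so that the example compiles without CTRE. This is only an illustration of the description above, not CTRE's implementation: `tokenize_like` and `range_like` are hypothetical helper names, and `std::regex_constants::match_continuous` stands in for "the match must start at the current position".

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: repeatedly match *at* the current position
// (the "repeated starts_with" style). Stops at the first position
// where the pattern does not match.
std::vector<std::string> tokenize_like(const std::string& s, const std::regex& re) {
    std::vector<std::string> out;
    auto pos = s.cbegin();
    std::smatch m;
    while (pos != s.cend() &&
           std::regex_search(pos, s.cend(), m, re,
                             std::regex_constants::match_continuous) &&
           m.length(0) > 0) {          // guard against empty matches
        out.push_back(m.str(0));
        pos = m[0].second;             // continue right after the match
    }
    return out;
}

// Hypothetical helper: repeatedly search *from* the current position
// (the "repeated search" style). Skips over text that does not match.
std::vector<std::string> range_like(const std::string& s, const std::regex& re) {
    std::vector<std::string> out;
    auto pos = s.cbegin();
    std::smatch m;
    while (pos != s.cend() &&
           std::regex_search(pos, s.cend(), m, re) &&
           m.length(0) > 0) {          // guard against empty matches
        out.push_back(m.str(0));
        pos = m[0].second;             // continue right after the match
    }
    return out;
}
```

With the pattern `[a-z]+|[0-9]+` and the input `abc123 def`, the first helper stops at the space because nothing matches there, while the second skips the space and also finds `def`.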
`.+?` is a lazy loop: it will always try the next character before looping again.

Thank you for your quick response.
STL regex: https://godbolt.org/z/vMqbTdz3a
CTRE regex: https://godbolt.org/z/fKvGKjTc1
I am attempting to switch from std::regex for a project which reads a multi-line text file sort of like a binary file by using regular expressions. I found that a '?' quantifier was useful for supporting both CRLF and LF line endings, but upon switching to CTRE, my code was broken. For some reason, the carriage return character is being captured. Is this a defect of CTRE, or am I doing something wrong?
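A minimal sketch of the CRLF/LF handling described here, written so that it does not depend on whether `.` matches `\r`. It uses `std::regex` to stay self-contained, and `split_lines` is a hypothetical helper name. The pattern is the `[^\r\n]+` form suggested earlier in the thread; passing it to CTRE is not shown.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: return the content of each non-empty line,
// without its line terminator. The negated class [^\r\n]+ can never
// consume '\r', so the optional \r? that follows is free to match it
// when the input uses CRLF line endings.
std::vector<std::string> split_lines(const std::string& text) {
    static const std::regex line_re(R"(([^\r\n]+)\r?\n)");
    std::vector<std::string> lines;
    for (std::sregex_iterator it(text.begin(), text.end(), line_re), end;
         it != end; ++it) {
        lines.push_back((*it)[1].str());  // capture group 1: line content
    }
    return lines;
}
```

Calling `split_lines("alpha\r\nbeta\ngamma\r\n")` yields `alpha`, `beta`, and `gamma`, with no trailing carriage return in any capture.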