crystal-lang / crystal

The Crystal Programming Language
https://crystal-lang.org
Apache License 2.0

It's impossible to create a regex that is MULTILINE but not DOT_ALL #14869

Closed: ralsina closed this 3 weeks ago

ralsina commented 2 months ago

Bug Report

The expected behaviour of a MULTILINE regex like #.*$ is that it matches from a # to the end of that line.

Because Crystal conflates MULTILINE and DOT_ALL, that regex, when compiled with MULTILINE, also matches every subsequent line up to the end of the input.

Regex.new("#.*$", Regex::Options::MULTILINE).match("# this should be matched\nthis shouldn't")

=> Regex::MatchData("# this should be matched\nthis shouldn't")

To keep backwards compatibility, the behaviour of MULTILINE in Crystal should not be changed. Instead, I propose adding a new MULTILINE_ONLY flag that does not set DOT_ALL when creating the underlying PCRE2 object.

ralsina commented 2 months ago

PR: https://github.com/crystal-lang/crystal/pull/14870

oprypin commented 2 months ago

Some previous discussion was here: https://github.com/crystal-lang/crystal/issues/8062

Indeed, it's sad that DOTALL is bundled together with MULTILINE.

By the way, /(?m)#.*$/ achieves the desired behavior of just MULTILINE. Confusingly, that is totally different from /#.*$/m, which also enables DOT_ALL.
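
A minimal sketch of that difference, reusing the input from the report above. The variable names are made up for illustration, and the expected results in the comments follow from the behaviour described in this thread rather than from a specific Crystal release:

```crystal
text = "# this should be matched\nthis shouldn't"

# Literal /m flag: Crystal's MULTILINE, which per this issue also turns on
# DOT_ALL, so `.` crosses the newline and the match runs to the end of input.
conflated = /#.*$/m.match(text).try &.[0]

# Inline (?m): interpreted by PCRE itself as multiline anchors only,
# so `.` stops at the newline and `$` matches at the end of the first line.
inline = /(?m)#.*$/.match(text).try &.[0]

p conflated # expected: the whole two-line string
p inline    # expected: "# this should be matched"
```

The inline option is a workaround that needs no new flag, at the cost of embedding the option in the pattern string itself.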