haskell-hvr / regex-tdfa

Pure Haskell Tagged DFA Backend for "Text.Regex" (regex-base)
http://hackage.haskell.org/package/regex-tdfa

Multiline option appears to be ignored #11

Open raxod502 opened 4 years ago

raxod502 commented 4 years ago

In GHCI:

Text.Regex.TDFA> "f" =~~ "." :: Maybe String
Just "f"
Text.Regex.TDFA> "\n" =~~ "." :: Maybe String
Nothing

The documentation says multiline matching should be enabled by default, but just to double-check, I tried enabling it explicitly:

Text.Regex.TDFA> (makeRegexOpts defaultCompOpt { multiline = True } defaultExecOpt "." :: Regex) `matchM` "f" :: Maybe String
Just "f"
Text.Regex.TDFA> (makeRegexOpts defaultCompOpt { multiline = True } defaultExecOpt "." :: Regex) `matchM` "\n" :: Maybe String
Nothing

No luck. The same thing happens for inverted character classes (e.g., [^&] instead of .). Am I misunderstanding how to get my regex to match newlines?

obfusk commented 3 years ago

regex-tdfa seems to have a non-standard multiline mode. It combines what is usually known as "multiline" (i.e. "^" and "$" match at the beginning and end of individual lines, not just of the whole string) with the inverse of "dotall": while it is enabled, "." does not match a newline, and neither do inverted character classes (so you can't even use e.g. [^&], as you mentioned).

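You can match a newline using "(.|\n)", but only if the pattern contains an actual newline character, as the Haskell string literal "\n" does. A backslash followed by n in the pattern itself (written "\\n" in Haskell source) won't work, since \n is just n to regex-tdfa.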

> ("\n" =~~ "(.|\n)") :: Maybe String 
Just "\n"

It does recognise [[:space:]] though, so this works as well (in case your pattern is e.g. user input that can't contain a newline):

> ("\n" =~~ "(.|[[:space:]])") :: Maybe String 
Just "\n"
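As a rough sketch of the two workarounds above, the following standalone program contrasts a pattern that contains a real newline with one that contains the two-character escape. The names realNewline and escapedN are made up for illustration, and the behaviour of the escaped form is assumed from the description above rather than taken from the library documentation.

```haskell
import Text.Regex.TDFA ((=~))

-- The Haskell literal "\n" is a real newline character, so the regex
-- engine sees an alternation between "." and a literal newline.
realNewline :: String
realNewline = "(.|\n)"

-- The Haskell literal "\\n" is a backslash followed by 'n'. The assumption
-- here (taken from this thread) is that regex-tdfa reads it as a plain 'n'.
escapedN :: String
escapedN = "(.|\\n)"

main :: IO ()
main = do
  -- Each line prints whether the pattern matches a lone newline.
  print ("\n" =~ realNewline :: Bool)
  print ("\n" =~ escapedN :: Bool)
  print ("\n" =~ "(.|[[:space:]])" :: Bool)
```

If the description above holds, the first and third checks should succeed and the second should not. Note that [[:space:]] also matches spaces and tabs, so it is a slightly broader workaround than a literal newline.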

If you don't need "^" and "$" to match at line boundaries and just want "." to match newlines, use multiline = False:

> let p = makeRegexOpts defaultCompOpt { multiline = False } defaultExecOpt "."
> p `matchM` "\n" :: Maybe String 
Just "\n"
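
To compare the two settings side by side, here is a minimal sketch that compiles "." and "[^&]" with multiline on and off and tests each against a lone newline. The helper compileWith is a hypothetical name introduced only for this example; makeRegexOpts, defaultCompOpt, defaultExecOpt, and matchTest are the functions exported by Text.Regex.TDFA.

```haskell
import Text.Regex.TDFA

-- Hypothetical helper: compile a pattern with the given multiline setting,
-- leaving every other compile and exec option at its default.
compileWith :: Bool -> String -> Regex
compileWith ml = makeRegexOpts defaultCompOpt { multiline = ml } defaultExecOpt

-- Does the pattern match a string consisting of a single newline?
matchesNewline :: Bool -> String -> Bool
matchesNewline ml pat = matchTest (compileWith ml pat) "\n"

main :: IO ()
main = do
  -- multiline = True is the default, as in the original report.
  putStrLn ("multiline=True,  \".\"    : " ++ show (matchesNewline True "."))
  putStrLn ("multiline=True,  \"[^&]\" : " ++ show (matchesNewline True "[^&]"))
  -- multiline = False treats newline as an ordinary character.
  putStrLn ("multiline=False, \".\"    : " ++ show (matchesNewline False "."))
  putStrLn ("multiline=False, \"[^&]\" : " ++ show (matchesNewline False "[^&]"))
```

Based on the behaviour described in this thread, the two multiline=True lines are expected to report no match and the two multiline=False lines a match. The trade-off is that with multiline = False, "^" and "$" only anchor at the start and end of the whole string.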

raxod502 commented 3 years ago

Thanks for the pointer. I think this issue then becomes a request for the documentation to include this helpful information.