SWI-Prolog / packages-pcre

SWI-Prolog package for access to Perl Regular Expressions

re_match/3 with text ignores options #7

Closed kamahen closed 2 years ago

kamahen commented 2 years ago

re_match(Regex, String, Options) should behave the same as re_compile(Regex, Re, Options), re_match(Re, String). But it doesn't:

re_flush, re_compile('b', Re, [caseless(true)]), re_match(Re, "ABC").  % succeeds
re_flush, re_match('b', "ABC", [caseless(true)]).                      % fails

A partial workaround is to use the form re_match(Regex/Flags, String), but that doesn't allow all the possible options (e.g., compat(javascript)).

kamahen commented 2 years ago

The cause appears to be that re_pool/3 contains re_pool(Text, Flags, Regex) but should contain re_pool(Text, Flags, Options, Regex). I plan on fixing this as part of the PCRE1->PCRE2 migration. (Issue #2)
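To illustrate the idea of including the options in the cache key, here is a minimal sketch. It is not the package's actual re_pool code or the eventual fix: pool/3, compile_option/1, cached_compile/3 and match_with_options/3 are hypothetical names, and the list of compile-time options is an assumption containing only the two options mentioned in this thread.

```prolog
:- use_module(library(pcre)).
:- use_module(library(apply)).

% pool(Pattern, CompileOptions, Regex): hypothetical cache table.
:- dynamic pool/3.

% ASSUMPTION: options that affect compilation. Only the two options
% named in this thread are listed; the real set is larger.
compile_option(caseless(_)).
compile_option(compat(_)).

% Compile Pattern once per distinct set of compile-time options.
cached_compile(Pattern, Options, Regex) :-
    include(compile_option, Options, CompOpts0),
    sort(CompOpts0, CompOpts),              % order-independent cache key
    (   pool(Pattern, CompOpts, Regex0)
    ->  Regex = Regex0
    ;   re_compile(Pattern, Regex, CompOpts),
        assertz(pool(Pattern, CompOpts, Regex))
    ).

% Match using the cached regex, so the options are no longer ignored.
match_with_options(Pattern, String, Options) :-
    cached_compile(Pattern, Options, Regex),
    re_match(Regex, String).
```

With this sketch, match_with_options('b', "ABC", [caseless(true)]) and match_with_options('b', "ABC", []) use two separate cache entries, so the second call no longer picks up (or hides) the options of the first.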

JanWielemaker commented 2 years ago

Yes. The original design was that the translation of Pattern/Flags was cached, and options passed to the regex execution predicates only affected execution, not compilation of the regex. That is why I suggested adding a notation for options to the flags. As you now have an option list that deals with both compilation and execution, you'll have to extract the compilation options and use these in the caching as well. That extraction will probably become the performance bottleneck when regular expressions are used for fairly small operations. One option may be to add goal_expansion/2 rules to do some of the work at compile time. Unfortunately, you cannot make the blob part of the generated clause, as that would break creating a saved state (or .qlf file). So you'll need to generate something similar to Pattern/Flags. Well, in theory you can add blobs to saved states and .qlf files if you define the corresponding (de)serialization functions on the blob type.
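A rough sketch of the goal_expansion/2 idea, under stated assumptions: when the pattern and option list are known at compile time, the call is rewritten into the Pattern/Flags form, so no blob ends up in the generated clause. The helper names option_flag/2 and options_flags/2 are hypothetical, and the table maps only caseless(true) to the flag i; any other option leaves the goal unexpanded. The flag letter should be checked against the library(pcre) documentation for the installed version.

```prolog
:- use_module(library(pcre)).
:- use_module(library(apply)).

% ASSUMPTION: option-to-flag table; only one entry is given here.
option_flag(caseless(true), i).

% Translate a non-empty, ground option list into a flags atom.
% Fails if any option has no flag equivalent in the table above.
options_flags(Options, Flags) :-
    is_list(Options),
    Options = [_|_],
    ground(Options),
    maplist(option_flag, Options, Chars),
    atomic_list_concat(Chars, Flags).

% Rewrite re_match/3 into re_match/2 on Pattern/Flags at compile time.
% Goals whose pattern or options are unknown at compile time are kept.
user:goal_expansion(re_match(Pattern, String, Options),
                    re_match(Pattern/Flags, String)) :-
    atomic(Pattern),
    options_flags(Options, Flags).
```

Clauses loaded after this rule that call re_match('b', S, [caseless(true)]) would be compiled as re_match('b'/i, S). Options without a flag notation, such as compat(javascript), are not covered by this approach.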

kamahen commented 2 years ago

Related: https://github.com/SWI-Prolog/packages-pcre/issues/8#issuecomment-1008073224 For consistency, I propose allowing both flags and options, with flags taking precedence. (PCRE2 has even more options, but I see no need to add to the flags; the Options can take care of the more obscure PCRE2 flags.)

kamahen commented 2 years ago

Fixed with https://github.com/SWI-Prolog/packages-pcre/commit/fe4775da69735285c4461ff2d3aa137b783fdabf (or earlier).