Open jpbro opened 7 years ago
See changes to modTests.TestRegex2 method for a demonstration as per commit https://github.com/jpbro/VbPcre2/commit/66c88a1db0ebeda970392a9ead25724557499481
I see.
Strange that PCRE2 in fact produces only one significant result, File1.zip.exe, instead of three.
However, I think a regexp such as .*$ is incorrect in the first place. It is about the same as the regexp (about)? in that you are asking for an empty string as one of the valid matches. If you enter such a regexp into some online Java regexp testers, it produces an error, meaning that a regexp should not allow an empty string as one of its valid results.
From this point of view, the difference in results between VBS and PCRE2 is only a matter of their internal error handling, which is implemented differently.
So, in reality, .+$ should be used instead of .*$ here.
In conclusion, I personally believe it is not necessary to touch this behavior. If I were to change anything, I would detect a regexp string that allows an empty result and raise an error instead of returning a result string.
That said, since VBS already produces the most complete result, I would not object if PCRE2 produced the same result, in keeping with the strategy of PCRE2 as an analogue of VBS.Regexp, so that at least these 3 lines are shown for .*$ as well.
But I don't know how you can track such cases without breaking anything else.
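For illustration only, the difference between .*$ and .+$ can be reproduced outside the two engines under discussion. This is a minimal sketch using Python's re module, which is neither VBScript's RegExp nor PCRE2, so its results do not settle how those two should behave. The subject uses bare "\n" line breaks instead of vbCrLf to keep the output simple, and the helper name all_matches is made up for this example.

```python
import re

# Same three lines as in the issue, joined with "\n" rather than vbCrLf
# (an assumption for this demo; with "\r\n" the dot would also match "\r").
SUBJECT = "File1.zip.exe\nFile2.com\nFile 3"


def all_matches(pattern, subject, flags=re.MULTILINE):
    """Collect every match of pattern in subject, similar to a Global search."""
    return [m.group(0) for m in re.finditer(pattern, subject, flags)]


star = all_matches(r".*$", SUBJECT)  # .* may match the empty string
plus = all_matches(r".+$", SUBJECT)  # .+ needs at least one character

print(len(star), star)
print(len(plus), plus)
```

In Python this yields 6 matches for .*$ (each line followed by an empty match at its end) and 3 matches for .+$, so the empty matches are the only difference between the two patterns in this engine.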
Yeah it's a bit of a weird one - interesting that some online regexp sites produce an error, but PCRE2 and VBScript produce results (albeit different). Makes it hard to know what the best approach is.
It might be that there is a PCRE2 option flag to handle this situation; I'll have to look at them all more closely (or maybe it's just up to my Global matching loop to work a bit differently to produce the same results as VBScript).
I don't have time to look closer right now, but I will try soon.
According to my tests, none of the options predefined in your class changes this behavior, except:
Dot matches All Characters, which makes all of the text fall into the first match, like:
Match Count: 2
#1: File1.zip.exe
File2.com
File 3
Sub.#1:
#2:
Sub.#2:
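A similar two-match result can be reproduced in another engine. The sketch below uses Python's re.DOTALL flag as a rough stand-in for a "dot matches all characters" option; that the wrapper's option behaves the same way is an assumption, and Python's re is not PCRE2. The subject uses "\n" instead of vbCrLf for simplicity.

```python
import re

# Subject from the issue, with "\n" line breaks (assumption made for this demo).
SUBJECT = "File1.zip.exe\nFile2.com\nFile 3"

# With DOTALL the dot also matches "\n", so the greedy .* swallows the whole subject.
dotall_matches = [
    m.group(0)
    for m in re.finditer(r".*$", SUBJECT, re.DOTALL | re.MULTILINE)
]

print("Match Count:", len(dotall_matches))
for number, text in enumerate(dotall_matches, start=1):
    print(f"#{number}: {text!r}")
```

Here the first match is the entire three-line subject and the second is the empty string at the very end, which has the same shape as the output shown above.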
Who is right?
Both results are correct. What is wrong here is your expectation.
Multiline = True in VBScript's RegExp simply means that ^ and $ match at line breaks, which is an option that must be set explicitly (as you did in VBScript) for PCRE as well, namely PCRE2_MULTILINE.
So it seems OK, you just changed the default behavior for VBScript but not for PCRE in your test.
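The effect of a multiline flag on $ can be seen in a small, engine-neutral sketch. It uses Python's re module, where re.MULTILINE plays a role comparable to Multiline = True or PCRE2_MULTILINE; this is only an analogy, not a statement about how the wrapper sets its flags. The helper name find_all and the "\n" line breaks are assumptions made for the example.

```python
import re

# Subject from the issue, joined with "\n" instead of vbCrLf (demo assumption).
SUBJECT = "File1.zip.exe\nFile2.com\nFile 3"


def find_all(pattern, flags=0):
    """Return all matches of pattern in SUBJECT with the given flags."""
    return [m.group(0) for m in re.finditer(pattern, SUBJECT, flags)]


# Without the flag, $ only matches at the end of the subject.
single_line = find_all(r".*$")
# With the flag, $ also matches just before every line break.
multi_line = find_all(r".*$", re.MULTILINE)

print(len(single_line), single_line)
print(len(multi_line), multi_line)
```

Without the flag only the last line (plus a trailing empty match) is found, giving 2 matches; with the flag every line is found, each followed by an empty match, giving 6.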
Oh, I forgot to mention. I've never used your wrapper.
If you're sure that the PCRE2_MULTILINE flag is set in your wrapper, that means there is a problem in your wrapper or in PCRE. VBScript's RegExp works as it should in this case.
When Multiline = TRUE and Global = True (for VbScript/NA for my wrapper) consider the following subject:
"File1.zip.exe" & vbCrLf & "File2.com" & vbCrLf & "File 3"
And the following regex:
.*$
VBScript returns 6 matches, but my wrapper returns only 2. Who is right?