Closed SlMaker closed 4 years ago
it is false error. / slash must not be escaped (it's optional for it).
@SlMaker just select for example 'Python' flavour on https://regex101.com/ and the expression will work.
May be we have a bug in https://regex.sorokin.engineer/en/latest/regular_expressions.html ? Do you see someplace were it said "you should escape slash (not backslash)"?
Maybe you should note which flavour you're following. Thought it is PCRE which is the most used one...
@andgineer decide what to do with dot inside [].
@Alexey-T
this question is about /
what do you mean about dot inside []
?
Maybe you should note which flavour you're following. Thought it is PCRE which is the most used one...
Well you know I wrote the lib more than 20 years ago.. So this is not clear am I follow or vice versa :)
I have documentation exactly about this implementation on regex.sorokin.engineer and in fact I have never compared that with some specific flavour. May be I should do that, you are right - a lot of things changed in so many years..
@andgineer see PR #157 where the regex with dot inside a set give wrong and totally unexpected results.
It didn't behaved like that with dots inside sets before which breaks some regexes for me
could you give exact example?
gave them in #157:
RunTest43 Failed: Replace failed expected: <- . -> but was: <Pictures-2018.Spain-Madrid> RunTest44 Failed: Replace failed expected: <.Test.> but was: <This.Is.A.Test.1234_Test_abc>
see https://travis-ci.org/github/andgineer/TRegExpr/jobs/688913370, don't know why it says success if two tests failed!
regex ([.-]Test[.-])
shouldn't return This.Is.A.Test.1234_Test_abc
, it should only match .Test.
as every other regex on regex101 does and also the older versions of TRegExpr.
@SlMaker really strange - test failes but exit code 0.. I am not familiar with the test framework but it did report fails as non-zero code before..
@Alexey-T could you help with debugging the failed testcases? RE pseudo-code looks ok but TRegExpr just do not match anything so it returns initial string - Replace changes only parts of the input string that match re
For Delphi it returns an exit code but the FPC tests doesn't. But uses different test frameworks so don't know how to fix
finder is fixed in my PR today. now last RE test fails. Test44. why? it calls finder, then calls Replace. finder works ok. Replace (it calls Substitute) gets whole string as result - it is some bug. I didnt find it. yet @andgineer
maybe ALL replace-tests are wrong. see this:
// 2
(
expression: '(\w*)';
inputText: 'name.ext';
substitutionText: '$1.new';
expectedResult: 'name.new.new.ext.new.new';
MatchStart: 0
),
Why? such result ok? it's not my work (regex find code work).
Testing it on https://regex101.com/r/HblhQF/1 gives also 4 matches: name, empty, ext, empty
\w*
matches any word character (equal to [a-zA-Z0-9_]
)*
Quantifier — Matches between zero and unlimited times, as many times as possiblename match: name.new empty match: .new ext match: ext.new empty match: .new but the dot between new and ext shouldn't be there?
yes Replace
replaces only matched part an leave all others untouched..
And all the tests was green before..
And last test still does not work - just do not see any matches
Test: This.Is.A.Test.1234_Test_abc
Modifiers "gsr-imx"
Regular expression: ([.-]Test[.-]) ,
compiled into p-code:
1:BRANCH (88)
10:OPEN[1] (19)
19:BRANCH (79)
28:ANYOF (45) Ch(.-)
45:EXACTLY (62) Test
62:ANYOF (79) Ch(.-)
79:CLOSE[1] (88)
88:END (0)
First charset: -.
Expression found at: -1
BTW this is wrong - .
without escaping should be metachar any char
and not just dot itself as in opcode
@andgineer you tested wrong somehow, see my test with last TRegexpr
last test fails? yes, and I cannot get why. some bug in Replace? delete Test44 please.
Why deleting the test? The test is correct. Better fix the bug!
So it is intended that /
does not to be escaped?
If yes, please document and close this issue
Sorry I am confused - where to document this? In the documentation now: https://regex.sorokin.engineer/en/latest/regular_expressions.html#escaping
you can prefix (or escape) with \ any character that has special meaning in regular expressions.
but slash /
does not have any special meaning in re..
If you use the regex .*?\b(https?|ftp)://(?:\w+)\.(?:\w+)\.(\w+)
from above it'll result in an error for PCRE and Javascript flavour on regex101. Probably because you can do: \stest\s/i
for case-insensitive.
See also https://stackoverflow.com/questions/6076229/escaping-a-forward-slash-in-a-regular-expression where they say it's for pattern delimiter and need escaping depending on flavour.
yes you are right - they use /i and so on in TRegExpr we use (?i) https://regex.sorokin.engineer/en/latest/regular_expressions.html#imsgxr-imsgxr
and slash
does not have any special meaning so you doe not HAVE TO escape it.
but if you want regex compatible with other regex flavours you CAN escape it and TRegExpr return the same result as without escaping.
from test 29:
.*?\b(https?|ftp)://(?:\w+)\.(?:\w+)\.(\w+)
https://regex101.com/ says:
/ An unescaped delimiter must be escaped with a backslash (\)