firasdib / Regex101

This repository is currently only used for issue tracking for www.regex101.com

Delimiter/quotes issue with regex with escape char #2304

Closed yelliver closed 1 day ago

yelliver commented 2 days ago

Bug Description

Reproduction steps

image

Expected Outcome

"\\s"

Browser

any

OS

any

working-name commented 1 day ago

Hello @yelliver

The site is not a JavaScript fiddle; it's a pure regex tester. JavaScript doesn't actually touch your regex or test string. What is called a "flavor" is the set of rules for the regex engine that JavaScript implements.

The Code Generator under Tools would yield what you expect. Keep in mind that the site doesn't "conform" your regex to the programming language you select in the code generator: the regex is unchanged, just escaped as an example.
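To make "just escaped as an example" concrete, here is a minimal sketch of turning a raw pattern into a JavaScript string literal. This is not regex101's code generator; `toStringLiteral` is a hypothetical helper that only illustrates why the pattern \s shows up as "\\s" once it lives inside a string.

```javascript
// Hypothetical helper, for illustration only (not regex101's code generator).
// Turns a raw regex pattern into a double-quoted JavaScript string literal.
function toStringLiteral(pattern) {
  // JSON.stringify escapes backslashes and double quotes,
  // which is what a double-quoted string literal needs.
  return JSON.stringify(pattern);
}

// String.raw keeps the backslash as typed: the two characters \ and s.
const rawPattern = String.raw`\s`;
const literal = toStringLiteral(rawPattern);

console.log(literal); // prints "\\s"

// The regex itself is unchanged: it still matches a whitespace character.
console.log(new RegExp(rawPattern).test(" ")); // prints true
```

The pattern the engine sees is the same in both cases; only its spelling in source code differs.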

Hope this helps :)

yelliver commented 1 day ago

Could you please explain the purpose of the " when I click the copy button? Honestly, I don't know of any scenario where someone would directly use that clipboard result (with the ") anywhere. I mean, if it does not escape, then it should not include the " character.

working-name commented 1 day ago

Sure! That is called a delimiter, and by default it's /. So /test/gm is the same as "test"gm. Different regex engines allow different delimiters.

The main purpose of it is to mark where the regex begins, where it ends, and what flags are to be applied (the gm are flags that modify the regex engine's behavior). This means that if you paste what you copied into a new regex101.com window, it will parse it out, separate the regex from its flags, and use the delimiter you chose if it is available in that flavor.

Basically you're talking to the regex engine directly, rather than relying on string processing for whatever programming language.
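As a rough illustration of that parsing step, the sketch below splits a delimited regex into its delimiter, pattern, and flags. `splitDelimited` is a hypothetical function, not the site's actual parser, and it assumes the first character is the delimiter and that the flags never contain the delimiter character.

```javascript
// Hypothetical parser, for illustration only; regex101's real parsing is not shown in this thread.
// Assumes: the first character is the delimiter, and the flags do not contain it.
function splitDelimited(input) {
  const delimiter = input[0];
  // The last occurrence of the delimiter closes the pattern; anything after it is flags.
  const end = input.lastIndexOf(delimiter);
  if (input.length < 2 || end === 0) {
    throw new Error("missing closing delimiter");
  }
  return {
    delimiter,
    pattern: input.slice(1, end),
    flags: input.slice(end + 1),
  };
}

// Both spellings carry the same pattern ("test") and the same flags ("gm").
console.log(splitDelimited("/test/gm"));
console.log(splitDelimited('"test"gm'));
```

Under these assumptions, a copied value such as "\s" splits into the delimiter ", the pattern \s, and no flags; nothing in the pattern is escaped for any programming language.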

I understand that it can get confusing because some programming languages abstract that internally so you don't need to deal with it. If we stick with JavaScript, there are 2 ways I know of to write a regex:

"This string is full of words!".match(/\w+/gm)
// results in
[
    "This",
    "string",
    "is",
    "full",
    "of",
    "words"
]

or this way:

"This string is full of words!".match(new RegExp('\\w+', 'gm'))

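One way to check that the two notations above describe the same regex is to compare what the engine ends up with. This is a small sketch; the variable names are made up for the example, and `String.raw` is shown only as one possible way to avoid doubling the backslash.

```javascript
// Literal notation: the backslash is written once.
const fromLiteral = /\w+/gm;

// Constructor notation: the pattern is a string, so the backslash is doubled.
const fromConstructor = new RegExp('\\w+', 'gm');

// Both objects hold the same pattern text and the same flags.
console.log(fromLiteral.source === fromConstructor.source); // prints true
console.log(fromLiteral.flags === fromConstructor.flags);   // prints true

// String.raw leaves backslashes untouched, so no doubling is needed.
const fromRaw = new RegExp(String.raw`\w+`, 'gm');
const words = "This string is full of words!".match(fromRaw);
console.log(words.length); // prints 6
```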
Notice that you need to escape \ as \\ in the string passed to RegExp, but not in the literal /.../gm notation. These are all artifacts of the language (JavaScript). Now, if you wanted to match a / inside the regex, the story changes:
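
// Literal notation: / is the delimiter, so a / inside the pattern must be written as \/
"this/that".match(/\w+\/\w+/)
// or...
// RegExp string: there is no / delimiter, so the \ before / is optional ('\/' is just '/')
"this/that".match(new RegExp('\\w+\/\\w+', 'gm'))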