firasdib / Regex101

This repository is currently only used for issue tracking for www.regex101.com

Escaped delimiter in Rust shows error #2034

Closed OnlineCop closed 1 year ago

OnlineCop commented 1 year ago

Bug Description

The default Rust delimiter shows as r"...".

Using an escaped \" double quote shows the error:

\" This token has no special meaning and has thus been rendered erroneous.

Using a doubled "" double quote (as you might use in C#) shows the error:

" An unescaped delimiter must be escaped; in most languages with a backslash (\)

Reproduction steps

For the escaped double quote: https://regex101.com/r/EFiXGc/1 where the expression is \"(.*?)\".

For both an escaped double quote and a doubled double quote: https://regex101.com/r/EFiXGc/2 where the expression is \"(.*?)"". This shows both the error above for the escaped double quote and the error about an unescaped delimiter.

Expected Outcome

With the default delimiter of ", escaping the delimiter character as \" (or however Rust requires it) should match the double-quote character literal.

Browser

Vivaldi (Chromium) 64-bit (v5.7)

User Agent

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Safari/537.36

OS

Windows 10 v21H2

OnlineCop commented 1 year ago

Possible duplicate of #2033.

tgross35 commented 1 year ago

A pattern like \"(.*?)\" with r" delimiters would not be correct, since the \ is not treated as an escape and the " is taken as the literal end of the string (let s = r"\"(.*?)\""). Upgrading the delimiters to r#" ... "# should work with this pattern, but it also shows the error.
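As a minimal sketch of the point above, the pattern from the issue can be written inside r#" ... "# so that both the backslashes and the quotes stay literal. The function name quoted_pattern is hypothetical and only used for illustration; it is not part of regex101 or wasm-regex.

```rust
/// The pattern from the issue, written with one-hash raw delimiters.
/// Inside r#"..."# a `"` only ends the string when it is followed by `#`,
/// so the inner `\"` sequences are kept verbatim.
pub fn quoted_pattern() -> &'static str {
    r#"\"(.*?)\""#
}

/// The same pattern as an ordinary string literal, where every backslash
/// and every quote has to be escaped.
pub fn quoted_pattern_escaped() -> &'static str {
    "\\\"(.*?)\\\""
}
```

Both functions return the same nine characters, so either form hands the same expression to a regex engine.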

tgross35 commented 1 year ago

For completeness, this is the pattern that should be followed:

| User selected pattern | Submit to the WASM function for the escape string | Disallowed pattern |
|---|---|---|
| nothing | ignore | none |
| " ... " | str | " (\" allowed) |
| r" ... " | raw | " (\" not allowed - no escapes) |
| r#" ... "# | rawhash1 | "# |
| r##" ... "## | rawhash2 | "## |
| r###" ... "### | rawhash3 | "### |
| r####" ... "#### | rawhash4 | "#### |
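For the raw variants, the disallowed pattern is simply the closing delimiter: a " followed by as many # characters as the opening delimiter uses. A minimal sketch of that rule follows; contains_raw_terminator is a hypothetical helper written for this illustration, not a function from wasm-regex.

```rust
/// Returns true if `pattern` contains the sequence that would close a raw
/// string written with `hashes` hash marks: a `"` followed by `hashes` `#`s.
/// With `hashes == 0` (plain r"...") any `"` at all is disallowed.
pub fn contains_raw_terminator(pattern: &str, hashes: usize) -> bool {
    // Build the closing delimiter, e.g. `"##` for hashes == 2.
    let terminator = format!("\"{}", "#".repeat(hashes));
    pattern.contains(&terminator)
}
```

Under this rule, a pattern containing "# cannot be written with r#" ... "# but can be written with r##" ... "##.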

The Rust code also validates this. The logic is at https://github.com/tgross35/wasm-regex/blob/46ad6843112b2efac6d43989647858188d55bc10/src/strops.rs#L248-L271 and is actually pretty thorough: check_unescaped_quotes will correctly allow \\\\\" but not \\\\\\". So I think you could potentially rely on the WASM to do this check, but I'm not sure how the error display is handled.
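The two examples differ only in whether the quote is preceded by an odd or an even number of backslashes. The sketch below illustrates that idea under the assumption that this parity is what decides the result; it is not the linked check_unescaped_quotes implementation, and has_unescaped_quote is a hypothetical name.

```rust
/// Returns true if `s` contains a `"` that is not escaped, i.e. one that is
/// preceded by an even number (including zero) of consecutive backslashes.
pub fn has_unescaped_quote(s: &str) -> bool {
    // Length of the run of backslashes immediately before the current char.
    let mut backslashes = 0usize;
    for c in s.chars() {
        match c {
            '\\' => backslashes += 1,
            '"' => {
                // An even run means the backslashes escape each other,
                // leaving the quote itself unescaped.
                if backslashes % 2 == 0 {
                    return true;
                }
                backslashes = 0;
            }
            _ => backslashes = 0,
        }
    }
    false
}
```

With five backslashes before the quote, four pair off and the fifth escapes the quote, so the input is accepted. With six, all of them pair off and the quote is left unescaped.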

tgross35 commented 1 year ago

Also, minor, but the opening delimiter [image] should be r#" (same for the other hash counts). The closing delimiters look correct.

The dropdown menu and the result of hitting the copy button also don't display the r or the ". I figure this is just because the other languages don't have asymmetrical delimiters.

firasdib commented 1 year ago

Thank you @tgross35

I don't think I can support "..." since I assume you'd have to escape every backslash in this mode.

As for r"...", how do you insert a quote in this mode? Is there no way to do it?

I will adjust the implementation to support the details described in your posts above.

firasdib commented 1 year ago

This will be resolved in the next release.

tgross35 commented 1 year ago

Thanks Firas. To my knowledge, both those statements are correct: "..." requires escaping (so "foo\\bar\"baz\n" is valid, "foo"bar" is not), and there is no way to write a single " in an r"..." string (you have to use the hashed version).
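A short sketch of the two cases, using the example strings from this comment. The function names are hypothetical and exist only for illustration.

```rust
/// An ordinary string literal: `\\` is one backslash, `\"` is a quote and
/// `\n` is a newline, so the value is foo\bar"baz followed by a newline.
pub fn escaped_literal() -> &'static str {
    "foo\\bar\"baz\n"
}

/// A quote cannot appear inside r"...", so the hashed raw form is used.
/// The value is foo"bar.
pub fn raw_literal_with_quote() -> &'static str {
    r#"foo"bar"#
}
```

Writing "foo"bar" directly does not compile, because the literal ends at the second quote.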