dart-lang / linter

Linter for Dart.
https://dart.dev/tools/linter-rules
BSD 3-Clause "New" or "Revised" License
635 stars 170 forks source link

valid_regexps should warn about string literal #1049

Open alescdb opened 6 years ago

alescdb commented 6 years ago

The regular expression new RegExp('\(([^\)]*)\)'); is not valid in dart, and (I think) we should be warned about it by lint.

ref : https://github.com/dart-lang/sdk/issues/33669

zoechi commented 6 years ago

Seems to be a valid regex according to the comments in the linked issue. What warning would you expect?

pq commented 6 years ago

If I'm understanding, I think the issue is not that it's an invalid regular expression but rather that it does not match what you think it should (e.g., in this case just a literal string match). I think I know what you mean though. Maybe something along the lines of a warning when you are using a regexp when a string match mould do? In that case, maybe another lint for unecessary_regexp or something? My worry here would be potential complexity and false positives.

Thanks for opening this issue!

zoechi commented 6 years ago

I interpreted it that it should warn about the missing r prefix.

pq commented 6 years ago

I interpreted it that it should warn about the missing r prefix.

Oh! (It's early here; need more ☕️.)

That's easier but likely to flag a bunch of false positives? For example, I count ~12 regexps in the linter package that are valid but not defined w/ raw strings...

alescdb commented 6 years ago

Sorry if my request is not clear :-) The regexp is indeed valid and it compiles, but it does not match as expected in other languages because as @lrhn says in it's comment \(\) is 'equivalent' to (). Maybe a warning when using escaped characters without raw string should be enough ?

pq commented 6 years ago

No worries @alescdb. This is a great conversation. Thanks for kicking it off!

Maybe a warning when using escaped characters without raw string should be enough ?

Interesting idea. My gut says that should be another lint. (These are not technically invalid and may, in odd cases, actually be what you have in mind? 🤔 ) The trick is in writing the write prescription. We're not checking for validity exactly. Maybe we just want something that flags regexp_literal_escapes?

Want to take a stab at writing a sentence or two description? 😄

alescdb commented 6 years ago

If escaping parenthesis (\(.?*)\) is 'translated' in an unescaped parenthesis ((.?*)) why would you escape it on purpose ? I thougth it was a RegExp bug because in my opinion this is not a normal behavior compared to other languages (C#, PHP, grep, ...)

pq commented 6 years ago

Ah, OK. I honestly have to look up the grammar for regexps every time I use them for exactly the reason you describe. 🙄 Definitely annoying and if analysis can help, all the better.

Flagging this one as "help wanted" since I'm lacking the required fluency and updating the title too.

Thanks!

/cc @bwilkerson

pq commented 6 years ago

For the curious, Dart regular expressions have the same syntax and semantics as JavaScript regular expressions. Here's the relevant bit of grammar:

http://ecma-international.org/ecma-262/5.1/#sec-15.10.1