quick-lint / quick-lint-js

quick-lint-js finds bugs in JavaScript programs
https://quick-lint-js.com
GNU General Public License v3.0

12$: warn on combining characters in regexp character classes #1176

Open strager opened 7 months ago

strager commented 7 months ago

https://biomejs.dev/linter/rules/no-misleading-character-class/

CoderMuffin commented 5 months ago

Hi @strager, I would be interested in tackling this issue :) Are strings within the parser encoded in any specific format (UTF-8, UTF-16) or does it need to be able to "catch-all"?

CoderMuffin commented 3 months ago

Hi @strager, sorry for the ping, just wondering if there was any update on this? Thanks :)

strager commented 3 months ago

> Are strings within the parser encoded in any specific format (UTF-8, UTF-16) or does it need to be able to "catch-all"?

Within quick-lint-js's parser, source code is in UTF-8.

Here's the code which detects character classes in regexps: https://github.com/quick-lint/quick-lint-js/blob/68bd5cb4f49b847511ad7c6c09e12b5a3cb5689d/src/quick-lint-js/fe/lex.cpp#L1183

It's bare-bones because we don't do anything with the character classes currently. Feel free to rip it apart to implement this diagnostic.
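
For illustration only, here is a rough, standalone sketch of what such a check could look like on UTF-8 input. It is not quick-lint-js code: `decode_utf8`, `is_combining_mark`, and `has_combining_mark_in_class` are hypothetical names, the input is assumed to be well-formed UTF-8 holding just the regexp pattern, and the combining-mark test only covers the main combining diacritical blocks rather than every Unicode mark.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical helper (not a quick-lint-js API): decode one code point from
// UTF-8 starting at s[i] and advance i past it. Assumes well-formed input;
// a real lexer would have to validate the byte sequence.
char32_t decode_utf8(std::string_view s, std::size_t& i) {
  unsigned char b0 = static_cast<unsigned char>(s[i++]);
  if (b0 < 0x80) return b0;  // ASCII
  int extra = (b0 >= 0xF0) ? 3 : (b0 >= 0xE0) ? 2 : 1;  // continuation bytes
  char32_t cp = b0 & (0x3F >> extra);  // payload bits of the lead byte
  for (int k = 0; k < extra && i < s.size(); ++k) {
    cp = (cp << 6) | (static_cast<unsigned char>(s[i++]) & 0x3F);
  }
  return cp;
}

// Approximation (an assumption of this sketch): only the combining
// diacritical blocks are checked, not every code point in category M.
bool is_combining_mark(char32_t c) {
  return (c >= 0x0300 && c <= 0x036F) || (c >= 0x1AB0 && c <= 0x1AFF) ||
         (c >= 0x1DC0 && c <= 0x1DFF) || (c >= 0x20D0 && c <= 0x20FF) ||
         (c >= 0xFE20 && c <= 0xFE2F);
}

// Returns true if a combining mark follows a base character inside a
// [...] character class of the given regexp pattern.
bool has_combining_mark_in_class(std::string_view pattern) {
  bool in_class = false;
  bool have_base = false;  // saw a non-mark character in the current class
  for (std::size_t i = 0; i < pattern.size();) {
    char32_t c = decode_utf8(pattern, i);
    if (c == U'\\') {
      // Skip the escaped character; inside a class, treat it as a base.
      if (i < pattern.size()) decode_utf8(pattern, i);
      have_base = in_class;
    } else if (!in_class) {
      if (c == U'[') {
        in_class = true;
        have_base = false;
        if (i < pattern.size() && pattern[i] == '^') ++i;  // negation marker
      }
    } else if (c == U']') {
      in_class = false;
    } else if (is_combining_mark(c)) {
      if (have_base) return true;
    } else {
      have_base = true;
    }
  }
  return false;
}
```

With this sketch, `has_combining_mark_in_class("[e\xCC\x81]")` reports a problem, because `e` followed by U+0301 inside a class matches either code point on its own rather than the accented letter. A real diagnostic would also need to report the source location and decide how to handle `\u` escapes and the `v` flag, which this sketch ignores.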