tailwindlabs / tailwindcss-intellisense

Intelligent Tailwind CSS tooling for Visual Studio Code
2.75k stars 183 forks source link

Fix crash when class regex matches an empty string #897

Closed thecrypticace closed 5 months ago

thecrypticace commented 6 months ago

This removes becke-ch--regex--s0-0-v1--base--pl--lib in lieu of native support for capture group indices. This is supported in Node 16+ so will work in VSCode — additionally all modern browsers have supported this since late September 2021 (Safari 15, Chrome/Edge 90, Opera 76, FF 88) so this shouldn't be an issue for Tailwind Play either.

As part of this change I've also done some refactoring to reduce duplication in code that used the above package — and added more tests!

Fixes #893

thecrypticace commented 6 months ago

@mfb-davidmay I think that's worth doing but as a separate thing. Adding logging here could be quite noisy as it's used during editing when triggering completions.

thecrypticace commented 6 months ago

Bit of a stopper at the moment — running this with the regexes:

/tv\((([^()]*|\([^()]*\))*)\)/gd
/["'`]([^"'`]*).*?["'`]/gd

Results in what looks to be an infinite loop in v8. It'll proceed after exec() a handful of times and then it'll all of a sudden stop working and exec never returns.

Running a simple profile against the process looks like its spending time inside v8::internal::RegExpMacroAssembler::LoadCurrentCharacter

It happens reliably in my environment:

Version: 1.85.1 (Universal)
Commit: 0ee08df0cf4527e40edc9aa28f4b5bd38bbff2b2
Date: 2023-12-13T09:48:06.308Z (3 wks ago)
Electron: 25.9.7
ElectronBuildId: 25551756
Chromium: 114.0.5735.289
Node.js: 18.15.0
V8: 11.4.183.29-electron.0
OS: Darwin arm64 23.2.0

I need to test this against a similar node.js / v8 version to see if I can reproduce it there.

thecrypticace commented 5 months ago

Okay I guess I wasn't paying too close attention to the regex but it has nested * specifiers which means that it's subject to catastrophic backtracking.

Take the regex: /tv\((([^()]*|\([^()]*\))*)\)/gd

If you look closely you can see an external group with * and, inside that group, another one that uses * too. This means that the internal group can match zero or more times AND the external group can.

Here is a simplified regex with the same situation — but less capture groups and alternations: /v\(((?:[^()]*)*)\)/gd

On a regex like this it takes almost 20s to match the string v({ foobar: '' // aaaaaaaaaaaaaaa on my M3 Max. And this is exponentially worse for every character added on the order of 2^n. At only 79 as it'd take my computer nearly 1,000x longer than the current age of the universe to compute. I think thats a little longer than I'd be willing to wait.

All in all, this isn't a bug, just a regex that's going to take quite literally forever if it doesn't immediately find a match.

That means this should be safe to merge. 😀