uhop / node-re2

node.js bindings for RE2: fast, safe alternative to backtracking regular expression engines.

New nonbacktracking algorithm for lookbehinds and lookarounds #199

Open donabrams opened 7 months ago

donabrams commented 7 months ago

I noticed a very recent paper by @aurele-barriere & @cpitclaudel that describes an algorithm for js regex lookbehinds and lookarounds that isn't vulnerable to ReDoS. Think it'd be worth adding to re2? I'm unemployed ATM, so this could be fun for me if it's the kind of direction you'd like to go.

uhop commented 7 months ago

That would be very cool.

I suggest writing a minimal C/C++ implementation. We can dress it up as a Node extension later using the technique I used for node-re2, Node-API, or wasm.

Obviously, you can start with a POC written in JS, which is neat by itself.

Ultimately, if it works, it can be a sister project for node-re2.

Aurele-Barriere commented 7 months ago

Hi, thanks for posting our work here! We are currently focusing on implementing one version of that algorithm in the V8 JavaScript engine: https://bugs.chromium.org/p/v8/issues/detail?id=14435

Let us know if we can be of any assistance.

cpitclaudel commented 7 months ago

Hey @donabrams and @uhop! As @Aurele-Barriere said, let us know if we can help — we'd love to see an implementation of our algorithms in re2 :)

uhop commented 7 months ago

@Aurele-Barriere and @cpitclaudel: I want to clarify that I am not a maintainer of google/re2. I maintain the Node bindings for that library, and I am interested in modern, fast tools that are resilient to ReDoS yet mimic standard JS regular expressions as closely as possible, so that they can be used as a drop-in replacement.

While I can switch the underlying libraries, if you are interested in incorporating your code in google/re2, you should talk to different people. Naturally, I assume that would be some faceless shirts from some bowels of Google. :-D

uhop commented 1 month ago

I finally read the article and it is excellent! That’s exactly what we practitioners need! Major kudos to @Aurele-Barriere and @cpitclaudel for all the hard work they did with regular expressions! I hope to see the results in the wild soon.