NeilFraser / JS-Interpreter

A sandboxed JavaScript interpreter in JavaScript.
Apache License 2.0
2k stars 353 forks source link

Regular expressions can take arbitrarily long to execute and block .step() #152

Closed MrTrick closed 5 years ago

MrTrick commented 5 years ago

eg code like:

var before = Date.now();
var match = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaab".match(/^(a|aa+)+a+$/);
var after = Date.now();
alert("Result: " + JSON.stringify(match) + ", Took: " + (after-before) + " milliseconds");

More info on the issue at https://www.rexegg.com/regex-explosive-quantifiers.html

NeilFraser commented 5 years ago

Awesome vulnerability. Adding @cpcallen

Regular expressions are handled natively. The only solution I can think of at the moment is to handle them in the interpreter, something I really don't want to have to program.

NeilFraser commented 5 years ago

One approach would be to pass the regex match to a web worker as an async function. Then one could establish a timeout for matches. This feature wouldn't work in IE 9 (a currently supported browser) since it doesn't have web workers.

MrTrick commented 5 years ago

I wonder if it's possible to analyse regexes for that risky kind of pattern?

Or could we have a flag to disable regex support?

The web worker idea is a good one, though I'm more interested in running something on the server.

NeilFraser commented 5 years ago

I wonder if it's possible to analyse regexes for that risky kind of pattern?

This paper (linked from the bottom of the page you linked to) does present a method of detecting risky regexes: https://arxiv.org/pdf/1301.0849.pdf However, it doesn't look simple. And reading between the lines it seems to indicate that it is not comprehensive in its ability to detect all risky regexes.

Or could we have a flag to disable regex support?

That's definitely possible, and relatively easy to do too.

The web worker idea is a good one, though I'm more interested in running something on the server.

Node appears to have worker threads: https://nodejs.org/api/worker_threads.html

cpcallen commented 5 years ago

…I'm more interested in running something on the server.

I think that Node's vm module is probably the way to go, server-side, since it allows a timeout to be specified explicitly.

We're looking at using this for hard-enforcing timeouts generally (including all builtins), since there are other operations which can potentially be very slow (albeit probably not as awful as exponentially-exploding RegExps). I've not actually looked at the details of how to do this in the general case, but for RegExps specifically it should be trivial enough to pass of the RegExp and target string to vm.runInNewContext.

NeilFraser commented 5 years ago

Update on this bug. I just pushed a change that adds three options for how regular expressions are executed:

Interpreter.prototype.REGEXP_MODE = 0; // throw as invalid.
Interpreter.prototype.REGEXP_MODE = 1; // execute natively (risk of unresponsive program).
Interpreter.prototype.REGEXP_MODE = 2; // execute in separate thread (not supported by IE 9).

Currently option 1 is the default, so there's no change in behaviour. Option 2 only supports Web Workers at the moment.

TODO list:

NeilFraser commented 5 years ago

Done. https://neil.fraser.name/software/JS-Interpreter/demos/regexp.html

Thanks @MrTrick for an interesting issue.