Open ChALkeR opened 3 years ago
Thank you for your (very thorough) report!
I agree that this looks like a false positive. I will ping the JavaScript team to see if they can figure out what's going on.
That looks like a TP to me (at least the regular expression part).
What many forget is that a missing start anchor means that the regular expression will match any prefix.
So the regular expression /\/?[^/]*$/
is equivalent to the regular expression /^[^]*\/?[^/]*$/
.
(If the syntax is confusing, then this page might help).
So there are actually 3 parts in the regular expression. [^]*
, \/?
, and [^/]*
.
The first and the third both match a potential infinite string, and they both match many of the same characters, which can cause polynomial ReDoS.
Here is a little proof of concept you can run to see the worst-case runtime:
var reg = /\/?[^/]*$/;
var str = 'a';
while(true) {
const startTime = new Date().getTime();
reg.test(str + "/");
const endTime = new Date().getTime();
console.log("Str size:" + str.length + " | Time: " + (endTime - startTime) + "ms");
str += str;
}
The performance is only quadratic in the size of the input, so it's not that bad.
@erik-krogh Hm. That's basically the same as [^/]*$
or a*$
in this case, which shows the exact same results.
It's... interesting that this isn't optimized automatically.
Thanks!
In this case, e.g. a*$
also should be detected as it's also quadratic in v8 impl:
let reg = /a*$/;
let str = 'a';
while(true) {
const startTime = new Date().getTime();
reg.test(str + "/");
const endTime = new Date().getTime();
console.log("Str size:" + str.length + " | Time: " + (endTime - startTime) + "ms");
str = 'a'.repeat(str.length * 2); // + removed
}
cc @davisjam perhaps to confirm the above?
@ChALkeR As of ~2 years ago (when I took measurements), regexes like a*$
can be quadratic in many regexes engines under partial-match semantics, including V8. These regex engines try each starting offset, i.e. implicitly adding a leading ^.*?
.
Description of the false positive
https://lgtm.com/projects/g/ExodusMovement/schemasafe/snapshot/5d16dfc8e862db856e8bef5c5f92845546eb6c05/files/src/pointer.js?sort=name&dir=ASC&mode=heatmap#x42c7ec6a02df0e9e:1
I don't think that
/\/?[^/]*$/
has a polynomial ReDoS.The actual regex pattern is
/?[^/]*$
.1) The first part has length limited to 1 (i.e. 0 or 1 repetitions).
2) The first and the second parts don't have common allowed symbols (the first allows only
/
, the second doesn't allow/
), so there should be no polynomial redos possible here, even if the first part would have allowed more repetitions.Any of these two would suffice alone to make it not a ReDoS.
Also https://github.com/NicolaasWeideman/RegexStaticAnalysis returns this:
Unlike, for example,
x*[^/]*$
which does not satisfy any of (1) and (2) above and has a polynomial ReDoS (IDA):Note that Weideman's tool analysis typically has to be rechecked because of differences in regex engines. And some patterns are not analyzable by it and require preparation.
Also refs: https://github.com/github/codeql/issues/2804#issuecomment-715554498