Open gqgs opened 3 years ago
We could replace this with RE2 (https://github.com/google/re2). There are Python bindings available (https://pypi.org/project/google-re2/).
@rllola
Many zites make use of (?!...), and RE2 doesn't seem to support it (https://github.com/google/re2/wiki/Syntax).
The problem is that we neither check patterns against a formally allowed regexp syntax nor have a formal definition of one at all. Our regexp syntax is implicitly Python's re syntax.
Not sure if it is possible to move to RE2 in a backward compatible way.
https://github.com/zeronet-enhanced/ZeroNet/commit/2a25d61b968a21aa98c6db2ca9d64f1bbdc54773
In my fork, I (temporarily) fixed this by treating ? markers the same way as other "repetitions", so the total number of repetition markers cannot exceed 9.
Not sure if it is a proper or a complete solution. I'm not familiar with the ReDoS type of attack and regexp implementation details.
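As a rough illustration of the idea described above (counting ? together with the other repetition markers and capping the total at 9), here is a minimal sketch. It is not the code from the linked commit: the helper names and the escape handling are assumptions made for this example. Note that this naive count also includes the ? in group syntax such as (?!...) and any markers inside character classes.

```python
import re

MAX_REPETITIONS = 9  # limit quoted in the comment above

# Hypothetical helper regex: skip escaped characters (a backslash plus the
# next character) and capture any unescaped *, +, ? or { character.
_MARKER_RE = re.compile(r"\\.|([*+?{])", re.DOTALL)


def count_repetition_markers(pattern: str) -> int:
    """Count unescaped *, +, ? and { characters in a pattern."""
    return sum(1 for m in _MARKER_RE.finditer(pattern) if m.group(1))


def within_repetition_limit(pattern: str) -> bool:
    """Return True if the pattern uses at most MAX_REPETITIONS markers."""
    return count_repetition_markers(pattern) <= MAX_REPETITIONS
```

A cap like this bounds how many quantifiers a pattern may contain, but it does not by itself prove that matching time stays small, which is consistent with the uncertainty expressed above.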
Step 1: Please describe your environment
Step 2: Describe the problem:
"To avoid the ReDoS algorithmic complexity attack" the function below is used to validate user-defined regular expressions.
https://github.com/HelloZeroNet/ZeroNet/blob/454c0b2e7e000fda7000cba49027541fbf327b96/src/util/SafeRe.py#L10-L22
This function fails to identify regular expressions that can require exponential time complexity to match user inputs.
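For readers unfamiliar with this failure mode, the sketch below times a classic textbook pattern with nested repetition against inputs that cannot match. It is an illustration only: the pattern is an assumption chosen for the example, it is not the reproduction from this report, and no claim is made about how isSafePattern classifies it. With a backtracking engine such as Python's re, the time to reject the input typically grows very quickly as the input gets longer, so keep n small.

```python
import re
import time

# Textbook example of nested repetition; chosen for illustration only.
NESTED = re.compile(r"(a+)+$")


def time_failed_match(n: int) -> float:
    """Return the seconds spent failing to match n 'a' characters plus 'b'."""
    subject = "a" * n + "b"
    start = time.perf_counter()
    result = NESTED.match(subject)
    elapsed = time.perf_counter() - start
    assert result is None  # the trailing 'b' makes a match impossible
    return elapsed


if __name__ == "__main__":
    # Small sizes only: larger values can take a very long time.
    for n in (12, 14, 16, 18):
        print(n, f"{time_failed_match(n):.4f}s")
```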
Steps to reproduce:
Observed Results:
match hangs and the execution never completes.
Expected Results:
isSafePattern should properly detect that the pattern is unsafe. Alternatively, match should use an algorithm with guaranteed linear time complexity to compile and match inputs (e.g. Thompson NFA).
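To make the Thompson NFA suggestion concrete, here is a minimal sketch of the technique: the pattern is compiled to an NFA and the matcher tracks the set of reachable states instead of backtracking. Everything in it (State, compile_nfa, full_match) is hypothetical and written for this example; it is neither RE2 nor ZeroNet code. It supports only literals, ., *, +, ?, | and parentheses, has no escapes, and matches the whole input. Constructs such as (?!...) are rejected.

```python
ANY = object()  # sentinel for "." (matches any single character)


class State:
    __slots__ = ("char", "out")

    def __init__(self, char=None):
        self.char = char  # a literal character, ANY, or None for an epsilon state
        self.out = []     # successors (a character state has exactly one)


def compile_nfa(pattern):
    """Compile the pattern into (start, accept) states of an NFA."""
    pos = 0

    def parse_alt():
        nonlocal pos
        start, end = State(), State()
        while True:
            s, e = parse_concat()
            start.out.append(s)
            e.out.append(end)
            if pos < len(pattern) and pattern[pos] == "|":
                pos += 1
            else:
                return start, end

    def parse_concat():
        start = end = State()
        while pos < len(pattern) and pattern[pos] not in "|)":
            s, e = parse_repeat()
            end.out.append(s)
            end = e
        return start, end

    def parse_repeat():
        nonlocal pos
        s, e = parse_atom()
        while pos < len(pattern) and pattern[pos] in "*+?":
            op = pattern[pos]
            pos += 1
            start, end = State(), State()
            start.out.append(s)
            e.out.append(end)
            if op in "*?":
                start.out.append(end)  # zero occurrences allowed
            if op in "*+":
                e.out.append(s)        # loop back for more occurrences
            s, e = start, end
        return s, e

    def parse_atom():
        nonlocal pos
        ch = pattern[pos]
        pos += 1
        if ch == "(":
            s, e = parse_alt()
            if pos >= len(pattern) or pattern[pos] != ")":
                raise ValueError("missing )")
            pos += 1
            return s, e
        if ch in "*+?":
            raise ValueError("nothing to repeat")
        s, e = State(ANY if ch == "." else ch), State()
        s.out.append(e)
        return s, e

    start, accept = parse_alt()
    if pos != len(pattern):
        raise ValueError("unbalanced )")
    return start, accept


def add_closure(state, states):
    """Add state and everything reachable through epsilon states."""
    stack = [state]
    while stack:
        s = stack.pop()
        if s not in states:
            states.add(s)
            if s.char is None:
                stack.extend(s.out)


def full_match(pattern, text):
    """Return True if the whole text matches the pattern."""
    start, accept = compile_nfa(pattern)
    current = set()
    add_closure(start, current)
    for ch in text:
        following = set()
        for s in current:
            if s.char is ANY or s.char == ch:
                add_closure(s.out[0], following)
        current = following
        if not current:
            return False
    return accept in current
```

Each input character is processed once against a state set whose size is bounded by the size of the pattern, so the running time for a fixed pattern grows linearly with the input. For example, full_match("(a+)+", "a" * 2000 + "b") returns False without any backtracking.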