SpiderLabs / owasp-modsecurity-crs

OWASP ModSecurity Core Rule Set (CRS) Project (Official Repository)
https://modsecurity.org/crs
Apache License 2.0
2.45k stars 726 forks source link

Update rx 942450 #1662

Closed wjwoodson closed 4 years ago

wjwoodson commented 4 years ago

Update regexp in 942450 to reduce false positives. This is essentially the change to require \b before the potential hex string mentioned in https://github.com/SpiderLabs/owasp-modsecurity-crs/issues/833.

airween commented 4 years ago

Hi @wjwoodson,

thanks for the fix - I see that the regression tests passed successfully, but I just checked your regex with pcre2test, and looks like it doesn't work:

  re> /(?i:\b[^\d]?0x[a-f\d]{3,})/
data> %5cA0xf00dsdfdsa
No match
data> foo0xf00
No match

Here is the old version:

  re> /(?i:(?:\A|[^\d])0x[a-f\d]{3,})/
data> %5cA0xf00dsdfdsa
 0: A0xf00d
data> foo0xf00
 0: o0xf00

I think we need more check.

wjwoodson commented 4 years ago

Hi @airween, I believe the urlDecodeUni transform in this rule is responsible for the disparity -- %5cA0xf00dsdfdsa should transform to \A0xf00dsdfdsa which will match.

fgsch commented 4 years ago

I think you could drop the [^\d]? but otherwise looks good to me.

fgsch commented 4 years ago

Actually, @airween is correct in that your pattern will not match e.g. fooA0xf00 (or %5cA0xf00dsdfdsa for the matter) because the \b after the [^\d]? needs a different type before the A (both o and A are in \w).

That said, I think fooAOxf00 (or %5cA0xf00dsdfdsa) should not be matching. In other words, it should recognise 0x[a-f0-9]{3,} iff is not part of a longer string.

wjwoodson commented 4 years ago

Yes var=foo0xf00 is a negative test and is intended not to match. As for [^\d]? I was really just trying to stay consistent with the existing test var=\A0xf00dsdfdsa. If it makes sense to you I will just simplify the expression to \b0x[a-f0-9]{3,} and update tests to match.

fgsch commented 4 years ago

I'm inclined to say so (drop the [^\d]?) but let's wait for other folks' input.

airween commented 4 years ago

I believe the urlDecodeUni transform in this rule is responsible for the disparity -- %5cA0xf00dsdfdsa should transform to \A0xf00dsdfdsa which will match.

ah, sorry, you're right, I totally forgot to check the rule. Sorry again.

dune73 commented 4 years ago

This looks like a good progress on this front. You guys understand much more of this than I do, but I'd lean on dropping the [^\d] too.

942450 is a bit thin on tests. Here is some stuff to enhance the test suite. If you find it useful @wjwoodson, then it might pay to pick some, rework them a bit and add them to the unit tests.

942450-tests.yaml.tar.gz

wjwoodson commented 4 years ago

Hi all, I've updated to remove [^\d]? from the expression and updated/added tests.

fgsch commented 4 years ago

LGTM.

dune73 commented 4 years ago

Thanks for the PR, the update and the additional tests, @wjwoodson. Merging now.