corazawaf / coraza

OWASP Coraza WAF is a golang modsecurity compatible web application firewall library
https://www.coraza.io
Apache License 2.0
2.27k stars 225 forks source link

Incorrect parsing of regex in SecRule with multiple ARGS specifiers #1233

Open geoolekom opened 10 hours ago

geoolekom commented 10 hours ago

Description

Coraza does not correctly parse regex arguments in SecRule. Let's consider a rule:

SecRule ARGS|!ARGS:/um_options\[/|!ARGS:um_options[checkmail_email] "@rx \x22|<" "t:none,t:urlDecodeUni,deny"

Coraza treats the entire string um_options\[/|!ARGS:um_options[checkmail_email] as the regex instead of isolating um_options\[ as the intended regex. This behavior leads to incorrect rule parsing and potentially unintended behavior when matching against request arguments.

Steps to reproduce

Create a SecRule directive similar to the following:

SecRule ARGS|!ARGS:/um_options\[/|!ARGS:um_options[checkmail_email] "@rx \x22|<" "t:none,t:urlDecodeUni,deny"

Run WAF that uses the rule.

Expected result

Coraza should interpret the regex /um_options\[/ correctly as a standalone match and not combine it with the rest of the string, i.e., |!ARGS:um_options[checkmail_email].

Actual result

Coraza incorrectly parses the regex as um_options\[/|!ARGS:um_options[checkmail_email], leading to an error:

2024/11/22 14:19:37.553922 Failed to configure engine: invalid WAF config from file: failed to parse string: failed to parse string: failed to compile the directive "secrule": error parsing regexp: missing closing ]: `[checkmail_email`

With other regexes, it might lead to unexpected behavior and possibly incorrect matching of arguments.

I’m planning to work on a fix for this issue to make sure that regex parsing in SecRule works as expected. If the maintainers have any concerns or suggestions about this approach, I’d be happy to hear them before moving forward. Thank you!

fzipi commented 10 hours ago

Hopefully this is solved once we incorporate the new parser!

geoolekom commented 9 hours ago

@fzipi This bug happens because ParseVariables function doesn’t handle cases where a regex contains special characters like |. It tries to split everything using |, but if | is part of the regex, it gets confused and treats way more of the input as part of the regex than it should.

It doesn’t seem too hard to fix. We just need to handle this specific case properly and add some tests. No need to wait for a full parser refactoring to fix it. If you don't mind, I'd provide a PR.

fzipi commented 8 hours ago

Please do!

jcchavezs commented 7 hours ago

Please do!