feat: Allow regular expressions in ctl:ruleRemoveTargetByX variable names II.

airween commented 1 month ago

This PR is the renewal and addition of the PR #1683, and solves #911.

Example:

SecRule REQUEST_URI "@beginswith /index.php"
    "id:1001,phase:1,pass,nolog,
    ctl:ruleRemoveTargetById=942100;ARGS:/^password[\d+]$/"

The new patch works with PCRE2 too.

Actually it is only for v2, so until there is no patch for v3 I don't want to merge it (hope I can do that soon).

Many thanks for your work, @vvidic, the credit is yours.

sonarcloud[bot] commented 1 month ago

Quality Gate failed

Failed conditions
3 Security Hotspots
B Maintainability Rating on New Code (required ≥ A)

See analysis details on SonarCloud

Catch issues before they fail your Quality Gate with our IDE extension SonarLint

airween commented 1 month ago

strlen(value) could be cached in a variable, as it's used several times

Thanks, fixed in c796804adf13b5c0efe2de430a3aa68360179c0a.

airween commented 1 month ago

if(strlen(value) > 2 && value[0] == '/' && ... Better to begin with check of value[0] as, most of the time, the test will be false.

I introduced the value_len which is an int, I think now the order is no matter anymore, so I kept it as is.

airween commented 1 month ago

if (regex == NULL) { ... We should return 0 immediately to simplify the flow.

Let's do it for if (targets != NULL) also when we're at it

I think we can change it in a next round. regex value is produced by msc_pregcomp(), which returns NULL in case of failure, so I should compare it with a correct type.

airween commented 1 month ago

if((strlen(myvalue) == strlen(value)) && strncasecmp(myvalue,value,strlen(myvalue)) == 0) Why not if(strcasecmp(myvalue,value,strlen(myvalue)) == 0)

strlen(value) is replaced by value_len, and I think it's faster than the strncasecmp(). If the length of strings are not equal, then the engine does not start to compare the strings. I think this make sense.

airween commented 1 month ago

... Simplify: ...

Thanks, this make sense - see 1790659237b1071a1d4e39ad03d484ce5a6a5f08.

airween commented 1 month ago

Inside if(strncasecmp(myname, name,strlen(myname)) == 0) there's a test if(value != NULL && myvalue != NULL) If value is not NULL, like TX:a, REQUEST_HEADERS:abc, 'myvalue' (corresponding to the actual target) cannot be NULL (TX alone is not possible). Thus if(value != NULL) is sufficient

After a quick analysis I'm not sure about that. value holds the inspected exclusion target in the cycle. But myvalue can be NULL even the value is not null. myvalue derived from the inspected target.

Even though it might myvalue depends on value, but an extra comparing is not a big load. I suggest we should keep that there.

airween commented 1 month ago

Thus if(value != NULL) is sufficient

Using both checks prevents the segfault. If the user's config is wrong, then it can lead to myvalue is NULL but value is not, eg:

SecRule REQUEST_FILENAME "@rx /admin" \
    "id:1001,\
    phase:1,\
    t:none,\
    nolog,\
    ctl:ruleRemoveTargetById=2001;REMOTE_HOST:/^foo.*$/"

which makes no sense, but if the user confuses a collection with a variable, then it could be a problem. I suggest to keep that there.

marcstern commented 4 weeks ago

Thus if(value != NULL) is sufficient

Using both checks prevents the segfault. If the user's config is wrong, then it can lead to myvalue is NULL but value is not, eg:
SecRule REQUEST_FILENAME "@rx /admin" \
    "id:1001,\
    phase:1,\
    t:none,\
    nolog,\
    ctl:ruleRemoveTargetById=2001;REMOTE_HOST:/^foo.*$/"
which makes no sense, but if the user confuses a collection with a variable, then it could be a problem. I suggest to keep that there. This faulty syntax should be detected and refused, but that's another issue. If we want to support that case, then if (value && *value) is sufficient

if (regex == NULL) { ... We should return 0 immediately to simplify the flow. Let's do it for if (targets != NULL) also when we're at it

I think we can change it in a next round. regex value is produced by msc_pregcomp(), which returns NULL in case of failure, so I should compare it with a correct type.

The following code is strictly equivalent to the existing one, except that it removes one 'if' level (return early) and simplify a bit this code that becomes complex to follow:

if (targets == NULL) {
            if (msr->txcfg->debuglog_level >= 9) {
                msr_log(msr, 9, "fetch_target_exception: No exception target found for rule id %s.", rule->actionset->id);
            return 0;
        }

airween commented 3 weeks ago

The following code is strictly equivalent to the existing one, except that it removes one 'if' level (return early) and simplify a bit this code that becomes complex to follow:
if (targets == NULL) {
            if (msr->txcfg->debuglog_level >= 9) {
                msr_log(msr, 9, "fetch_target_exception: No exception target found for rule id %s.", rule->actionset->id);
            return 0;
        }

I see your point, yes, that would make the code more cleaner.

airween commented 3 weeks ago

I'm wondering why is a cycle here, I mean target created by split of targets by , (comma) which is a copy of exceptions here.

This would be make sense if the syntax of rule were allowed multiple targets, eg:

   ctl:ruleRemoveTargetById=9XXXXX;ARGS_GET,ARGS_POST,ARGS

but this is not allowed.

Variable exceptions passed when the function is called here, the item build in this line.

@marcstern, @dune73 do you have any idea, what was the original aim of this solution?

marcstern commented 3 weeks ago

Where is it defined that multiple targets is not allowed? Remark: in case it's allowed, it would be quicker to return 1 immediately instead of parsing all targets.

airween commented 3 weeks ago

Where is it defined that multiple targets is not allowed?

How looks like the syntax?

marcstern commented 3 weeks ago

Where is it defined that multiple targets is not allowed? How looks like the syntax? ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

airween commented 3 weeks ago

ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

The parser does not allow this syntax. I tried this and other similar syntax's with no luck. The error message is:

AH00526: Syntax error on line...
Error parsing actions: Unknown action: ARGS

The , (comma) character is the action separator, even we use a quoted form (I tried to quote the whole argument like you (including rule id), or just quote the targets).

marcstern commented 3 weeks ago

ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"
The parser does not allow this syntax. I tried this and other similar syntax's with no luck. The error message is:
AH00526: Syntax error on line...
Error parsing actions: Unknown action: ARGS
The , (comma) character is the action separator, even we use a quoted form (I tried to quote the whole argument like you (including rule id), or just quote the targets).

My syntax was indeed incorrect. The following is accepted: ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

dune73 commented 3 weeks ago

@airween asked me to chime in. Sorry to be so silent otherwise.

I see how the definition of multiple targets on a single runtime rule exclusion within single quotes would work syntactically. But I do not think it brings much of an advantage beyond an assumed minimal speed improvement.

The rule exclusion becomes harder to read and if you think of a graphical rule editor the UX designer would probably curse you using very harsh words for this addition.

Or in other words: Adding a regex to the target definition: Solves an itch where we have no alternative. Adding the option to add multiple arguments to a single ctl statement: Solves a small problem at the cost of hard to maintain config.

airween commented 3 weeks ago

ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"
My syntax was indeed incorrect. The following is accepted: ctl:ruleRemoveTargetById='942100;ARGS:/^password[\d+]$/,ARGS:abc'"

Sorry, but I don't see any differences.

airween commented 3 weeks ago

I see how the definition of multiple targets on a single runtime rule exclusion within single quotes would work syntactically.

do you think this syntax works? I wasn't able to write any rule with multiple targets.

But I do not think it brings much of an advantage beyond an assumed minimal speed improvement.

I'm not sure this brings any speed improvements. In both cases there are the same cycles.

The rule exclusion becomes harder to read and if you think of a graphical rule editor the UX designer would probably curse you using very harsh words for this addition.

This is a valuable argument on long time, but may be we don't need to care of this issue actually.

Or in other words: Adding a regex to the target definition: Solves an itch where we have no alternative. Adding the option to add multiple arguments to a single ctl statement: Solves a small problem at the cost of hard to maintain config.

Actually regex in targets (with this implementation) would be very-very slow. If there is a real user demand (and based on the comments under original PR there is) I agree we need to implement this feature, but not this way.

The questions are:

do our users want this feature? If yes, we should add it to both engine (and what about Coraza? @jptosso, @fzipi, @jcchavezs?)
what about the multiple targets in one exclusion? Should it work? If yes, with what syntax? (this is important to understand how the current code works and how can we improve the speed of regex targets)

dune73 commented 3 weeks ago

Honestly, I think the speed problem is around the whole concept of runtime rule exclusions and how it is implemented.

You have a target list and then you remove individual targets from that list based on the ignore-list at runtime. The way it is done is very slow, very often slower than executing the rule itself. That effectively means that disabling rules via rule exclusions for speed gain is not possible. (And that use case is very real: If I assure that an ARGS follows a positive definition, I would like to improve the performance by removing this ARGS from further evaluation, but this approach actually degrades performance for simple ARGS. It's faster to run the rules ...)

And I remember looking at the execution in detail and I think either ModSec v2 or v3 would actually run the rule and remove the result if there was a rule exclusion instead of skipping the execution. But I am no longer 100% sure this was really the case.

But back to this problem here: The situation that pops up regularly is random cookie names and also random or enumerated argument names, thus targets where you do not know the name in advance. For the cookies, there is no alternative to disabling cookie inspection for a rule completely because the (Drupal!) session cookie has a random name.

I think the regex definition of runtime rule exclusions is needed, but the speed should be improved.

airween commented 3 weeks ago

Honestly, I think the speed problem is around the whole concept of runtime rule exclusions and how it is implemented.

yes.

You have a target list and then you remove individual targets from that list based on the ignore-list at runtime.

First I want to understand why the exclusions handling code wants to handle multiple targets, but the parser does not allow it. So it would be very good to see a valid syntax, there the exclusion has multiple targets and parser eats it.

marcstern commented 3 weeks ago

ctl:'ruleRemoveTargetById=942100;ARGS:/^password[\d+]$/,ARGS:abc'"

marcstern commented 3 weeks ago

I think having a multiple targets is probably (a little bit) quicker than specifying multiple ctl actions.

The main problem is that a lot of parsing is done at run-time and especially the regex compiling. Pre-parsing the exceptions at config time and storing each of them in a table (with their pre-compiled regex) would probably enhance the perf a lot. Then, specifying multiple targets would be useless.

Nota that, for one of the most common cases, excluding a target from all rules (like the Drupal session cookie), there's a much better solution: allowing all collections to be writeable. That way, we can remove it completely. But that's another story (that will come back, I promise).

airween commented 3 weeks ago

ctl:'ruleRemoveTargetById=942100;ARGS:/^password[\d+]$/,ARGS:abc'"

AH00526: Syntax error on line 22 of ....conf:
Error parsing actions: Invalid quoted pair at position 47: ctl:'ruleRemoveTargetById=942100;ARGS:/^password[\\d+]$/,ARGS:abc'"

dune73 commented 3 weeks ago

I agree with @marcstern completely.

And you probably need to give your parser some rest @airween. It's clearly having a bad day.

airween commented 3 weeks ago

Pre-parsing the exceptions at config time and storing each of them in a table (with their pre-compiled regex) would probably enhance the perf a lot. Then, specifying multiple targets would be useless.

This is what I want to solve, but I don't see how multiple targets handling works. I mean I see the code, but I can't write a rule which will be allowed by the config parser.

marcstern commented 3 weeks ago

You're right @airween, the syntax doesn't work ... because of a bug I fixed in my test version. https://github.com/owasp-modsecurity/ModSecurity/issues/2927 is aimed at fixing the processing to fully support regex in actions. Currently, backslashes are posing problems.

If you don't use a \, it should work: Use SecAction "phase:1,ctl:'ruleRemoveTargetById=942100;ARGS:/^password[0-9+]$/,ARGS:abc'"

airween commented 3 weeks ago

You're right @airween, the syntax doesn't work ... because of a bug I fixed in my test version.

oh, it's good to know.

2927 is aimed at fixing the processing to fully support regex in actions. Currently, backslashes are posing problems.

So before we continue the work, we should merge this PR?

If you don't use a , it should work: Use SecAction "phase:1,ctl:'ruleRemoveTargetById=942100;ARGS:/^password[0-9+]$/,ARGS:abc'"

I tried without regex format, but it still don't work:

    ...
    ctl:'ruleRemoveTargetById=2003;ARGS:cba,ARGS:abc'"

the result:

AH00526: Syntax error on line 22 of testconf.conf:
Error parsing actions: Unknown action: "

The config:

$ nl -ba testconf.conf | grep 22
    22  ctl:'ruleRemoveTargetById=2003;ARGS:cba,ARGS:abc'"

marcstern commented 3 weeks ago

2927 is aimed at fixing the processing to fully support regex in actions. Currently, backslashes are posing problems.

So before we continue the work, we should merge this PR? Definitely

owasp-modsecurity / ModSecurity