Remove MIME Attribute from application/soap+xml Rule 900220

rsbrisci commented 4 years ago

The current logic only validates Mime Type (application) and Sub-Type (soap).

The Mime Attribute in application/soap+xml prevents the following Content-Types from matching: application/soap application/soap+xml

Whereas the value application/soap will allow both: application/soap application/soap+xml

franbuehler commented 4 years ago

Thanks for this PR @rsbrisci. It is a different approach than I had prepared in my PR. But my PR with the operator @within led to a failed regression test, a new false negative. This means that I would have to add delimiters as described at https://github.com/SpiderLabs/ModSecurity/wiki/Reference-Manual-(v2.x)#within.

So your approach in this PR is easier. But it's also less strict! But ok for me. Other opinions welcome.

You only changed the variable tx.allowed_request_content_type in the commented out part in crs-setup.conf.example. This has not effect.

Could you please also change it in the https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/v3.3/dev/rules/REQUEST-901-INITIALIZATION.conf#L171 where the variable is actually set.

Please also update the default comment in crs-setup.conf.example: https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/v3.3/dev/crs-setup.conf.example#L392

And could you please also add a regression test for the Content-Type: application/soap+xml; charset=UTF-8. Add the following code snippet to https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/v3.3/dev/tests/regression/tests/REQUEST-920-PROTOCOL-ENFORCEMENT/920420.yaml:

    -
      test_title: 920420-10
      stages:
        -
          stage:
            input:
              dest_addr: "127.0.0.1"
              port: 80
              method: "OPTIONS"
              headers:
                  User-Agent: "ModSecurity CRS 3 Tests"
                  Host: "localhost"
                  Content-Type: "application/soap+xml; charset=UTF-8"
              data: "test"
            output:
              no_log_contains: "id \"920420\""

fgsch commented 4 years ago

Personally I think we should be using strict matching here, relaxing a rule that is meant to restrict what Content-Types are allowed doesn't ring right to me. If you indeed want to allow both, why not simply include application/soap+xml and application/soap (in that order)?

Also, I just realised that rule 920420 should be using parenthesis around the pattern, otherwise things such as multipart/form-data-something will be allowed as well.

franbuehler commented 4 years ago

Yeah, I was also thinking of that!

The problem we are trying to resolve is to get the + working for Apache and nginx. So rsbrisci just removed that part.

But that means that I will pursue my PR with the @\within, solve my false negative and push my PR asap. After that we can close this PR.

fgsch commented 4 years ago

I think that'd be best but may I suggest using @\pm instead? From reading the ModSecurity code for v2 and v3, and the docs, it should support | as delimiter, making it backward compatible.

jeremyjpj0916 commented 4 years ago

Also, I just realised that rule 920420 should be using parenthesis around the pattern, otherwise things such as multipart/form-data-something will be allowed as well.

Yeah without this change I believe my colleagues PR would make the most sense because as the rule stands now you can add -whatever or +whatever to any of these common Content-Types.

If going more strict about it, then application/soap+xml makes sense.

Word of warning about being strict though, think about all those attributes: https://www.iana.org/assignments/media-types/media-types.xhtml ,

also what about the case of charsets set in the Content-Type ? aka application/json; charset=utf-8 , going strict will cause this rule to break many existing apps setting charsets eh?

rsbrisci commented 4 years ago

@fgsch @franbuehler wow! I love how active your community is.

I kinda figured you'd want to go with strict matching - and I agree, that's probably a better approach. My colleague @jeremyjpj0916 called it yesterday, but I thought "ahh heck, might as well try".

This is especially true considering I don't think the Type/Subtype application/soap is actually valid without the attribute +xml.

rsbrisci commented 4 years ago

I could be wrong @jeremyjpj0916 - but I think HTTP1.1 treats charsets a lot like ports. If not specified, the default is used ISO-8859-1. Should it really be part of MIME validation?

edit: And/also isn't there a separate rule which already validates charsets? I think I've helped a customer troubleshoot that already :P

jeremyjpj0916 commented 4 years ago

Ahh I bet under the hood when Content-Type header is getting parsed out, for this rule its taking everything before the first ;(if present) so splitting the charset out when needed, because there are separate rules validating charset= in CRS as well, so likely this rule doesn't just blind read the whole Content-Type header and validate against that so going strict matching should not be a problem if thats the case.

rsbrisci commented 4 years ago

Pretty sure that's how it works - the experts can confirm.

Bringing us back to the topic at hand, I'm not sure this PR is the right path forward. Strict validation seems to be the way to go

This is especially true considering I don't think the Type/Subtype application/soap is actually valid without the attribute +xml.

I'd be happy to try my hand at a strict validation PR if you give me some hints to get started @fgsch @franbuehler , though my capacity to test on apache will be limited.

fgsch commented 4 years ago

I think that'd be best but may I suggest using @\pm instead? [..]

Actually forget about this. @\pm will do partial matching.

fgsch commented 4 years ago

Charset is handled in another rule, correct.

As for working on this, as she wrote above @franbuehler is working on a PR so you might want to coordinate with her.

franbuehler commented 4 years ago

I changed the var tx.allowed_request_content_type so that it uses delimiters https://github.com/franbuehler/owasp-modsecurity-crs/blob/900200-soap-xml/rules/REQUEST-901-INITIALIZATION.conf#L171. AND I changed the rule 920420 so that it uses the @within operator and the defined delimiters: https://github.com/franbuehler/owasp-modsecurity-crs/blob/900200-soap-xml/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf#L967

This works very well for Apache. The regression tests work, even my new regression tests that I added. But again, it does not work for NGINX. "Doesn't work" means, the rules are loaded, no error, but the requests are not blocked. No bad content-types are blocked.

I'm a bit lost and I'm losing patience... I will return to this PR/problem later. Maybe someone else has a suggestion/idea? Or we just shorten the value to application/soap like this PR suggests...?

As a reminder: The problem is, that the value application/soap+xml in the variable tx.allowed_request_content_type is correctly checked with the operator @rx in the rule https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/v3.3/dev/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf#L968 only for Apache, not for NGINX.

Problem description from slack channel:

Did you ever stumble over this FP on NGINX only??

curl -vH "Content-Type: application/soap+xml" -d @payload localhost

2020/03/09 09:54:02 [info] 15590#15590: *1 ModSecurity: Warning. Matched "OperatorRx' with parameter ^application/x-www-form-urlencoded|multipart/form-data|text/xml|application/xml|application/soap+xml|application/x-amf|application/json|application/octet-stream|application/csp-report|application/xss- (26 characters omitted)' against variableTX:0' (Value: application/soap+xml' ) [file "/.../rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf"] [line "894"] [id "920420"] [rev ""] [msg "Request content type is not allowed by policy"] [data "application/soap+xml"] [severity "2"] [ver "OWASP_CRS/3.2.0"] [maturity "0"] [accuracy "0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-protocol"] [tag "OWASP_CRS"] [tag "OWASP_CRS/POLICY/CONTENT_TYPE_NOT_ALLOWED"] [tag "WASCTC/WASC-20"] [tag "OWASP_TOP_10/A1"] [tag "OWASP_AppSensor/EE2"] [tag "PCI/12.1"] [hostname "127.0.0.1"] [uri "/"] [unique_id "158374764270.974560"] [ref "o0,20v82,20"], client: 127.0.0.1, server: localhost, request: "POST / HTTP/1.1", host: "localhost"

The variable in rule 901162 has to be ...|application/soap\+xml|... instead of application/soap+xml.For NGINX I have to escape the + sign. For Apache this escape throws an error during Apache startup.This is a strange behaviour / difference of/between Apache/nginx. Is this a known problem??

airween commented 4 years ago

As a reminder: The problem is, that the value application/soap+xml in the variable tx.allowed_request_content_type is correctly checked with the operator @rx in the rule https://github.com/SpiderLabs/owasp-modsecurity-crs/blob/v3.3/dev/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf#L968 only for Apache, not for NGINX.

I can confirm partially this bug - the only note is that it's not Nginx issue, but the v3. I've made a regression test for v3, and when I changed the application/soap+xml by application/soap\+xml (double escaped in test file for JSON parser), it works with your test cases.

In your solution you try to match the transaction variable with regex:

SecRule TX:/^CONTENT_TYPE_/

but this is a known bug in v3: the regex's are case sensitive everywhere, including transaction variables. (As the documentation says these are case insensitive.) You can use the form with lowercase:

SecRule TX:/^content_type_/

(IMHO this is the preferred), or put a modifier:

SecRule TX:/(?i)^CONTENT_TYPE_/

See the issue #1741 and PR #1745. ModSecurity 3 also has a PR (2297 at there) which fixes this bug.

Hope this helps.

airween commented 4 years ago

Note, that I checked your modifications with ftwrunner, but changed the TX variable to lowercase. Looks like all test passed :).

$ ./ftwrunner -c ftwrunner_franbuehler.yaml -r 920420   
920420-1: PASSED
920420-2: PASSED
920420-3: PASSED
920420-4: PASSED
920420-5: PASSED
920420-6: PASSED
920420-7: PASSED
920420-8: PASSED
920420-9: PASSED
920420-10: PASSED
920420-11: PASSED

SUMMARY:
===============================
PASSED:                   11
FAILED:                   0
FAILED (whitelisted):     0
SKIPPED:                  0
===============================
TOTAL:                    11
===============================

franbuehler commented 4 years ago

Wow, thanks a loooot, @airween !!!! I'm so glad you checked my changes and that you found a solution! I'll change the var to lowercase and propose a PR then! Thank you again!!

airween commented 4 years ago

Just one more note: is it necessary to add a concatenated TX variable, and match it as regex? I mean this solution is more clear, and I think it works as well too:

diff --git a/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf b/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf
index 5098498..e7b9d81 100644
--- a/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf
+++ b/rules/REQUEST-920-PROTOCOL-ENFORCEMENT.conf
@@ -964,9 +964,9 @@ SecRule REQUEST_HEADERS:Content-Type "@rx ^[^;\s]+" \
     tag:'PCI/12.1',\
     ver:'OWASP_CRS/3.2.0',\
     severity:'CRITICAL',\
-    setvar:'tx.content_type_%{tx.0}=|%{tx.0}|',\
+    setvar:'tx.content_type=|%{tx.0}|',\
     chain"
-    SecRule TX:/^CONTENT_TYPE_/ "!@within %{tx.allowed_request_content_type}" \
+    SecRule TX:content_type "!@within %{tx.allowed_request_content_type}" \
         "setvar:'tx.anomaly_score_pl1=+%{tx.critical_anomaly_score}'"

(I'm still on your branch, the diff comes from there.)

Also a small note, but I think you have to change the tx.allowed_request_content_type concatenation (add the extra space) in the exclude rules:

diff --git a/rules/REQUEST-903.9003-NEXTCLOUD-EXCLUSION-RULES.conf b/rules/REQUEST-903.9003-NEXTCLOUD-EXCLUSION-RULES.conf
index e1dca51..01b2a3e 100644
--- a/rules/REQUEST-903.9003-NEXTCLOUD-EXCLUSION-RULES.conf
+++ b/rules/REQUEST-903.9003-NEXTCLOUD-EXCLUSION-RULES.conf
@@ -98,7 +98,7 @@ SecRule REQUEST_FILENAME "@contains /remote.php/dav/files/" \
     pass,\
     t:none,\
     nolog,\
-    setvar:'tx.allowed_request_content_type=%{tx.allowed_request_content_type}|text/vcard'"
+    setvar:'tx.allowed_request_content_type=%{tx.allowed_request_content_type} |text/vcard'"

 # Allow the data type 'application/octet-stream'

@@ -110,7 +110,7 @@ SecRule REQUEST_METHOD "@rx ^(?:PUT|MOVE)$" \
     nolog,\
     chain"
     SecRule REQUEST_FILENAME "@rx /remote\.php/dav/(?:files|uploads)/" \
-        "setvar:'tx.allowed_request_content_type=%{tx.allowed_request_content_type}|application/octet-stream'"
+        "setvar:'tx.allowed_request_content_type=%{tx.allowed_request_content_type} |application/octet-stream'"

 # Allow data types like video/mp4

@@ -260,7 +260,7 @@ SecRule REQUEST_FILENAME "@contains /remote.php/dav/addressbooks/" \
     pass,\
     t:none,\
     nolog,\
-    setvar:'tx.allowed_request_content_type=%{tx.allowed_request_content_type}|text/vcard'"
+    setvar:'tx.allowed_request_content_type=%{tx.allowed_request_content_type} |text/vcard'"

 # [ Calendar ]
@@ -273,7 +273,7 @@ SecRule REQUEST_FILENAME "@contains /remote.php/dav/calendars/" \
     pass,\
     t:none,\
     nolog,\
-    setvar:'tx.allowed_request_content_type=%{tx.allowed_request_content_type}|text/calendar'"
+    setvar:'tx.allowed_request_content_type=%{tx.allowed_request_content_type} |text/calendar'"

 # [ Notes ]

What do you think?

franbuehler commented 4 years ago

Oh, yes, absolutely! You're right. I will change that!

franbuehler commented 4 years ago

I will close this PR because I opened a new one: #1748. Thanks for this PR @rsbrisci.

SpiderLabs / owasp-modsecurity-crs

Remove MIME Attribute from application/soap+xml Rule 900220 #1717