Problems with blocking if response is using compression

azurit commented 3 years ago

Describe the bug

When blocking a request that uses output compression (for example Content-Encoding: gzip) in phase 4 (so the response headers have already been created by the application), blocking is NOT done properly: Content-Encoding: gzip has already been sent to the client, so it expects a compressed response. On the client side, browsers show an error similar to this one (from Firefox): Content Encoding Error

To Reproduce

Problem can be easily reproduced using this PHP script and blocking rule:

<?php
@ob_start("ob_gzHandler");
header('X-test-blocking: block-me');
echo 'test';
?>

SecRule &RESPONSE_HEADERS:X-test-blocking "!@eq 0" \
    "id:99999999999,\
    phase:4,\
    deny"

Expected behavior

ModSecurity should remove or properly set the Content-Encoding header if the request is blocked.
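
The symptom can be illustrated outside a browser. This is a minimal Python sketch, assuming a client that simply tries to gunzip whatever body arrives when it sees Content-Encoding: gzip. The `decode_body` helper is hypothetical and is not part of any browser or of ModSecurity.

```python
import gzip


def decode_body(headers: dict, body: bytes) -> bytes:
    """Decode a response body the way a client that trusts Content-Encoding would."""
    if headers.get("Content-Encoding", "").lower() == "gzip":
        # Raises an OSError subclass when the body is not gzip data.
        return gzip.decompress(body)
    return body


# The application announced gzip, but the body was replaced with a plain error page.
blocked_headers = {"Content-Encoding": "gzip"}
blocked_body = b"<h1>Forbidden</h1>"

try:
    decode_body(blocked_headers, blocked_body)
    decode_failed = False
except OSError:
    # This is the situation a browser reports as a content encoding error.
    decode_failed = True
```

The same helper decodes a genuinely gzipped body without error, which is why the unblocked response works and only the blocked one fails.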

zimmerle commented 3 years ago

Hi @azurit,

Indeed, once the response headers hit the client, there is not much we can do to change the parameters that were already sent. The best thing to do in that scenario is to block the request abruptly, as opposed to respecting the disruptive action specified in the rule.

At a given phase N, there is no way to know that a rule will match at phase N+1 and lead to a block. As phases occur independently and in sequence, there isn't much we can do. Holding the response headers until the response body has been processed is often undesirable: it significantly increases latency and reduces the server's capacity in terms of requests per second.

Usually, the outcome of a body inspection that "matches" is that the response is not allowed to reach the client; in that sense, it is workable. What are you aiming to accomplish?
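
The timing problem described in this comment can be shown with a toy model. The sketch below is not ModSecurity code: `run_transaction`, its `hold_headers` flag and the `rule` callback are hypothetical names, used only to show why a phase 4 match arrives too late when headers are streamed.

```python
from typing import Callable, List


def run_transaction(headers: List[str], body: bytes,
                    rule: Callable[[bytes], bool], hold_headers: bool) -> List[str]:
    """Return the sequence of events a client would observe."""
    wire: List[str] = []
    if not hold_headers:
        # Headers leave the server before the body is inspected.
        wire.append("200 OK " + ", ".join(headers))
    if rule(body):
        if hold_headers:
            # Nothing was sent yet, so a clean error response is still possible.
            wire.append("403 Forbidden")
        else:
            # Headers are already on the wire; the only option left is to abort.
            wire.append("connection aborted")
        return wire
    if hold_headers:
        wire.append("200 OK " + ", ".join(headers))
    wire.append("body (%d bytes)" % len(body))
    return wire


def matches_marker(body: bytes) -> bool:
    """Stand-in for a phase 4 rule that inspects the response body."""
    return b"webshell" in body


streamed = run_transaction(["Content-Encoding: gzip"], b"webshell", matches_marker, hold_headers=False)
buffered = run_transaction(["Content-Encoding: gzip"], b"webshell", matches_marker, hold_headers=True)
```

In the buffered case the client sees a single clean 403, at the price of holding every response until its body has been inspected, which is the latency cost mentioned above.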

azurit commented 3 years ago

Hi @zimmerle,

Are headers already sent in phase 4? I thought they were still only buffered somewhere.

I'm trying to block web shells, which are identified by searching for patterns in the response body (see https://github.com/coreruleset/coreruleset/pull/1962). While implementing this feature I came across a few web shells that compress the response body on purpose to hide themselves - and it's working :( I can't catch and block them. So I implemented a solution to prevent this bypass (see https://github.com/coreruleset/coreruleset/pull/1968), which works too, except for the ugly error message displayed by the web browser (Content Encoding Error).

zimmerle commented 3 years ago

Is it running on nginx or Apache? Which version of ModSecurity are you targeting? And what kind of deployment do you have?

azurit commented 3 years ago

Apache, modsec 2.9.3 (but both mentioned PRs should work also in version 3). What do you mean by 'deployment'?

Anyway, if the headers have already been sent, we cannot change them. What about checking whether the client is expecting compression (a Content-Encoding header was sent with gzip or a similar value) and, if so, compressing our response?

zimmerle commented 3 years ago

The behavior of sending the response headers before inspecting the response body may vary from server to server. The deployment is how ModSecurity is running in your environment (e.g., as a proxy on an independent server, or embedded in the application's web server).

The thing to consider is what type of disruptive action you want to take that will lead to the response body's delivery or response body modification. Those are the disruptive actions that we can have -

action: deny (together with status)

It needs to send the response headers with the correct status code (403?). Will generate the error on the web browser.

action: redirect

It needs to send 302 on the response headers. Will generate the error on the web browser.

action: allow

Not applicable in the context

action: drop

Drop and consequently will generate the error that you have mentioned on the web browser.

action: pass

Not applicable in the context

Under what conditions do you foresee a body modification that demands re-compression of the response body content?

As for the suggestion for decompressing the body for inspection, it sounds promising to me. I am wondering if we could have a transformation t:unzip. What do you think?

azurit commented 3 years ago

It's embedded.

I'm using 'deny' action.

> Under what conditions do you foresee a body modification that demands re-compression of the response body content?

If headers were already sent to client (i.e. we are in phase 4 using embedded modsecurity) AND they included header Content-Encoding with value gzip, the client is expecting gzipped response. If we decide to deny this request and do NOT send original response body, we need to compress our new response body so client is not confused.
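
The idea in this comment can be sketched in Python. It is an illustration under stated assumptions, not something ModSecurity does: `encode_error_body` is a hypothetical helper, and only the gzip content coding is handled.

```python
import gzip


def encode_error_body(sent_headers: dict, error_body: bytes) -> bytes:
    """Encode a replacement body so it matches headers the client already received."""
    encoding = sent_headers.get("Content-Encoding", "").strip().lower()
    if encoding == "gzip":
        # The client was promised gzip, so the substitute body must be gzip too.
        return gzip.compress(error_body)
    # No (or an unhandled) content coding: send the body unchanged.
    return error_body


page = b"<h1>Forbidden</h1>"
wire_body = encode_error_body({"Content-Encoding": "gzip"}, page)
```

If a Content-Length header had also been sent already, the substitute body would have to match it as well, which this sketch does not attempt.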

> As for the suggestion for decompressing the body for inspection, it sounds promising to me. I am wondering if we could have a transformation t:unzip. What do you think?

This would be sooo cool! I'm really missing this feature. As you can see in PR https://github.com/coreruleset/coreruleset/pull/1968, I implemented something like t:unzip using a Lua script, but a transformation function will probably be much faster.

zimmerle commented 3 years ago

> If headers were already sent to client (i.e. we are in phase 4 using embedded modsecurity) AND they included header Content-Encoding with value gzip, the client is expecting gzipped response. If we decide to deny this request and do NOT send original response body, we need to compress our new response body so client is not confused.

Once the request has been disruptively blocked by a deny action, there is no content to be delivered to the user, who may be a malicious actor in that case.

> This would be sooo cool! I'm really missing this feature. As you can see in PR coreruleset/coreruleset#1968, i implemented something like t:unzip using Lua script but a transformation function will probably be much faster.

Indeed, it should be faster and more generic (e.g., available to other variables). One of the ideas behind supporting Lua is to encourage prototyping; it fits the implementation that you have made :+1:

Looking at your code https://github.com/coreruleset/coreruleset/pull/1968/commits/bfd4d490837c6eccf287a401ff0a6c23c311c5cc#diff-ec25e9162441fd2a294691ac0f3fff833c80e893f436feebd1ea86df2b19ac5d

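require("m")
require("zlib")

function main()
    -- zlib.inflate() returns a function that decompresses the data passed to it
    local f = zlib.inflate()
    local response_body_uncompressed = f(m.getvar("RESPONSE_BODY", "none"))
    -- store the inflated body in a TX variable so that rules can inspect it
    m.setvar("tx.response_body_uncompressed", response_body_uncompressed)
    return nil
end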

The C++ variant shouldn't be that different. The annoying part is working out how to add the zlib dependency to ModSecurity. Once that is done, the transformation should be easy to implement, something like the example below (not tested, just a draft) -

void GZipDecompress::execute(const Transaction *t, const ModSecString &in, ModSecString &out) noexcept {
    zlibcomplete::GZipDecompressor decompressor;
    std::string ret = decompressor.decompress(std::string(in.c_str(), in.size()));
    out.assign(ret.c_str(), ret.size());
}

Here is an example of the transformation t:phpArgsNames added by @marshal09 a few weeks ago - https://github.com/SpiderLabs/ModSecurity/pull/2387/commits/bab97237db2bee7362d6e3fa2cb60bb92af491fa . This example has no external dependency, but it gives you an idea of what needs to be done.

Up for the challenge? :) :rocket: :rocket:
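
For readers who want to experiment with the decompress-then-inspect idea outside ModSecurity, here is a minimal Python sketch. It is neither the Lua script from the PR nor a real t:unzip transformation; `inflate_for_inspection`, `body_matches` and the 1 MiB cap are assumptions made for illustration.

```python
import gzip
import re
import zlib

MAX_INFLATED = 1024 * 1024  # assumed cap so a small body cannot expand without limit


def inflate_for_inspection(body: bytes, limit: int = MAX_INFLATED) -> bytes:
    """Return the decompressed body, or the original bytes if it is not compressed."""
    # MAX_WBITS | 32 tells zlib to auto-detect a zlib or gzip header.
    inflater = zlib.decompressobj(zlib.MAX_WBITS | 32)
    try:
        # The second argument bounds the number of bytes returned.
        return inflater.decompress(body, limit)
    except zlib.error:
        # Not compressed (or corrupt): inspect the bytes as they are.
        return body


def body_matches(body: bytes, pattern: bytes) -> bool:
    """Run a pattern against the body after the optional inflate step."""
    return re.search(pattern, inflate_for_inspection(body)) is not None


# A response that hides a marker string behind gzip compression.
hidden = gzip.compress(b"<title>webshell-marker</title>")
found = body_matches(hidden, rb"webshell-marker")
```

Falling back to the raw bytes on a decompression error means uncompressed responses take the same inspection path with no special handling.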

azurit commented 3 years ago

> Once the request has been disruptively blocked by a deny action, there is no content to be delivered to the user, who may be a malicious actor in that case.

This isn't always true: the HTTP specification does not disallow content with status 403 (which is the ModSecurity default/recommended status for the deny action). Moreover, a default Apache installation also sends content:

HTTP/1.1 403 Forbidden
Date: Thu, 07 Jan 2021 15:21:13 GMT
Server: Apache
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>403 Forbidden</title>
</head><body>
<h1>Forbidden</h1>
<p>You don't have permission to access this resource.</p>
</body></html>

This is why browsers show the Content Encoding Error: the original content was compressed and was replaced by uncompressed content.

zimmerle commented 3 years ago

I may be misreading the circumstances. From what you have described, the request was blocked after the headers had been sent to the client. In this scenario, the browser receives a 200 status code with headers signaling that the response body will be compressed, but no response body at all, which leads to the browser error. That is the expected scenario.

In this last example, you are showing a request for which the status code was 403.

Can you share a tcpdump containing the entire HTTP transaction that leads your browser to the error?

azurit commented 3 years ago

Just run the PHP script with the modsec rule from my original message above; the problem is 100% reproducible.

EDIT: Tested with Apache 2.4.

azurit commented 3 years ago

Here is a response from the server that was modified (blocked) by ModSecurity, captured with curl. As you can see, the content is clear text, not gzipped, but the response contains the header Content-Encoding: gzip (because the original content, removed by modsec, was gzipped):

HTTP/1.1 403 Forbidden
Date: Thu, 07 Jan 2021 17:00:59 GMT
Server: Apache
X-test-blocking: block-me
Content-Encoding: gzip
Vary: Accept-Encoding
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>403 Forbidden</title>
</head><body>
<h1>Forbidden</h1>
<p>You don't have permission to access this resource.</p>
</body></html>

This one was also captured with curl but was NOT blocked by ModSecurity, so the original gzipped content was sent. As you can see, the content really is binary, so curl complains; this is just an example of how curl shows gzipped content, and proof that the content of the response above was not gzipped:

HTTP/1.1 200 OK
Date: Thu, 07 Jan 2021 17:01:26 GMT
Server: Apache
Content-Encoding: gzip
Vary: Accept-Encoding
Content-Length: 24
Content-Type: text/html; charset=UTF-8

Warning: Binary output can mess up your terminal. Use "--output -" to tell
Warning: curl to output it to your terminal anyway, or consider "--output
Warning: <FILE>" to save to a file.
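
The mismatch in the first capture can be checked mechanically. The Python sketch below is a hypothetical helper (`encoding_mismatch` is not part of curl or ModSecurity) and assumes the body has already been de-chunked, as in the curl output above; it only looks for the two gzip magic bytes.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def encoding_mismatch(raw_response: bytes) -> bool:
    """True if the headers announce gzip but the body does not start with the gzip magic."""
    head, _, body = raw_response.partition(b"\r\n\r\n")
    announced = False
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-encoding" and value.strip().lower() == b"gzip":
            announced = True
    return announced and not body.startswith(GZIP_MAGIC)


# Shortened versions of the two captures: a blocked response and an unblocked one.
blocked = (b"HTTP/1.1 403 Forbidden\r\nContent-Encoding: gzip\r\n"
           b"Content-Type: text/html; charset=iso-8859-1\r\n\r\n<!DOCTYPE HTML PUBLIC>")
passed = b"HTTP/1.1 200 OK\r\nContent-Encoding: gzip\r\n\r\n" + gzip.compress(b"test")
```

Calling `encoding_mismatch(blocked)` flags the 403 capture, while the 200 capture passes because its body really is gzip data.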

azurit commented 3 years ago

Am I being clearer?

zimmerle commented 3 years ago

> Am I being clearer?

The reason I asked for the tcpdump is that I want to understand what is happening in your environment. The results of my tests show a different scenario. Using the PHP script that you posted, this is the outcome I get -

$ curl -sH 'Accept-encoding: gzip' -L -v -s -o /dev/null  http://localhost/azurit.php\?args\=not
*   Trying ::1:80...
* Connected to localhost (::1) port 80 (#0)
> GET /azurit.php?args=not HTTP/1.1
> Host: localhost
> User-Agent: curl/7.74.0
> Accept: */*
> Accept-encoding: gzip
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 403 Forbidden
< Date: Thu, 07 Jan 2021 17:24:01 GMT
< Server: Apache/2.4.46 (Unix) PHP/7.4.13
< Vary: accept-language,accept-charset
< Accept-Ranges: bytes
< Transfer-Encoding: chunked
< Content-Type: text/html; charset=utf-8
< Content-Language: en
< 
{ [625 bytes data]
* Connection #0 to host localhost left intact

As for ModSecurity debug log -

[/azurit.php][1] Access denied with code 403 (phase 4). Pattern match "not" at ARGS:args. [file "/etc/httpd/conf/httpd.conf"] [line "574"] [id "2"]

error log -

[client ::1] ModSecurity: Access denied with code 403 (phase 4). Pattern match "not" at ARGS:args. [file "/etc/httpd/conf/httpd.conf"] [line "574"] [id "2"] [hostname "localhost"] [uri "/azurit.php"] [unique_id "X-dB90R-0vy9fpawrycoPQAAAAA"]

Without the rule interference -

$ curl -sH 'Accept-encoding: gzip' -L -v -s -o /dev/null  http://localhost/azurit.php\?args\=yes
*   Trying ::1:80...
* Connected to localhost (::1) port 80 (#0)
> GET /azurit.php?args=yes HTTP/1.1
> Host: localhost
> User-Agent: curl/7.74.0
> Accept: */*
> Accept-encoding: gzip
> 
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Date: Thu, 07 Jan 2021 17:24:57 GMT
< Server: Apache/2.4.46 (Unix) PHP/7.4.13
< X-Powered-By: PHP/7.4.13
< X-test-blocking: block-me
< Content-Encoding: gzip
< Vary: Accept-Encoding
< Content-Length: 24
< Content-Type: text/html; charset=UTF-8
< 
{ [24 bytes data]
* Connection #0 to host localhost left intact

In the case of the PHP script, it seems that the headers are being held until the end of the response inspection.

azurit commented 3 years ago

Please can you recheck that you are doing 'deny' in phase 4? This is important (i see you changed my blocking rule a little so i just want to be sure, thank you).

zimmerle commented 3 years ago

> you changed my blocking rule a little so i just want to be sure, thank you).

It is possible to double check that on the error_log and debug_log - https://github.com/SpiderLabs/ModSecurity/issues/2494#issuecomment-756262456

azurit commented 3 years ago

Sorry, didn't notice that. Hm. In my Apache installation i have mod_deflate enabled - the only thing which i can think of that could be related.

azurit commented 3 years ago

Another difference between my installation and yours is the way PHP is run - you seem to be using mod_php, but I'm running it as FastCGI (PHP-FPM).

azurit commented 3 years ago

That's it! After switching to mod_php, problem is gone.

azurit commented 3 years ago

I'm digging deeper into it to find out why FastCGI is making such a mess. Running PHP using PHP-FPM is much better and has lots of advantages; in fact, we have completely moved from mod_php to FPM. I'm trying various Apache configurations to see if there's any difference (but no luck yet).

azurit commented 3 years ago

My findings (based on observing and strace-ing the php-fpm/apache processes):

When using PHP-FPM, if the request is blocked by ModSecurity, Apache keeps all headers set by PHP without touching them.
When using mod_php, if the request is blocked by ModSecurity, Apache resets the whole response and sets everything, including headers, from scratch (so I cannot even see my 'X-test-blocking' header, which was used for blocking).

What do you think about this? Is this problem related to modsecurity or can it be resolved by modsecurity?
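
One way to picture the difference between the two cases is a step that drops the headers describing the original body before an error page is sent. The Python sketch below illustrates that idea only; it is not how Apache or ModSecurity build error responses. `headers_for_error_page` is hypothetical, and the set of headers to drop is an assumption.

```python
from typing import List, Tuple

# Assumed set of headers that describe the original body only.
BODY_HEADERS = {"content-encoding", "content-length", "content-type"}


def headers_for_error_page(original: List[Tuple[str, str]], error_body: bytes,
                           content_type: str = "text/html; charset=iso-8859-1"
                           ) -> List[Tuple[str, str]]:
    """Keep unrelated headers, drop body-specific ones, and describe the new body."""
    kept = [(name, value) for name, value in original
            if name.lower() not in BODY_HEADERS]
    kept.append(("Content-Type", content_type))
    kept.append(("Content-Length", str(len(error_body))))
    return kept


# Headers as set by the PHP script in the PHP-FPM case.
php_headers = [
    ("X-test-blocking", "block-me"),
    ("Content-Encoding", "gzip"),
    ("Vary", "Accept-Encoding"),
    ("Content-Type", "text/html; charset=UTF-8"),
]
error_page = b"<h1>Forbidden</h1>"
fixed = headers_for_error_page(php_headers, error_page)
```

With the stale Content-Encoding header removed, the plain-text error page no longer contradicts its own headers.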

zimmerle commented 3 years ago

Congrats! Nice finding. :1st_place_medal:

It seems that in the second case, Apache is using its error page, including the page headers, unlike the first case. It could be an issue in PHP-FPM, or the modules may be executing in the wrong order. That may explain why you have RESPONSE_BODY in gzip format.

You may want to re-order the mod_ here -

https://github.com/SpiderLabs/ModSecurity/blob/12cefbd70f2aab802e1bff53c50786f3b8b89359/apache2/mod_security2.c#L1649-L1687

zimmerle commented 3 years ago

@azurit any progress on this ?

azurit commented 3 years ago

@zimmerle sorry for the late response! I didn't have much time to look into this. I only tried a few changes in the ModSecurity source code, and none of them helped (but I was a little lost in it, as I'm not a C coder). Also, I'm not sure whether this is a modsec problem; maybe Apache itself causes it.

zimmerle commented 3 years ago

> @zimmerle sorry for the late response! I didn't have much time to look into this. I only tried a few changes in the ModSecurity source code, and none of them helped (but I was a little lost in it, as I'm not a C coder). Also, I'm not sure whether this is a modsec problem; maybe Apache itself causes it.

No worries. If you want to discuss something about the implementation, feel free to reach me via email. I have limited time, but I always respond.

All modules are orchestrated in a chain of events inside Apache. Each module, and Apache itself, makes assumptions that are sometimes undesirable for other modules. The investigation could be very time-consuming.

zimmerle commented 3 years ago

Linking this issue with #944.
