owasp-modsecurity / ModSecurity-nginx

ModSecurity v3 Nginx Connector
Apache License 2.0
1.58k stars 283 forks source link

ModSecurity False Positive with respect to logging in ModSec audit + NGX error logs due to NGINX behavior with error_page directive #182

Open jeremyjpj0916 opened 4 years ago

jeremyjpj0916 commented 4 years ago

Decided to pull out my comment to a separate issue since its unrelated to the other issue I raised originally:

I was trying to limit the relevant to 400's and 403's for capturing blocked transactions by ModSecurity:

SecAuditLogRelevantStatus "^(?:4((00)|(03)))"

Yet I see the audit log in RelevantOnly mode still also capturing 504 status messages as well for some reason.

SO this one is an interesting case. I am 100% sure on why 504's were making it into the audit logs even through its regex was set to 400's and 403's because I saw odd behavior in other apps too that hook into NGX with logged values not matching what the client really sent.

Firstly take a gander here(the app we run on nginx + openresty), leveraging the error_page directive: https://github.com/Kong/kong/blob/3.7.0/kong/templates/nginx_kong.lua#L94

Then take a gander here at how Kong or OpenResty can be patched for such situations: Kong/kong#5073 openresty/openresty#396

So what happens is when NGINX does an internal redirect via error_page, and the CTX of the client HTTP METHOD gets reset to GET always, so the NGINX ModSec audit/error logs are thinking REAL 504 timeout transactions are now a GET with a HTTP body(client sends req body) in the audit log logic(The Method the client actually used in the originating request was a POST), so it saves an blocked error log w 403 in the audit log file section H, for the transaction BUT the real transaction was a 504 and it was presented to the client as a 504, just internal logging and record keeping Modsecurity-nginx is not handling these scenarios where error_page directive gets used by folks leveraging the NGINX web server.

The fix could easily be if yall added a custom mod security http method variable to the tx context in the pre-access phase taking the original r->method to track the Method before the error_page directive internal redirect and read that custom mod sec http method variable after in the audit/error log phase logic after the redirect has occurred as the source of truth rather than the current ngx contexts r->method(which would be reset to always be GET at that time). Will save on the noisy invalid logs.

Ex of current behavior(Upstream times out in reverse proxy call, we hit error_page directive, http method gets reset to GET internally, triggering ModSec to log this tx as a block in error + audit log incorrectly when it was really just a harmless timeout):

2020/02/20 07:47:40 [error] 71#0: *65758 upstream timed out (110: Operation timed out) while connecting to upstream, client: 10.96.16.138, server: kong, request: "POST /api/memleak HTTP/1.1", upstream: "http://10.86.228.171:9393/ ", host: "gateway.company.com"

2020/02/20 07:47:40 [error] 71#0: *65758 [client 10.96.16.138] ModSecurity: Access denied with code 403 (phase 2). Matched "Operator Ge' with parameter5' against variable TX:ANOMALY_SCORE' (Value:5' ) [file "/usr/local/owasp-modsecurity-crs-3.2.0/rules/REQUEST-949-BLOCKING-EVALUATION.conf"] [line "79"] [id "949110"] [rev ""] [msg "Inbound Anomaly Score Exceeded (Total Score: 5)"] [data ""] [severity "2"] [ver ""] [maturity "0"] [accuracy "0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-generic"] [hostname "10.129.45.91"] [uri "/api/memleak"] [unique_id "158218486097.459274"] ...

ModSecurity-nginx: latest master branch

jeremyjpj0916 commented 4 years ago

Hmm actually looks like if we took this specific snippit:

https://github.com/SpiderLabs/ModSecurity-nginx/pull/165/files#diff-6fc03fd6024a7ec57f5c46d276d27548R44

Then even executing the log phase logic of modsec would get avoided and save the incorrect logging. Didn't realize we could just take r->error_page . Will give this a test!

jeremyjpj0916 commented 4 years ago

^ So following up on that test, seems to add that does indeed prevent the audit log from being written with the incorrect information. But ngx error output still receives the bad log saying we 403'ed but we really didn't since this was a tcp connection timeout on proxy and was sent to client as 504:

2020/02/20 17:20:04 [error] 72#0: *5222 [client 10.88.17.165] ModSecurity: Access denied with code 403 (phase 2). Matched "Operator Ge' with parameter5' against variable TX:ANOMALY_SCORE' (Value:5' ) [file "/usr/local/owasp-modsecurity-crs-3.2.0/rules/REQUEST-949-BLOCKING-EVALUATION.conf"] [line "79"] [id "949110"] [rev ""] [msg "Inbound Anomaly Score Exceeded (Total Score: 5)"] [data ""] [severity "2"] [ver ""] [maturity "0"] [accuracy "0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-generic"] [hostname "10.131.49.10"] [uri "/api/memleak"] [unique_id "158221920441.697197"] [ref ""], client: 10.88.17.165, server: kong, request: "POST /api/memleak HTTP/1.1", upstream: "http://10.86.228.171:9393/ ", host: "gateway.company.com"

Which must actually be storing this message in an earlier phase prior to the error_page directive because the request method: POST is logged correctly in the Mod Sec message. What phase/code is executing printing this ^ message? It HAS to be after the error_page directive has messed things up, so maybe the rewrite or header filter phase? Still digging and testing to confirm!

EDIT - Actually on re-evaluation doing this also prevents valid blocking tx's making it into the audit_log too. So can't use this technique alone as it just straight up prevents audit logs seemingly for us.

jeremyjpj0916 commented 4 years ago

Well honestly less to change if we just do it here, I see no major drawbacks, ModSecurity should really not intervene on error_page directive paths for internal redirects at all (http://nginx.org/en/docs/http/ngx_http_core_module.html#error_page) :

ngx_http_modsecurity_process_intervention (Transaction *transaction, ngx_http_request_t *r)
{
    char *log = NULL;
    ModSecurityIntervention intervention;
    intervention.status = 200;
    intervention.url = NULL;
    intervention.log = NULL;
    intervention.disruptive = 0;

    dd("processing intervention");

    /*
     Do not intervene on error_page directive
     because it resets HTTP Method to GET and 
     causes incorrect audit/error logging.
    */
    if (r->error_page) {
       dd("ignore error_page redirect tx");
       return 0;
    }

Edit - this does not work either...

jeremyjpj0916 commented 4 years ago

Ugh, and then the above code stops the invalid ngx error logging BUT it also prevents me from getting valid audit logs too. The std out still has the valid error logs though. Still digging for what can be done to get the best of both worlds here.

jeremyjpj0916 commented 4 years ago

Yeah trying to do hacky checks around short circuiting with the r->error_page check in various phases makes this difficult, I can't seem to satisfy all constraints(make ngx error logs and modsec audit logs correct and log for error_page directive situations). I am unsure how early modsec has captured the request headers or uri and if they are readily available in the src of this module, but the false positives could be totally remedied if we do this sudo code in the phase executions:

 /*
We know the http method and URI are actually altered in error_page internal redirects so lets 
grab them from from the modsec context of the stored known request during either the pre-access or rewrite phase. Overriding what the current error_page ctx wrongly has set to prior known saved values for this tx. 
*/

if(r->error_page) {
  r->method = ctx->modsec_transaction->request->method
  r->uri = ctx->modsec_transaction->request->uri
}

Reference: http://nginx.org/en/docs/http/ngx_http_core_module.html#error_page

@victorhora @defanator looking at past PR's you both have been playing with trying to make this Module more friendly with the error_pages directive behavior, maybe you know if the above sudo code is already somehow able to be referenced in a different format, if so I will just give it a test if you have any snippits/ideas to try and I could PR it if it works out after testing.

martinhsv commented 4 years ago

Hi @jeremyjpj0916 ,

Re:

... ModSecurity should really not intervene on error_page directive paths for internal redirects at all ...

One option is to simply turn off ModSecurity for the error page. E.g.

    error_page   403  /403.html;
    location = /403.html {
        modsecurity off;
        root   /usr/share/nginx/html;
    }
jeremyjpj0916 commented 4 years ago

@martinhsv That may work, but lets say that mod security itself is going to block a tx upon request and sets a 400/403 response and error_page is set to handle 400's as well. Will we then also just lose the tx error and or audit log entry for 400 response blocked transactions all together? I will run a quick test to confirm(I fear we lose either the audit and/or error log record) but it is a good idea I hadn't thought of.

jeremyjpj0916 commented 4 years ago

@martinhsv Took a nap but got around to testing your idea there. Yep the behavior seen when attempting such a suggestion is the error log does still work on a blocked HTTPStatus 400 tx, but audit logs on the transaction are missing(which is vital for our debugging purposes in our use case), which means that audit logging is taking place in the nginx log phase running after the error_page directive internal redirection has occurred.

My test to prove it out was a simple invalid JSON payload that would not parse properly.

martinhsv commented 4 years ago

Hi @jeremyjpj0916 ,

Ah, yes, I guess you're probably hitting https://github.com/SpiderLabs/ModSecurity/issues/2237 .

A couple of other options that might be useful as workarounds leave modsecurity on in the error location, but remove the rules causing the false positive (I tried these with the most recent ModSecurity-nginx in github)

    error_page   403  /403.html;
    location = /403.html {
        modsecurity_rules '
        SecRuleRemoveById 920170
';
        root   /usr/share/nginx/html;
    }

In this case you may get the audit log output, as well as a double report in nginx's error.log. (Note that there are some nuances, for example, this presupposes that the original detection was, in fact, not against 920170.)

A second option could be to disable rules more broadly:

    error_page   403  /403.html;
    location = /403.html {
        modsecurity_rules '
        SecRuleRemoveById 1000-999999
';
        root   /usr/share/nginx/html;
    }

With this option you would only get the single report in error.log. You would also get some output in audit_log, but it looks like the 'messages' section (i.e. section 'H') is empty. This might be acceptable if one is ok with having audit_log output as a cue to cross-reference against the content in error.log

I realize neither of these options are ideal, but I thought I'd mention them as possible workarounds, if they happen to be useful for your needs.

As yet another alternative, you could consider attempting strategy similar to one recently posted by @theseion here: https://github.com/SpiderLabs/ModSecurity-nginx/issues/152

jeremyjpj0916 commented 4 years ago

@martinhsv Yeah had not given that too much thought but I could remove rule by id for example on the rule triggered by seeing GET with HTTP Body in the error log which would remove most of the noisy the false positives because error_log always sets method to GET, but then you also would be losing the valid blocked audit logs for folks who actually used GET but sent a HTTP Body payload too (some API clients make these mistakes heh). So its kinda a two way street with no easy workaround that actually prevents losing legitimate logs and stopping the noise. The fix I think still is simply prior to error_page redirect preserving the original request HTTP Method and request URI detected(hopefully mod security can save this in the tx context somehow during the access phase) and then we just in this connector do something like:

if(r->error_page) {
  r->method = ctx->modsec_transaction->request->method
  r->uri = ctx->modsec_transaction->request->uri
}

As mentioned in earlier post above.

github-actions[bot] commented 4 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

jeremyjpj0916 commented 4 years ago

Still relevant.

victorhora commented 4 years ago

The "nostale" tag has been set for this one too @jeremyjpj0916. We'll get to it when possible. Thank you.

hator commented 4 years ago

Hi, @martinhsv thanks for providing those workarounds as I am running into the same issue.

I think I found one more way that might help.

It is to use a named location for the error page instead of a URL, like so:

error_page 403 @error;
location @error {
   ...
}

Then NGINX won't change the request method to GET and full audit log will appear with correct messages in part H.

If there is no need to change URI and method during internal redirection it is possible to pass error processing into a named location: [...] (http://nginx.org/en/docs/http/ngx_http_core_module.html#error_page)

EDIT: I just noticed the reference to https://github.com/SpiderLabs/ModSecurity-nginx/issues/152#issuecomment-593259386 which seems to use the same solution (although in a little more complicated way)

jeremyjpj0916 commented 4 years ago

@hator So to talk to your specific idea could we use this one for example?

https://github.com/Kong/kong/blob/master/kong/templates/nginx_kong.lua

Where I just replace here: https://github.com/Kong/kong/blob/master/kong/templates/nginx_kong.lua#L82 and https://github.com/Kong/kong/blob/master/kong/templates/nginx_kong.lua#L83

with

error_page 400 404 408 411 412 413 414 417 494 @kong_error_handler;
error_page 500 502 503 504 @kong_error_handler;

And then replace:

https://github.com/Kong/kong/blob/master/kong/templates/nginx_kong.lua#L234 With:

location @kong_error_handler {

And that will keep NGINX from resetting the HTTP Methods and such that comes with error_page redirection woes?

If so that is awesome and I am glad we got a fix in mind here. Kong the app I use itself wrote some logic that helped workaround these issues but it still was hitting me with ModSecurity. If so I think no fix needed since its really an nginx behavior issue, but it would be good if we could document this somewhere outside of this issue, just not sure where in ModSecurity it should live.

hator commented 4 years ago

@jeremyjpj0916 Using a named location for 400 error page has an unfortunate effect that it will not work, when handling an invalid HTTP method. It will display a stock NGINX error page with 500 error. This is due no method being set by the request (because it is invalid) and by the error page -- the same reason why it solves this problem with modsecurity. You can try that yourself by doing curl -Xfoo http://<yourserver>/.

The configuration I use has two locations for error handling:

error_page 400 /error;
error_page 401 402 ... 500 501 ... @error2;

location /error {
   modsecurity off;
   # custom error page 
}
location @error2 {
  # custom error page
}

I haven't yet figured out why I have to disable modsecurity in when handling 400, but it would cause the same problem as using a named location, i.e. 500 and stock error page, even when no rules are blocking the request.

Sorry for the offtopic, this discussion should probably move elsewhere :)

jeremyjpj0916 commented 4 years ago

Okay, so 400 HTTPStatus is problematic still, but your solution should help for 403's and 504's my Kong instance treats as a connection/read timeout's going into the error_page redirect causing some audit logging false positives. Appreciate the insight.

EDIT - Had to re-evaluate the named locations for error_page directive. It also flops for 414 URI too long cases. Still need an real foolproof solution here(ModSec saving the original HTTP Method and URI prior to redirect in the NGX Module in an early phase). See here: https://github.com/Kong/kong/issues/5881#issuecomment-627006482

jeremyjpj0916 commented 4 years ago

Okay starting from the top again I recognize the issue is error_page resetting the Originating HTTP Method Header being sent by the client from whatever they had set (POST PUT DELETE PATCH OPTIONS) etc. to GET.

So the solution to our problem is how do we store and persist the original client HTTP Method Header that was sent?

I was chatting with @airween on this problem and he had a smart observation that in the rewrite phase: http://www.nginxguts.com/phases/ , which comes prior to proxying and error_page redirection that we could grab the original method, as its set here:

https://github.com/SpiderLabs/ModSecurity-nginx/blob/master/src/ngx_http_modsecurity_rewrite.c#L145

Now I am willing to bet that when internal redirection via error_page occurs and we hit this rewrite phase a second time is when its setting to GET here. So the question is how in the code we can add something like:

//SAFE to trust this value as its the originating client HTTP REQUEST HEADER when not error_page
if(!r->error_page){
   const char *n_method = ngx_str_to_char(r->method_name, r->pool);
  //HOW to add a transaction persistent context value here to store STORED_CTX_HTTP_REQUEST_METHOD_HEADER_VALUE???
}
else {  //IF ERROR_PAGE, then we need to grab the original stored CTX request http method header value.
  const char *n_method = ngx_str_to_char(STORED_CTX_HTTP_REQUEST_METHOD_HEADER_VALUE); 
}

cc @victorhora @martinhsv @zimmerle ?

Wish I was more familiar with building ngx modules in C as this seems a simple task yet I don't see how to set and persist something to access it later easily in sample codes I am browsing.

Added a stack overflow post too to see if open source community there runs into any ideas: https://stackoverflow.com/questions/61805876/nginx-module-code-error-page-directive-problems

jeremyjpj0916 commented 4 years ago

Wallarm CEO Ivan made the comments:

Thanks for sharing! We solved some similar problems while developing Wallarm Nginx module. The short answer is yes, the wrong Nginx phase was used here. The reason is simple - this is a ModSecurity port from Apache, where Nginx phases are irrelevant.

I'll ask our development team if we can help with the patch.

Sooo, I have some ideas on how to fix that. There are at least three ways you may try to use to pass some request data through Nginx call flow: – Ivan Wallarm 3 hours ago 

3. Store your data as a request's cleanup handler data. Just create a cleanup handler with ngx_pool_cleanup_add(). You may use the cleanup function callback as a unique key and the data as your data associated with the key. – Ivan Wallarm 3 hours ago

2. If Nginx resets modules records data (which it does in some cases like redirects), you may use a trick with Nginx variables. A module may register some variable with ngx_http_add_variable(), then set & get the value with ngx_http_get_variable() – Ivan Wallarm 3 hours ago

1. Store the data in a mod_security data record attached to the request: ngx_http_modsecurity_ctx_t – Ivan Wallarm 3 hours ago
jeremyjpj0916 commented 4 years ago

I think I had an epiphany on this today in my ponderings since is the last step in getting ModSecurity Nginx "productionalized" for us.

Rather than write the above code options. ^ I think we can look at this past PR change around relevant stuff,

https://github.com/SpiderLabs/ModSecurity-nginx/commit/ce1d438bc665c28853eb5b429aad515d42027919

This was good to get log phase still actively doing its thing even during an error_page redirect, I think the mistake with this PR was deciding to allow the request "phases" to re-process data, I think the state prior to this PR merge was good logic, the log phase was all that really needed edited. Testing this theory now, I imagine modsec tx context maintains the request data seen befroe the error_page redirect.

jeremyjpj0916 commented 4 years ago

Added my own PR as the solution to my stack overflow too: https://stackoverflow.com/questions/61805876/nginx-module-code-error-page-directive-problems/61950152#61950152

https://github.com/SpiderLabs/ModSecurity-nginx/pull/204

github-actions[bot] commented 4 years ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days

jeremyjpj0916 commented 4 years ago

Stahp closing thingsssssss

martinhsv commented 4 years ago

Given the 'nostale' tag, this item should not have been closed by the bot.

jeremyjpj0916 commented 4 years ago

indeed, yalls bot is misbehaving.

defanator commented 3 years ago

@jeremyjpj0916 thanks for digging into this! I was trying to compose a minimal test for nginx connector to demonstrate the issue without having entire CRS ruleset included and enabled, and I'm still having troubles to do that.

I was also checking another issue (https://github.com/SpiderLabs/ModSecurity-nginx/pull/90) and noticed that in current connector code there may be scenarios when audit log entries do not appear (while they definitely should) if error_page redirect was made. That PR includes relevant tests - I'm just trying to get something similar to ensure that #204 will not interfere with #90.

Could you please assist in making a minimal reproducible configuration that demonstrates an issue reported here and addressed in #204? TIA!

defanator commented 3 years ago

@jeremyjpj0916 could you please check that https://github.com/SpiderLabs/ModSecurity-nginx/commit/4f26b48998db5119ca818b0909b6b14e08ebb544 resolves at least some of the issues you've been observing? TIA.

jeremyjpj0916 commented 3 years ago

If I get some time I will take a look but no promises as my current gig keeps me busy, been over a year since I was messing with this stuff and have long had workarounds I probably won't be touching again, as this modsecurity tooling in nginx ended up not being performant enough to realistically serve large volumes of traffic so we isolated it to a really small use case we don't spend much time on nowadays.