SecGeoLookupDb /etc/nginx/geoip/GeoLite2-City.mmdb crashes ingress-controller if it cannot be read

lantzemil commented 2 months ago

Describe the bug

If the SecGeoLookupDb /etc/nginx/geoip/GeoLite2-City.mmdb is present in the configuration and if for some reason the .mmdb file is not present, broken or cannot be read, this will stop the ingress-controller from starting, putting it into a reboot loop.

To Reproduce

Using nginx, ingress-controller and modsecurity WAF in Kubernetes.

Setup can be done using minikube and this guide: https://thelinuxnotes.com/index.php/how-to-install-and-configure-modsecurity-waf-in-kubernetes/

Add SecGeoLookupDb /etc/nginx/geoip/GeoLite2-City.mmdb and use-geoip2: "true" in your ConfigMap and apply it. Restart.

Logs and dumps

Run kubectl logs -f --namespace ingress-nginx --selector=app.kubernetes.io/component=controller

The following error can be seen error

kubectl get pods -A

Will show that the pod is not ready and that the ingress-controller is down

Expected behavior

There should be some form of built-in failsafe in this. If the .mmdb file is not found, log an error and disable this functionality, but don't bring down the entire ingress-controller.

airween commented 2 months ago

Hi @lantzemil,

thanks for reporting.

I tried to reproduce this issue in a native environment (not in Ingress controller), but I was not able to do that.

If the database file isn't there, then I get an error message and Nginx won't start:

Mar 07 19:59:36 vm-debiansid nginx[42452]: 2024/03/07 19:59:36 [emerg] 42452#42452: "modsecurity_rules_file" directive Rules error. File: /etc/nginx/modsecurity.conf. Line: 273. Column: 19. Failed to load locate the GeoDB file>
Mar 07 19:59:36 vm-debiansid nginx[42452]: nginx: configuration file /etc/nginx/nginx.conf test failed

How do you control your Nginx process? I mean do you start it through systemd service file? (Sorry for the question, but I don't know Ingress.) May be that forces the reload again and again?

lantzemil commented 2 months ago

Hi @airween

Thanks for your reply and your time. I don't think Nginx function should rely on this thing. Nginx is pretty much needs to function no matter what. Is it possible for you to

a) fix this by implementing some fail-safe, like a try-except pattern or having a hardcoded file somehow

b) propose a temporary fix for this in the meantime? I have had to disable this on all my environments as it's brining down Nginx. I have tried to add a touch mmdb-file hook upon startup but that's pretty flaky and not a nice fix in the long run.

Thanks in advance.

airween commented 2 months ago

I don't think Nginx function should rely on this thing. Nginx is pretty much needs to function no matter what.

I think this is the key.

Lets check two fully independent cases - I mean no systemd service file, init script, etc...

# /usr/sbin/nginx -t
2024/03/08 13:48:39 [emerg] 42939#42939: "modsecurity_rules_file" directive Rules error. File: modsecurity.conf. Line: 273. Column: 19. Failed to load locate the GeoDB file from: /etc/nginx/geoip/GeoLite2-City.mmdb Looking at: '/etc/nginx/geoip/GeoLite2-City.mmdb', '/etc/nginx/geoip/GeoLite2-City.mmdb', 'modsecurity.conf//etc/nginx/geoip/GeoLite2-City.mmdb', 'modsecurity.conf//etc/nginx/geoip/GeoLite2-City.mmdb'.  in /etc/nginx/sites-enabled/default:51
nginx: configuration file /etc/nginx/nginx.conf test failed

# /usr/sbin/nginx -c /etc/nginx/nginx.conf 
2024/03/08 13:50:51 [emerg] 42940#42940: "modsecurity_rules_file" directive Rules error. File: modsecurity.conf. Line: 273. Column: 19. Failed to load locate the GeoDB file from: /etc/nginx/geoip/GeoLite2-City.mmdb Looking at: '/etc/nginx/geoip/GeoLite2-City.mmdb', '/etc/nginx/geoip/GeoLite2-City.mmdb', 'modsecurity.conf//etc/nginx/geoip/GeoLite2-City.mmdb', 'modsecurity.conf//etc/nginx/geoip/GeoLite2-City.mmdb'.  in /etc/nginx/sites-enabled/default:51

I think we can say that the library meets these expectations.

a) fix this by implementing some fail-safe

Sorry for the question, but why does this need to be fixed by some fail-safe solution? Or do you think some other directives need that too? I mean if a SecRule is wrong, then the library must be step over it with a warning message? Or if you set some non-existent path for any logs (debug.log, audit.log), just report it but start the process - and you won't have log files?

lantzemil commented 2 months ago

@airween I see your point. But think about it this way, if a SecRule is wrong, that's because I have written something incorrect, right? I can test that locally first and see that my configuration is incorrect. Easy fix.

The mmdb-file is downloaded upon restart. Restarting the ingress-controller happens in, for instance, QA - what if the daily rate limit has been maxed? Or the service is not reachable? Or the file downloaded is corrupted because of an error? That cannot easily be reproduced locally and creates a somewhat flaky behavior in my opinion. I mean, I can hardcode a file somehow, somewhere, but I don't find this behavior safe.

But then again, if I'm misunderstanding something here, please let me know.

airween commented 2 months ago

The mmdb-file is downloaded upon restart. Restarting the ingress-controller happens in

Wait, above you wrote:

I don't think Nginx function should rely on this thing. Nginx is pretty much needs to function no matter what.

But now you write - if I understand correctly - that it depends on the ingress-controller, right? (I'm not familiar in Ingress nor Kubernetes).

Based on your description, I'm almost sure that you have to solve the listed scenarios (rate limit maxed, service unreachable, corrupt file, ...), eg. first you try to copy the file, and if it success, you can restart the service.

airween commented 2 months ago

One more thought: I assume you use GeoIP database because you have some rules which use GEO operator, right?

Why and how do you expect that GEO to work without GeoIP database? (I haven't analyzed yet - maybe would it work, but would give false information.)

lantzemil commented 2 months ago

But now you write - if I understand correctly - that it depends on the ingress-controller, right? (I'm not familiar in Ingress nor Kubernetes).

Sorry, let me clarify. I mean the Nginx Ingress Controller. https://docs.nginx.com/nginx-ingress-controller/overview/about/

This handles all the traffic to the Kubernetes clusters. Hence if that fails, none of the services are reachable.

Why and how do you expect that GEO to work without GeoIP database? (I haven't analyzed yet - maybe would it work, but would give false information.)

The file itself is essentially a deny-list right? Upon an API request it makes a check in that file and if it's finds a match => it's denied. If it's empty, the worst we could expect is that we do not deny it based on the IP/location etc. Other rules may still catch it. You get the gist of it.

From are reliability perspective, letting a potentially malicious API request through, with errors provided in the logs that the file is not working, is infinite times better than the whole Kubernetes cluster is unavailable.

My only suggestion to you, is that you add some type of try/except functionality.

For instance: try to load file, except I/O Error -> log this issue and load an empty file instead

It seems like you're not really buying my argument here, even though I have raised a valid concern. I think the error handling is poorly implemented. I cannot see how this behavior is safe from a reliability perspective. Otherwise we will just have to implement a workaround ourselves.

airween commented 2 months ago

This handles all the traffic to the Kubernetes clusters. Hence if that fails, none of the services are reachable.

I'm not sure it's relevant from the point of this topic.

The file itself is essentially a deny-list right?

No, it's definitely not.

The file holds geolocation data for IP addresses. An IP address could be a key, and you get several properties of that IP.

Upon an API request it makes a check in that file and if it's finds a match => it's denied.

No, definitely not. You can make rules which use those information (IP -> geo data).

If it's empty, the worst we could expect is that we do not deny it based on the IP/location etc. Other rules may still catch it. You get the gist of it.

Or if it's empty, then the WAF denies something what we do not want to deny.

Btw especially on some servers I use that to decide which PL use the transaction. "Friendly" countries -> lower PL. I wouldn't be happy if I should investigate the false positives...

But never mind, the point is not this.

From are reliability perspective, letting a potentially malicious API request through, with errors provided in the logs that the file is not working, is infinite times better than the whole Kubernetes cluster is unavailable.

I see, then you should solve this problem inside of Kubernetes. Be sure that the GeoIP database is successfully downloaded, if that's not true, don't restart the server.

My only suggestion to you, is that you add some type of try/except functionality.

For instance: try to load file, except I/O Error -> log this issue and load an empty file instead

I'm sure that can be solved outside of libmodsecurity3.

It seems like you're not really buying my argument here, even though I have raised a valid concern.

Please think about this statement again. May be your argument is valid - for yourself. Are you sure every libmodsecurity3 user would be happy if the engine were ignore a configuration directive?

Also please consider this "solution" in case of mod_security2. We try to maintain the compatibility between libmodsecurity3 and mod_security2. So now we should change this behavior in mod_security2 too, because you want to solve that issue in your Ingress system?

I think the error handling is poorly implemented.

I think it's not. It's plenty enough. If the file isn't there or the library can't open, then the engine does not start, and you receive a correct error message.

I cannot see how this behavior is safe from a reliability perspective.

I cannot see why do you want to start an engine if the prerequisites are not correct.

Otherwise we will just have to implement a workaround ourselves.

I think that's a good idea. Please let me know if I can help you anything.

lantzemil commented 2 months ago

I think that's a good idea. Please let me know if I can help you anything.

I think we're fine.

owasp-modsecurity / ModSecurity

SecGeoLookupDb /etc/nginx/geoip/GeoLite2-City.mmdb crashes ingress-controller if it cannot be read #3106