nginxinc / docker-nginx

Official NGINX Dockerfiles
BSD 2-Clause "Simplified" License
3.26k stars 1.73k forks source link

Don't redirect error_log to stderr #940

Open ViliusS opened 6 days ago

ViliusS commented 6 days ago

Describe the bug

According to https://nginx.org/en/docs/ngx_core_module.html#error_log default error_log level for Nginx should be error, but official docker image is built with level notice. This produces all sorts of issues in logging/monitoring systems. Most often, just graceful ordinary restart produces errors in logs (and then alerts).

Image

To reproduce

Steps to reproduce the behavior:

  1. Run the NGINX Docker image.
  2. Check /var/log/nginx/error.log for startup notices (which are not actually errors).

Expected behavior

error_log should be set to error or maybe warn level, definitely not notice.

Your environment

oxpa commented 3 days ago

@ViliusS you are free to change the error log level to any value you prefer.

Though as you can spot from your screenshot nginx logs signals and processes states at a notice log level. Over the years this information has proven itself to be invaluable when it comes to solving issues with nginx. And so we decided to lower logs severity from error to notice in nginx packages. It is indeed differ from the default in sources though. But I would recommend going with the notice log level whenever possible.

ViliusS commented 3 days ago

OK, thinking about this further, it is a bug in a way error log is redirected to stderr then. If error_log in nginx is not really for errors but for all sorts of notifications and signals one cannot just forward that to stderr. It's due to this redirection logging/monitoring systems then gives false positive alerts. Either error_log needs to be redirected to stdout (and then parsing of severity levels is left for logging system), OR error_log needs to log only errors. One way or another, current configuration is incorrect.

thresheek commented 3 days ago

I'm not entirely following the logic here - why do you think it should not be forwarded to a separate fd at all?

ViliusS commented 3 days ago

I don't think it should not be forwarded at all. I'm saying it should not be forwarded specifically to stderr. stderr is meant to contain errors, and assuming that, container orchestrators on cloud environments (mainly GKE, AKS) forward strerr to monitoring subsystem with severity = error (you can see this in my screenshot). In result, every restart of nginx container under GKE produces 15 error alerts.

thresheek commented 3 days ago

I mean, it's best to have more information on what's going on than less information. This is why we have it set to something that is verbose but still not spammy.

We definitely won't be changing forwarding error logs to something else other than stderr by default, since it's the expected place for log collections since the inception of this image and systems already depend on this behaviour.

I think in your case it's just a matter of setting up proper filtering for what you'd consider an alert-worthy event.

ViliusS commented 3 days ago

I don't understand your arguments. You say that it is set to "something not spammy", but I'm telling you how the image behaves by default under most containerized conditions and it is the opposite.

The workaround solution is not really easy. One would need to write nginx error_log to JSON converter with severity included for every log message. Only then those messages would be logged at correct severity level on SaaS environments. Filtering alerts by simple string (e.g. "gracefully shuting down" or "exiting") would be insane.

Another way would be to completely rebuild docker image. Which can be done, but I filled this issue wanting to fix the issue at the source, not make another workaround.

I also don't buy the argument that something which was designed wrong in the first place cannot be changed now. At least remove symlink redirect at docker image level and leave the control to nginx configuration. error_log already has special parameter to log directly to stderr. This would keep backward compatibility and would allow nginx configuration override without rebuilding the image.