google / brotli

Brotli compression format
MIT License

Nginx, Brotli, and high CPU usage #1187

Open bradp-wordkeeper opened 1 month ago

bradp-wordkeeper commented 1 month ago

Hi All,

We started seeing some odd CPU issues on one of our servers about 2 months ago, and I'm digging for answers. The net effect is that Nginx now uses about 6x the CPU to serve the same number of requests on the same site as it did before.

Using perf top, I was able to determine that something in Brotli is causing this issue. Also, if I disable Brotli in the Nginx config and let it fall back to Gzip, CPU usage goes back to normal.

The issue started while using Nginx 1.25.5 and we're on 1.27.0 now. We've also gone from Brotli 1.0.9 to 1.1.0. Upgrading or recompiling Nginx, Brotli, or the ngx_brotli module made no difference either.

One more data point: I ran a test in a Fedora 40 container, and that seemed to solve the problem. I compiled Nginx in the Fedora container using the exact same settings and build process, then stopped the normal instance of Nginx that we use and started the containerized version in its place. CPU usage went right back to normal during that test, with Brotli enabled, but I couldn't leave it like that for various reasons. The only difference, I guess, is the underlying OS and the libraries available in the container.

So that all makes me think that it could be either a bug in the OS libraries somewhere or a bug in Brotli that only happens under certain conditions. This server OS is CentOS Stream 9.

Any ideas what could be causing this high CPU usage when using Brotli specifically?

Here is Nginx build info if that helps:

nginx version: nginx/1.27.0
built by gcc 11.4.1 20231218 (Red Hat 11.4.1-3) (GCC)
built with OpenSSL 3.2.2 4 Jun 2024
TLS SNI support enabled
configure arguments: --with-cc-opt='-g -O2 -march=x86-64-v3 -flto=auto -ffat-lto-objects -flto=auto -ffat-lto-objects -fstack-protector-strong -ftree-vectorize -Wformat -Werror=format-security -fPIC -Wdate-time -D_FORTIFY_SOURCE=2 -DTCP_FASTOPEN=23' --with-ld-opt='-Wl,-Bsymbolic-functions -flto=auto -ffat-lto-objects -flto=auto -Wl,-z,relro -Wl,-z,now -fPIC' --with-openssl-opt='enable-tls1_3 enable-ec_nistp_64_gcc_128 no-nextprotoneg no-weak-ssl-ciphers no-ssl3 enable-ktls' --prefix=/usr/share/nginx --conf-path=/opt/nginx/nginx.conf --http-log-path=/var/log/nginx/access.log --error-log-path=/var/log/nginx/error.log --lock-path=/var/lock/nginx.lock --pid-path=/run/nginx.pid --http-client-body-temp-path=/var/lib/nginx/body --http-fastcgi-temp-path=/var/lib/nginx/fastcgi --http-proxy-temp-path=/var/lib/nginx/proxy --user=nobody --group=nobody --with-pcre-jit --with-http_ssl_module --with-http_stub_status_module --with-openssl=../openssl-3.2.2 --with-zlib=../zlib-ng --with-pcre=../pcre2-10.44 --with-http_realip_module --with-http_auth_request_module --with-http_gzip_static_module --with-http_v2_module --with-http_v3_module --with-http_sub_module --with-libatomic=../libatomic --with-file-aio --with-http_xslt_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-threads --without-http_autoindex_module --without-http_ssi_module --without-http_scgi_module --without-http_uwsgi_module --add-module=../ngx_brotli --add-module=../ngx_http_geoip2_module-3.4 --add-module=../njs-0.8.5/nginx
bradp-wordkeeper commented 1 month ago

One more detail that I meant to mention: perf top shows that the specific function causing the overhead is CreateBackwardReferencesNH5.

I can add a ton more detail if needed too. :)

lancedockins commented 3 weeks ago

For context, this high CPU use occurred with Brotli compression levels as low as 4, and the Nginx and Brotli types configs use the same list of supported MIME types. There is no reason that Zlib at level 6 should take substantially less CPU than Brotli at level 4.
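
One rough way to sanity-check that expectation outside of Nginx is to time both compressors on the same payload. The sketch below is only an illustration, not something from this server: it assumes the official Brotli Python bindings are installed (`pip install brotli`), and `make_sample`, `cpu_seconds`, and `compare` are hypothetical helper names. The sample payload is synthetic and far more repetitive than real pages, so a representative HTML or CSS response would give more meaningful numbers. It also does one-shot compression, so it does not reproduce how ngx_brotli streams data or its window settings.

```python
import os
import time
import zlib

try:
    # Assumption: the official Brotli Python bindings (pip install brotli).
    import brotli
except ImportError:
    brotli = None


def make_sample(size=1 << 20):
    """Build a synthetic, HTML-like payload of exactly `size` bytes."""
    chunk = b'<div class="post"><p>Lorem ipsum dolor sit amet</p></div>\n'
    noise = os.urandom(16).hex().encode()
    block = chunk + noise + b"\n"
    return (block * (size // len(block) + 1))[:size]


def cpu_seconds(fn, data, repeats=5):
    """Return (best CPU time in seconds, compressed size) over `repeats` runs."""
    best = float("inf")
    out = b""
    for _ in range(repeats):
        start = time.process_time()
        out = fn(data)
        best = min(best, time.process_time() - start)
    return best, len(out)


def compare(data):
    """Time zlib at level 6 and, if available, Brotli at quality 4."""
    results = {"zlib-6": cpu_seconds(lambda d: zlib.compress(d, 6), data)}
    if brotli is not None:
        results["brotli-4"] = cpu_seconds(
            lambda d: brotli.compress(d, quality=4), data
        )
    return results


if __name__ == "__main__":
    sample = make_sample()
    for name, (secs, size) in compare(sample).items():
        print(f"{name}: {secs * 1000:.1f} ms CPU, {size} bytes")
```

Running the same script on the CentOS Stream 9 host and inside the Fedora 40 container might show whether the gap follows the OS. Note that the Python package typically ships its own copy of the Brotli library, so a clean result here would not rule out a problem in the build that Nginx links against.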