DataDog / dd-trace-php

Datadog PHP Clients
https://docs.datadoghq.com/tracing/setup/php

[Bug]: Too many open files #2914

Open philvv opened 3 weeks ago

philvv commented 3 weeks ago

Bug report

We have been using DD for years with no issues. Recently, while upgrading our application to PHP 8.3, we reinstalled Datadog.

Since then, our servers have been crashing with 'too many open files'. I assume the version running before was older than 1.4, but I have no record because the servers were rebuilt.

I rebooted all servers overnight because our application was failing. They have now been running for 8 hours. Here is the output of two commands on one server running ddtrace:

cat /proc/sys/fs/file-nr
582840 0 2097152
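For readers unfamiliar with this file: its three fields are the number of allocated file handles, the number of allocated but unused handles, and the system-wide maximum. A minimal Python sketch for reading it follows; the names `FileNr`, `parse_file_nr` and `usage_fraction` are illustrative only and are not part of any Datadog tooling.

```python
from typing import NamedTuple


class FileNr(NamedTuple):
    allocated: int  # file handles currently allocated system-wide
    unused: int     # allocated but unused handles (typically 0)
    maximum: int    # system-wide limit on file handles


def parse_file_nr(text: str) -> FileNr:
    """Parse the three whitespace-separated fields of /proc/sys/fs/file-nr."""
    allocated, unused, maximum = (int(field) for field in text.split())
    return FileNr(allocated, unused, maximum)


def usage_fraction(nr: FileNr) -> float:
    """Fraction of the system-wide handle limit that is currently in use."""
    return (nr.allocated - nr.unused) / nr.maximum
```

Applied to the line above, `usage_fraction(parse_file_nr("582840 0 2097152"))` comes out at roughly 0.28, so this snapshot, taken 8 hours after a reboot, sits at about 28% of the system-wide limit.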

sudo lsof | cut -d " " -f 1 | uniq -c | sort -nr | head -n 10
406259 php8.3
225341 dd-ipc-he
175165 php
13272 php-fpm8.
3648 agent
2268 php8.3
1899 php-fpm8.
1725 nginx
1422 php-fpm8.
1408 trace-age
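A note on reading this output: `uniq -c` only merges adjacent identical lines, which is why names such as php8.3 and php-fpm8. appear more than once, and lsof truncates the command column to 9 characters by default. The rows also include entries that are not file descriptors (for example cwd, txt and mem), so the totals overstate the descriptor count. A small Python sketch that sums all rows per command name, assuming standard lsof output on stdin (the function name `count_by_command` is hypothetical):

```python
from collections import Counter
from typing import Iterable


def count_by_command(lsof_lines: Iterable[str]) -> Counter:
    """Count lsof output rows per value of the first (COMMAND) column."""
    counts: Counter = Counter()
    for line in lsof_lines:
        fields = line.split()
        # Skip blank lines and the header row printed by lsof.
        if not fields or fields[0] == "COMMAND":
            continue
        counts[fields[0]] += 1
    return counts
```

Passing `sys.stdin` to `count_by_command` and printing `.most_common(10)` would give one total per command name instead of several partial counts.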

I have disabled ddtrace on one server, and the number of open files there is holding under 10,000.

cat /proc/sys/fs/file-nr
4680 0 2097152

sudo lsof | cut -d " " -f 1 | uniq -c | sort -nr | head -n 10
18482 php-fpm8.
3474 nginx
2500 agent
1803 php-fpm8.
1212 php8.3
753 php-fpm8.
600 php-fpm8.
594 snapd
552 process-a
525 trace-age

I can only assume this is being caused by ddtrace?

PHP version

8.3.10

Tracer or profiler version

1.4.2

Installed extensions

[PHP Modules] bcmath bz2 calendar Core ctype curl date ddappsec ddtrace dom exif FFI fileinfo filter ftp gd gettext gmp hash iconv igbinary intl json libxml mbstring mysqli mysqlnd openssl pcntl pcre PDO pdo_mysql Phar posix random readline redis Reflection session shmop SimpleXML sockets sodium SPL standard sysvmsg sysvsem sysvshm tokenizer xml xmlreader xmlwriter xsl Zend OPcache zip zlib

[Zend Modules] Zend OPcache ddappsec ddtrace

Output of phpinfo()

No response

Upgrading from

No response

bwoebi commented 3 weeks ago

Thanks for the report @philvv,

I haven't observed that with ddtrace so far. Are you able to provide information on what files / sockets are being held open? Is the memory usage of dd-ipc-helper also growing linearly with the number of files?

I'm seeing 406259 php8.3 here - I assume these are PHP CLI processes. Is that a single process? Or many CLI processes that never exit even though they could?
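One way to answer the question of what is being held open is to read the symlinks under /proc/<pid>/fd on Linux, whose targets look like `socket:[12345]`, `pipe:[678]`, `anon_inode:[eventpoll]` or an absolute file path. The following is a minimal sketch under that assumption; the helper names are hypothetical, and reading another user's process usually requires root.

```python
import os
from collections import Counter


def classify_fd_target(target: str) -> str:
    """Map a /proc/<pid>/fd symlink target to a coarse category."""
    if target.startswith("socket:"):
        return "socket"
    if target.startswith("pipe:"):
        return "pipe"
    if target.startswith("anon_inode:"):
        return "anon_inode"
    if target.startswith("/"):
        return "file"
    return "other"


def summarize_fds(pid: int) -> Counter:
    """Count the open descriptors of one process by category."""
    fd_dir = f"/proc/{pid}/fd"
    counts: Counter = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # descriptor was closed between listdir and readlink
        counts[classify_fd_target(target)] += 1
    return counts
```

Calling `summarize_fds(pid)` with the PID of the dd-ipc-helper process or of one of the php8.3 processes would show whether the growth is mostly sockets, pipes or regular files.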

philvv commented 3 weeks ago

Hi,

> Are you able to provide information on what files / sockets are being held open?

You will need to tell me exactly what you need me to show, as I am no server expert.

> Is the memory usage of dd-ipc-helper also growing linearly with the number of files?

Memory use grows linearly over time until the server starts to fail. I then reboot, it drops and then grows linearly again. See pic:

Image

Memory use used to be stable at about 6 GB but is now exceeding 20 GB (application usage is lower than in August, so it's not user related). 25th August was when we recreated the servers running PHP 8.3 and installed whatever version of dd was the latest at that time. See pic:

Image

> I assume these are PHP CLI processes

It's a Laravel application running Horizon, so the CLI processes would be Horizon workers and cron jobs. We run about 5 cron jobs per minute. The queue is not heavily used, so there are typically 2 processes in a waiting state per server.

bwoebi commented 3 weeks ago

@philvv Hm, is it possible for you to join https://chat.datadoghq.com/ and contact me there (you'll find me easily), for better communication? We've already tried several times to reproduce an issue like this, but haven't managed to do so ourselves.

The ipc-helper process is definitely not supposed to use gigabytes of memory. We're in fact restarting the sidecar automatically if it uses too much memory, but I suppose memory used by file descriptions might not be included in the self-reporting of helper memory.
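To check whether the helper's memory and its descriptor count grow together, both can be sampled from /proc on Linux. This is only a sketch with hypothetical helper names; it reads the VmRSS line of /proc/<pid>/status and counts the entries in /proc/<pid>/fd, and it makes no claim about how the sidecar measures its own memory.

```python
import os
import re
from typing import Optional, Tuple


def parse_vm_rss_kib(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in kiB) from /proc/<pid>/status contents."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def sample_process(pid: int) -> Tuple[Optional[int], int]:
    """Return (resident memory in kiB, number of open descriptors) for a PID."""
    with open(f"/proc/{pid}/status") as fh:
        rss_kib = parse_vm_rss_kib(fh.read())
    fd_count = len(os.listdir(f"/proc/{pid}/fd"))
    return rss_kib, fd_count
```

Logging `sample_process(pid)` for the dd-ipc-helper PID every minute or so would give two series that can be compared against the server-level memory graph.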

philvv commented 3 weeks ago

Hi @bwoebi,

It may be worth mentioning that our application has a consistent throughput of 1,400 requests per second, spread across 6 servers, so roughly 230 rps per server.

Yes, I will try to join your channel tomorrow.