Closed by rmikalkenas 8 months ago
Disabling Symfony's integration with DD_TRACE_SYMFONY_ENABLED=0 kind of helped with the memory leak.
But the weird error about the hook limit being reached is still visible in the DD logs.
Hi again @rmikalkenas !
I am currently trying to reproduce the memory leak, without success so far. I would like some more information.
First, from what I understand, you may be running a long-running Symfony process that happens to call PlaceholderAction numerous times, installing multiple hooks until it eventually hits the limit.
Also, does setting DD_INSTRUMENTATION_TELEMETRY_ENABLED=0 help in any way?
Thanks a lot 😃
Hi @PROFeNoM, I spent some time analyzing the issue.
Before going into the findings, here are more details (that might help) on how I build the app Dockerfile and how it runs on my k8 cluster. Pods are separated by entrypoint, so there are 2 types of pods: web and worker. Web pods handle all incoming HTTP requests (roadrunner is used as the webserver), and worker pods only consume SQS messages (handled by symfony's messenger component). Both types of pods use exactly the same Dockerfile configuration; only the entrypoints differ (roadrunner vs the symfony messenger command). Since roadrunner is a long-running process (just like a consumer process), the DD configuration is the same. The codebase is also exactly the same; one type of pod handles HTTP requests and the other handles SQS messages.
After updating the DD extension from 0.90.0 to 0.91.2, I first noticed that the memory consumption of all worker pods started rising (stairs pattern) by approximately 1 MB/minute. Only later did I see the same pattern for web pods.
I knew there was some custom DD instrumentation regarding symfony's messenger:
After spending some time analyzing the issue, here are the results with different configurations:
DD_TRACE_SYMFONY_ENABLED=1 and my custom DD instrumentation listeners enabled:
Could not add hook to ApiPlatform\Action\PlaceholderAction::__invoke with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog/dd-library/0.91.2/dd-trace-sources/bridge/_generated_integrations.php on line 3860; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
DD_TRACE_SYMFONY_ENABLED=1 and my custom DD instrumentation listeners disabled:
Could not add hook to ApiPlatform\Action\PlaceholderAction::__invoke with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog/dd-library/0.91.2/dd-trace-sources/bridge/_generated_integrations.php on line 3860; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
DD_TRACE_SYMFONY_ENABLED=0 and my custom DD instrumentation listeners disabled:
Summary:
Let me know if you need more details - I will try to help
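For reference, the tracer-side toggles used in the runs above can be set as environment variables. This is a minimal sketch, assuming they are exported in the shell that starts the worker or web process (the thread does not say where they are set in this deployment). The custom listeners are application code and are not covered here.

```shell
# Sketch of the tracer settings for one run; both variable names appear in the thread.
export DD_TRACE_SYMFONY_ENABLED=1   # 1 = Symfony integration enabled, 0 = disabled
export DD_TRACE_ONCE_LOGS=0         # per the log message: show every occurrence, not only the first
```

Setting DD_TRACE_ONCE_LOGS=0 should make it easier to see how often the hook limit message is emitted while comparing configurations.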
Hey @rmikalkenas !
Thanks for your extensive description, specifically for narrowing it down to Symfony. This will save me a lot of time.
Let's tackle the simple things before I try to reproduce your setup as closely as possible. It makes sense to try out this artifact (CI job) first. This artifact includes this change, which will remove the hook on the controller (e.g., ApiPlatform\Action\PlaceholderAction::__invoke) after it is executed.
Note that this artifact was built from 0.96.0. If you need it built from 0.91.2 instead, please tell me, and I'll create another artifact (although you would inevitably have to upgrade to the latest version at some point if the fix ends up being useful 😅)
To use the artifact, please follow the same installation procedure you're used to, but you can use the artifact's link instead.
I'd expect you don't hit the hook limit from this artifact anymore. I don't necessarily expect it to address the memory leak... but let's keep the Christmas magic alive :)
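The hook limit behaviour discussed here can be illustrated with a toy model. This is only a sketch, not the tracer's code. It assumes one hook is installed per handled request, takes the limit of 100 from the log message, and treats removal after execution as a simple counter decrement. The function name and the warning text are made up for illustration.

```shell
#!/bin/sh
# Toy model only: NOT the tracer's implementation.
HOOK_LIMIT=100   # mirrors datadog.trace.hook_limit = 100 from the log message

# simulate_requests <count> <remove_after_run: yes|no>
# Prints a warning the first time an installation is refused, then the number
# of hooks still installed once all requests have been handled.
simulate_requests() {
    requests=$1
    remove_after_run=$2
    installed=0
    warned=no
    i=0
    while [ "$i" -lt "$requests" ]; do
        i=$((i + 1))
        if [ "$installed" -ge "$HOOK_LIMIT" ]; then
            # In this model the install is refused once the limit is reached.
            if [ "$warned" = no ]; then
                echo "request $i: could not add hook, limit of $HOOK_LIMIT reached"
                warned=yes
            fi
            continue
        fi
        installed=$((installed + 1))        # hook installed for this request
        if [ "$remove_after_run" = yes ]; then
            installed=$((installed - 1))    # hook removed once the call has run
        fi
    done
    echo "installed hooks after $requests requests: $installed"
}

simulate_requests 250 no    # hooks pile up until the limit is hit
simulate_requests 250 yes   # hooks are removed, so the count stays at 0
```

In the first run the model refuses the 101st installation and ends with 100 hooks installed. In the second run nothing accumulates, which is the effect the artifact is expected to have on the hook limit message.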
@PROFeNoM update from my end:
Changed the DD extension URL to https://output.circle-artifacts.com/output/job/8559958b-6c9d-49a2-bc73-41bc24ea5188/artifacts/0/datadog-setup.php and set DD_TRACE_SYMFONY_ENABLED=1. All other config is the same as previously described.
After monitoring for some time, I can confirm that the hook limit log no longer appears and that memory consumption is stable (no leak) :tada:
@PROFeNoM thank you for redirecting me to this issue.
I have a very similar issue in many Symfony projects since our admins updated the datadog agent version on our servers.
These issues occur in Symfony commands, not controllers, so the messages displayed look like these, for example:
Sorry, but I'm a developer and I'm not really used to working on these types of problems.
If I understand the information from @rmikalkenas correctly, is the solution to directly modify the configuration file(s) of the datadog agent/profiler? Or something in https://github.com/DataDog/dd-trace-php ?
Or can we maybe update to a newer version where this problem no longer occurs?
Thanks for your time.
Technical info:
Symfony 2.8+
PHP 7+
dd-trace-php version: https://github.com/DataDog/dd-trace-php/releases/tag/0.92.2
datadog-agent: 7.49.0
Hi, @rmikalkenas! I'm happy that this artifact addresses your issue. I'll do a PR and include it in the next release. I'll ping you on this issue once it is done 😃
Hi, @guillaumepeano!
_generated_integrations.php on line 4014
This is indeed another similar - yet different - issue, and this time, this hook seems to be installed over and over again. I'm a bit surprised as a safety check is made beforehand 🤔 but anyway.
This time, I've made another artifact (CI Job), not based on the other fix, which includes these changes (More general hook installation).
solution is to modify directly configuration file(s) on datadog agent/profiler configuration files ? On this https://github.com/DataDog/dd-trace-php ?
Hum, depends on what you call configuration files 😅 Basically, when installing the tracer, you are most certainly downloading the extension using the following line:
curl -LO https://github.com/DataDog/dd-trace-php/releases/latest/download/datadog-setup.php
and then running the installer.
To use an artifact, all you have to do is use the artifact link. If you install using datadog-setup.php, then you can simply use the artifact link I gave you just above; otherwise, if you are using other installation methods such as the apk or deb, just pick the one that matches your architecture in the CI Job I linked above.
Using the artifact link for the datadog-setup.php which includes the proposed fix, the curl command would instead be:
curl -LO https://output.circle-artifacts.com/output/job/000581e7-73c1-4427-aeb1-e80eee3ac0be/artifacts/0/datadog-setup.php
and then run the new installer.
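The two install paths described above differ only in the download URL, so they can be wrapped in one small helper. This is a sketch: the function name install_tracer is made up, and the installer invocation is not shown in this thread, so the --php-bin=all option is an assumption based on Datadog's public setup instructions.

```shell
#!/bin/sh
# install_tracer <setup-url>
# Downloads datadog-setup.php from the given URL into the current directory,
# then runs it. (Hypothetical helper, not part of dd-trace-php.)
install_tracer() {
    setup_url=$1
    curl -LO "$setup_url" || return 1
    # Assumed invocation: instruments every PHP binary the installer finds.
    # Adjust the option if only one PHP binary should be instrumented.
    php datadog-setup.php --php-bin=all
}

# Latest release (the URL quoted above):
#   install_tracer https://github.com/DataDog/dd-trace-php/releases/latest/download/datadog-setup.php
# Artifact build with the proposed fix:
#   install_tracer https://output.circle-artifacts.com/output/job/000581e7-73c1-4427-aeb1-e80eee3ac0be/artifacts/0/datadog-setup.php
```

Keeping the URL as the only argument makes it easy to switch back to the released installer once the fix ships.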
Or can we update to a newer version where this problem is not occuring anymore maybe ?
Since I cannot replicate the issue, if the suggested solution resolves your problem, I will create a PR, and it will eventually be included in the release, ensuring that this problem does not recur.
Hi @PROFeNoM, I'm working with @guillaumepeano: we have installed the version that corresponds to our infrastructure. Since then, we no longer receive "datadog.trace.hook_limit" messages. It seems to be ok, thank you.
Hi @VincentRebs!
Understood :) If you ever stumble over this issue again (or another), please do not hesitate to reach back out to us 😃
@PROFeNoM maybe you have an approximate date when this fix is expected to be merged and released?
Hi @rmikalkenas !
Considering we have quite a few bug fixes, we are targeting a release next week. That's, unfortunately, as granular as I can commit to.
Hi @rmikalkenas, @guillaumepeano, and @VincentRebs :wave:
The 0.97.0 release was just made, which includes the mentioned PR :smiley:
Hi @PROFeNoM,
We had validated the version of dd-trace-php provided through the artifact generated here. Indeed, this artifact corrected the “datadog.trace.hook_limit” alerts.
However, the 0.97.0 release incorporating the changes no longer corrects these alerts.
And when we look at the comparison (Compare) between the branch corresponding to the artifact and the 0.97.0 release branch, the modifications differ…
Any ideas?
Hi @NikitaCOEUR
Do you have a sample log of these alerts? I'd like to see whether they originate from controllers or commands.
For context, there were two issues in this thread:
The alex/issue/gh2427 branch, which led to #2436. This one was merged.
The alex/issue/gh2427-bis branch. This branch didn't lead to a PR since this wasn't the issue, and the original code was already taking care of not installing hooks twice:
we have installed the version that corresponds to our infrastructure.
Since, we no longer receive messages "datadog.trace.hook_limit".
It seems to be ok, thank you.
The artifacts you are using correspond to the latter. Do you confirm the associated logs are related to commands?
@PROFeNoM I work with VincentRebs, and I’m talking about the second issue. Here is an example of the logs that are generated with the release 0.97.0 and which no longer appeared with the artifact generated by the branch alex/issue/gh2427-bis.
Could not add hook to Hevea\OctopusDataValidationBundle\Command\DriverightImagesSyncCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\OctopusDataValidationBundle\Command\Erp\ErpCatalogSyncCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\OctopusDataValidationBundle\Command\Erp\ErpCatalogSetKeysCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\OctopusDataValidationBundle\Command\Erp\ErpCatalogUpdateCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\OctopusDataValidationBundle\Command\OrphanDuplicatesCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\OctopusDataValidationBundle\Command\OctopusSas\BrandLoaderCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\OctopusDataValidationBundle\Command\OctopusSas\ColorLoaderCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\OctopusDataValidationBundle\Command\OctopusSas\ModelLoaderCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\OctopusDataValidationBundle\Command\OctopusSas\AutoValidBrandsCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\OctopusDataValidationBundle\Command\OctopusSas\AutoValidModelsCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\OctopusDataValidationBundle\Command\OctopusSas\AutoValidProductsCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\Esb\ClientBundle\Api\Command\SyncEntitiesCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\Esb\ClientBundle\Api\Command\RepublishFailedEventsCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\OctopusProductLinkBundle\Command\ScrewImportFileCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Could not add hook to Hevea\OctopusProductLinkBundle\Command\ScrewMakeConnectionCommand::run with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 4014; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
...
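Logs like the ones above can be condensed to the distinct hook targets, which helps answer whether they come from controllers or commands. This is a sketch: the function names are made up, and the mapping from method name to category (::run for commands, ::__invoke for controllers) is a heuristic taken from the examples in this thread, not a tracer rule.

```shell
#!/bin/sh
# hook_limit_targets: read tracer log lines on stdin and print each distinct
# "Class::method" that hit the hook limit, one per line, sorted.
hook_limit_targets() {
    sed -n 's/.*Could not add hook to \(.*\) with more than datadog\.trace\.hook_limit.*/\1/p' | sort -u
}

# classify_targets: prefix each target with a category guessed from its method
# name (heuristic based on the log samples in this thread).
classify_targets() {
    while IFS= read -r target; do
        case $target in
            *::run)      printf '%s\n' "command $target" ;;
            *::__invoke) printf '%s\n' "controller $target" ;;
            *)           printf '%s\n' "other $target" ;;
        esac
    done
}

# Usage (the log path is a placeholder):
#   hook_limit_targets < /path/to/php-error.log | classify_targets
```

printf is used instead of echo so that the backslashes in PHP class names are printed unchanged in every shell.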
Ok, I just realized I misunderstood the initial message, and you installed the version of the artifact that corresponds to your infrastructure 😬 I'll open the PR
@PROFeNoM it's better with release 0.98.0! Thank you!
Splendid! Thanks for your feedback and patience @NikitaCOEUR 🙇
Hi @PROFeNoM
I have a similar issue with Symfony and Elasticsearch, I've updated DD to 0.98.1 but it didn't help.
[ddtrace] [error] Could not add hook to Elastic\Elasticsearch\Endpoints\Indices::__construct with more than datadog.trace.hook_limit = 100 installed hooks in /opt/datadog-php/dd-trace-sources/bridge/_generated_integrations.php on line 5437; This message is only displayed once. Specify DD_TRACE_ONCE_LOGS=0 to show all messages.
Hi @ErFUN-KH !
Thanks for the report, I see why this is happening; I'm working on a fix right now :+1:
Hi @PROFeNoM, I'm wondering when this will be released.
Hey @ErFUN-KH, this was released with 1.0.0. Unless it's still happening?
Hi @bwoebi, I can't see it in the release notes. If you're sure, I'll try.
@ErFUN-KH It was already part of 1.0.0beta1: https://github.com/DataDog/dd-trace-php/releases/tag/1.0.0beta1.
Bug report
k8 pods with long running processes start getting OOM after DD extension upgrade from 0.90.0 to 0.91.2. DD dashboards indicate leaking memory. Log received:
PHP version
8.2.13
Tracer or profiler version
0.91.2
Installed extensions
Output of phpinfo()
o.txt
Upgrading from
0.90.0 -> 0.91.2