DataDog / dd-trace-php

Datadog PHP Clients
https://docs.datadoghq.com/tracing/setup/php
Other
497 stars 155 forks source link

[BUG] Infinite loop in ddtrace_serialize_span_to_array #1812

Closed Hunrik closed 1 year ago

Hunrik commented 1 year ago

Bug description

Hi We have a CLI sqs consumer process, which after a couple of minutes, gets stuck and starts running at 100% CPU. NOTE: It is also impacting web requests, but it occurs less frequently

We created the following profile with bpf-trace

Screenshot 2022-11-22 at 15 55 16

From this, it seems stuck in a loop in dd-trace-php/ext/serializer.c ddtrace_serialize_span_to_array, which I'm suspecting that got added in this change

After reverting to 0.80.0 from 0.81.1 the issue is gone.

PHP version

PHP 8.1.12 (cli) (built: Nov 15 2022 04:49:28) (NTS)
Copyright (c) The PHP Group
Zend Engine v4.1.12, Copyright (c) Zend Technologies
    with Zend OPcache v8.1.12, Copyright (c), by Zend Technologies
    with ddtrace v0.81.1, Copyright Datadog, by Datadog
    with datadog-profiling v0.10.2, Copyright Datadog, by Datadog
    with ddappsec v0.4.4, Copyright Datadog, by Datadog

Tracer version

0.81.1

Installed extensions

OS info

php:8.1.12-fpm -> Debian GNU/Linux 11 (bullseye)

Diagnostics and configuration

Output of phpinfo() (ddtrace >= 0.47.0)

``` Datadog PHP tracer extension For help, check out the documentation at https://docs.datadoghq.com/tracing/languages/php/ (c) Datadog 2020 Datadog tracing support => enabled Version => 0.81.1 DATADOG TRACER CONFIGURATION => { "date": "2022-11-22T14:04:04Z", "os_name": "Linux 5.4.209-116.367.amzn2.x86_64 #1 SMP Wed Aug 31 00:09:52 UTC 2022 x86_64", "os_version": "5.4.209-116.367.amzn2.x86_64", "version": "0.81.1", "lang": "php", "lang_version": "8.1.8", "env": "production", "enabled": true, "service": "foo", "enabled_cli": true, "agent_url": "http:\/\/172.20.83.234:8126", "debug": false, "analytics_enabled": false, "sample_rate": 1, "sampling_rules": [], "tags": [], "service_mapping": [], "distributed_tracing_enabled": true, "priority_sampling_enabled": true, "dd_version": "production-5.19.15", "architecture": "x86_64", "sapi": "cli", "datadog.trace.request_init_hook": "\/opt\/datadog\/dd-library\/0.81.1\/dd-trace-sources\/bridge\/dd_wrap_autoloader.php", "open_basedir_configured": false, "uri_fragment_regex": null, "uri_mapping_incoming": null, "uri_mapping_outgoing": null, "auto_flush_enabled": false, "generate_root_span": true, "http_client_split_by_domain": false, "measure_compile_time": true, "report_hostname_on_root_span": false, "traced_internal_functions": null, "auto_prepend_file_configured": false, "integrations_disabled": "default", "enabled_from_env": true, "opcache.file_cache": null } Diagnostics Diagnostic checks => passed Directive => Local Value => Master Value ddtrace.disable => Off => Off ddtrace.cgroup_file => /proc/self/cgroup => /proc/self/cgroup datadog.trace.request_init_hook => /opt/datadog/dd-library/0.81.1/dd-trace-sources/bridge/dd_wrap_autoloader.php => /opt/datadog/dd-library/0.81.1/dd-trace-sources/bridge/dd_wrap_autoloader.php ddtrace.request_init_hook => /opt/datadog/dd-library/0.81.1/dd-trace-sources/bridge/dd_wrap_autoloader.php => /opt/datadog/dd-library/0.81.1/dd-trace-sources/bridge/dd_wrap_autoloader.php datadog.trace.agent_url => no value => no value datadog.agent_host => 172.20.83.234 => 172.20.83.234 datadog.dogstatsd_url => no value => no value datadog.distributed_tracing => On => On datadog.dogstatsd_port => 8125 => 8125 datadog.env => production => production datadog.autofinish_spans => Off => Off datadog.trace.url_as_resource_names_enabled => On => On datadog.integrations_disabled => default => default datadog.priority_sampling => On => On datadog.service => foo => foo datadog.service_name => foo => foo datadog.service_mapping => no value => no value datadog.tags => no value => no value datadog.trace.global_tags => no value => no value datadog.trace.agent_port => 8126 => 8126 datadog.trace.analytics_enabled => Off => Off datadog.trace.auto_flush_enabled => Off => Off datadog.trace.cli_enabled => On => On datadog.trace.measure_compile_time => On => On datadog.trace.debug => Off => Off datadog.trace.enabled => On => On datadog.trace.health_metrics_enabled => Off => Off datadog.trace.health_metrics_heartbeat_sample_rate => 0.001 => 0.001 datadog.trace.db_client_split_by_instance => Off => Off datadog.trace.http_client_split_by_domain => Off => Off datadog.trace.redis_client_split_by_host => On => On datadog.trace.memory_limit => no value => no value datadog.trace.report_hostname => Off => Off datadog.trace.resource_uri_fragment_regex => no value => no value datadog.trace.resource_uri_mapping_incoming => no value => no value datadog.trace.resource_uri_mapping_outgoing => no value => no value datadog.trace.resource_uri_query_param_allowed => no value => no value datadog.trace.http_url_query_param_allowed => * => * datadog.trace.rate_limit => 0 => 0 datadog.trace.sample_rate => 1 => 1 datadog.sampling_rate => 1 => 1 datadog.trace.sampling_rules => [] => [] datadog.span_sampling_rules => [] => [] datadog.span_sampling_rules_file => no value => no value datadog.trace.header_tags => no value => no value datadog.trace.x_datadog_tags_max_length => 512 => 512 datadog.trace.propagate_service => Off => Off datadog.propagation_style_extract => Datadog,B3,B3 single header => Datadog,B3,B3 single header datadog.propagation_style_inject => Datadog => Datadog datadog.trace.traced_internal_functions => no value => no value datadog.trace.agent_timeout => 500 => 500 datadog.trace.agent_connect_timeout => 100 => 100 datadog.trace.debug_prng_seed => -1 => -1 datadog.log_backtrace => Off => Off datadog.trace.generate_root_span => On => On datadog.trace.spans_limit => 5000 => 5000 datadog.trace.agent_max_consecutive_failures => 3 => 3 datadog.trace.agent_attempt_retry_time_msec => 5000 => 5000 datadog.trace.bgs_connect_timeout => 2000 => 2000 datadog.trace.bgs_timeout => 5000 => 5000 datadog.trace.agent_flush_interval => 5000 => 5000 datadog.trace.agent_flush_after_n_requests => 10 => 10 datadog.trace.shutdown_timeout => 5000 => 5000 datadog.trace.startup_logs => On => On datadog.trace.agent_debug_verbose_curl => Off => Off datadog.trace.debug_curl_output => Off => Off datadog.trace.beta_high_memory_pressure_percent => 80 => 80 datadog.trace.warn_legacy_dd_trace => On => On datadog.trace.retain_thread_capabilities => Off => Off datadog.version => production-5.19.15 => production-5.19.15 datadog.trace.obfuscation_query_string_regexp => (?i)(?:p(?:ass)?w(?:or)?d|pass(?:_?phrase)?|secret|(?:api_?|private_?|public_?|access_?|secret_?)key(?:_?id)?|token|consumer_?(?:id|key|secret)|sign(?:ed|ature)?|auth(?:entication|orization)?)(?:(?:\s|%20)*(?:=|%3D)[^&]+|(?:"|%22)(?:\s|%20)*(?::|%3A)(?:\s|%20)*(?:"|%22)(?:%2[^2]|%[^2]|[^"%])+(?:"|%22))|bearer(?:\s|%20)+[a-z0-9\._\-]|token(?::|%3A)[a-z0-9]{13}|gh[opsu]_[0-9a-zA-Z]{36}|ey[I-L](?:[\w=-]|%3D)+\.ey[I-L](?:[\w=-]|%3D)+(?:\.(?:[\w.+\/=-]|%3D|%2F|%2B)+)?|[\-]{5}BEGIN(?:[a-z\s]|%20)+PRIVATE(?:\s|%20)KEY[\-]{5}[^\-]+[\-]{5}END(?:[a-z\s]|%20)+PRIVATE(?:\s|%20)KEY|ssh-rsa(?:\s|%20)*(?:[a-z0-9\/\.+]|%2F|%5C|%2B){100,} => (?i)(?:p(?:ass)?w(?:or)?d|pass(?:_?phrase)?|secret|(?:api_?|private_?|public_?|access_?|secret_?)key(?:_?id)?|token|consumer_?(?:id|key|secret)|sign(?:ed|ature)?|auth(?:entication|orization)?)(?:(?:\s|%20)*(?:=|%3D)[^&]+|(?:"|%22)(?:\s|%20)*(?::|%3A)(?:\s|%20)*(?:"|%22)(?:%2[^2]|%[^2]|[^"%])+(?:"|%22))|bearer(?:\s|%20)+[a-z0-9\._\-]|token(?::|%3A)[a-z0-9]{13}|gh[opsu]_[0-9a-zA-Z]{36}|ey[I-L](?:[\w=-]|%3D)+\.ey[I-L](?:[\w=-]|%3D)+(?:\.(?:[\w.+\/=-]|%3D|%2F|%2B)+)?|[\-]{5}BEGIN(?:[a-z\s]|%20)+PRIVATE(?:\s|%20)KEY[\-]{5}[^\-]+[\-]{5}END(?:[a-z\s]|%20)+PRIVATE(?:\s|%20)KEY|ssh-rsa(?:\s|%20)*(?:[a-z0-9\/\.+]|%2F|%5C|%2B){100,} datadog.trace.forked_process => On => On datadog.trace.agent_max_payload_size => 52428800 => 52428800 datadog.trace.agent_stack_initial_size => 131072 => 131072 datadog.trace.agent_stack_backlog => 12 => 12 datadog.trace.cakephp_enabled => On => On datadog.trace.cakephp_analytics_enabled => Off => Off datadog.cakephp_analytics_enabled => Off => Off datadog.trace.cakephp_analytics_sample_rate => 1 => 1 datadog.cakephp_analytics_sample_rate => 1 => 1 datadog.trace.codeigniter_enabled => Off => Off datadog.trace.codeigniter_analytics_enabled => Off => Off datadog.codeigniter_analytics_enabled => Off => Off datadog.trace.codeigniter_analytics_sample_rate => 1 => 1 datadog.codeigniter_analytics_sample_rate => 1 => 1 datadog.trace.curl_enabled => On => On datadog.trace.curl_analytics_enabled => Off => Off datadog.curl_analytics_enabled => Off => Off datadog.trace.curl_analytics_sample_rate => 1 => 1 datadog.curl_analytics_sample_rate => 1 => 1 datadog.trace.elasticsearch_enabled => On => On datadog.trace.elasticsearch_analytics_enabled => Off => Off datadog.elasticsearch_analytics_enabled => Off => Off datadog.trace.elasticsearch_analytics_sample_rate => 1 => 1 datadog.elasticsearch_analytics_sample_rate => 1 => 1 datadog.trace.eloquent_enabled => On => On datadog.trace.eloquent_analytics_enabled => Off => Off datadog.eloquent_analytics_enabled => Off => Off datadog.trace.eloquent_analytics_sample_rate => 1 => 1 datadog.eloquent_analytics_sample_rate => 1 => 1 datadog.trace.guzzle_enabled => On => On datadog.trace.guzzle_analytics_enabled => Off => Off datadog.guzzle_analytics_enabled => Off => Off datadog.trace.guzzle_analytics_sample_rate => 1 => 1 datadog.guzzle_analytics_sample_rate => 1 => 1 datadog.trace.laravel_enabled => On => On datadog.trace.laravel_analytics_enabled => Off => Off datadog.laravel_analytics_enabled => Off => Off datadog.trace.laravel_analytics_sample_rate => 1 => 1 datadog.laravel_analytics_sample_rate => 1 => 1 datadog.trace.lumen_enabled => Off => Off datadog.trace.lumen_analytics_enabled => Off => Off datadog.lumen_analytics_enabled => Off => Off datadog.trace.lumen_analytics_sample_rate => 1 => 1 datadog.lumen_analytics_sample_rate => 1 => 1 datadog.trace.memcached_enabled => Off => Off datadog.trace.memcached_analytics_enabled => Off => Off datadog.memcached_analytics_enabled => Off => Off datadog.trace.memcached_analytics_sample_rate => 1 => 1 datadog.memcached_analytics_sample_rate => 1 => 1 datadog.trace.mongo_enabled => Off => Off datadog.trace.mongo_analytics_enabled => Off => Off datadog.mongo_analytics_enabled => Off => Off datadog.trace.mongo_analytics_sample_rate => 1 => 1 datadog.mongo_analytics_sample_rate => 1 => 1 datadog.trace.mongodb_enabled => On => On datadog.trace.mongodb_analytics_enabled => Off => Off datadog.mongodb_analytics_enabled => Off => Off datadog.trace.mongodb_analytics_sample_rate => 1 => 1 datadog.mongodb_analytics_sample_rate => 1 => 1 datadog.trace.mysqli_enabled => Off => Off datadog.trace.mysqli_analytics_enabled => Off => Off datadog.mysqli_analytics_enabled => Off => Off datadog.trace.mysqli_analytics_sample_rate => 1 => 1 datadog.mysqli_analytics_sample_rate => 1 => 1 datadog.trace.nette_enabled => On => On datadog.trace.nette_analytics_enabled => Off => Off datadog.nette_analytics_enabled => Off => Off datadog.trace.nette_analytics_sample_rate => 1 => 1 datadog.nette_analytics_sample_rate => 1 => 1 datadog.trace.pcntl_enabled => On => On datadog.trace.pcntl_analytics_enabled => Off => Off datadog.pcntl_analytics_enabled => Off => Off datadog.trace.pcntl_analytics_sample_rate => 1 => 1 datadog.pcntl_analytics_sample_rate => 1 => 1 datadog.trace.pdo_enabled => On => On datadog.trace.pdo_analytics_enabled => Off => Off datadog.pdo_analytics_enabled => Off => Off datadog.trace.pdo_analytics_sample_rate => 1 => 1 datadog.pdo_analytics_sample_rate => 1 => 1 datadog.trace.phpredis_enabled => On => On datadog.trace.phpredis_analytics_enabled => Off => Off datadog.phpredis_analytics_enabled => Off => Off datadog.trace.phpredis_analytics_sample_rate => 1 => 1 datadog.phpredis_analytics_sample_rate => 1 => 1 datadog.trace.predis_enabled => Off => Off datadog.trace.predis_analytics_enabled => Off => Off datadog.predis_analytics_enabled => Off => Off datadog.trace.predis_analytics_sample_rate => 1 => 1 datadog.predis_analytics_sample_rate => 1 => 1 datadog.trace.slim_enabled => Off => Off datadog.trace.slim_analytics_enabled => Off => Off datadog.slim_analytics_enabled => Off => Off datadog.trace.slim_analytics_sample_rate => 1 => 1 datadog.slim_analytics_sample_rate => 1 => 1 datadog.trace.symfony_enabled => Off => Off datadog.trace.symfony_analytics_enabled => Off => Off datadog.symfony_analytics_enabled => Off => Off datadog.trace.symfony_analytics_sample_rate => 1 => 1 datadog.symfony_analytics_sample_rate => 1 => 1 datadog.trace.web_enabled => On => On datadog.trace.web_analytics_enabled => Off => Off datadog.web_analytics_enabled => Off => Off datadog.trace.web_analytics_sample_rate => 1 => 1 datadog.web_analytics_sample_rate => 1 => 1 datadog.trace.wordpress_enabled => Off => Off datadog.trace.wordpress_analytics_enabled => Off => Off datadog.wordpress_analytics_enabled => Off => Off datadog.trace.wordpress_analytics_sample_rate => 1 => 1 datadog.wordpress_analytics_sample_rate => 1 => 1 datadog.trace.yii_enabled => Off => Off datadog.trace.yii_analytics_enabled => Off => Off datadog.yii_analytics_enabled => Off => Off datadog.trace.yii_analytics_sample_rate => 1 => 1 datadog.yii_analytics_sample_rate => 1 => 1 datadog.trace.zendframework_enabled => Off => Off datadog.trace.zendframework_analytics_enabled => Off => Off datadog.zendframework_analytics_enabled => Off => Off datadog.trace.zendframework_analytics_sample_rate => 1 => 1 datadog.zendframework_analytics_sample_rate => 1 => 1 ```
bwoebi commented 1 year ago

Hey @Hunrik,

~Thanks for the report, are you able to confirm on what line exactly the infinite loop occurs? If you attach gdb on such a process (e.g. gdb -p <pid>) while it hangs in that loop, it should give you a line number.~

We found it must occur in serializer.c:849. We're checking what could cause a span to be its own parent.