DataDog / dd-trace-java

Datadog APM client for Java
https://docs.datadoghq.com/tracing/languages/java
Apache License 2.0

Losing Jetty 10 servlet traces and metrics in 1.19.0+ #5734

Open Hexcles opened 1 year ago

Hexcles commented 1 year ago

We recently encountered a hard-to-reproduce issue with traces and trace metrics after upgrading our agent to 1.19+. Everything starts out fine, but after about an hour we stop getting traces from our gRPC endpoints (e.g. servlet.request /domain/method), along with the corresponding trace metrics.

As far as we can tell, this doesn't happen with 1.18.2 or 1.18.3, but happens with both 1.19.0 and 1.19.1.

smola commented 1 year ago

@Hexcles :wave:

If you contact support, we can take a look and speed up troubleshooting.

Hexcles commented 1 year ago

I'll try to reproduce again and file a support ticket. What information would you need btw? Flare?

Hexcles commented 1 year ago

Here's the content of -Ddatadog.slf4j.simpleLogger.logFile:

[dd.trace 2023-09-07 21:39:26:461 +0000] [dd-task-scheduler] INFO datadog.trace.agent.core.StatusLogger - DATADOG TRACER CONFIGURATION {"version":"1.20.1~70cd67ce90","os_name":"Linux","os_version":"5.10.184-175.731.amzn2.x86_64","architecture":"amd64","lang":"jvm","lang_version":"17.0.8.1","jvm_vendor":"Eclipse Adoptium","jvm_version":"17.0.8.1+1","java_class_version":"61.0","http_nonProxyHosts":"null","http_proxyHost":"null","enabled":true,"service":"<redacted>","agent_url":"<redacted>","agent_unix_domain_socket":"/var/run/datadog/apm.socket","agent_error":false,"debug":false,"trace_propagation_style_extract":["datadog"],"trace_propagation_style_inject":["datadog"],"analytics_enabled":false,"sampling_rules":[{},{}],"priority_sampling_enabled":true,"logs_correlation_enabled":true,"profiling_enabled":true,"remote_config_enabled":true,"debugger_enabled":false,"appsec_enabled":"ENABLED_INACTIVE","telemetry_enabled":true,"dd_version":"240f05dfa15915f7b8b2882eb0443fbe60872b26","health_checks_enabled":true,"configuration_file":"no config file present","runtime_id":"<redacted>","logging_settings":{"levelInBrackets":false,"dateTimeFormat":"'[dd.trace 'yyyy-MM-dd HH:mm:ss:SSS Z']'","logFile":"/var/log/datadog.log","configurationFile":"simplelogger.properties","showShortLogName":false,"showDateTime":true,"showLogName":true,"showThreadName":true,"defaultLogLevel":"INFO","warnLevelString":"WARN","embedException":false},"cws_enabled":false,"cws_tls_refresh":5000,"datadog_profiler_enabled":true,"datadog_profiler_safe":true,"datadog_profiler_enabled_overridden":false}

Hexcles commented 1 year ago

In other words, it is still present in 1.21.1. Filing a support ticket, too (1334466).

smola commented 1 year ago

@Hexcles Thank you. We'll look into this. We might ask for further information through the support ticket if we need it.

smola commented 1 year ago

Just for the record, this is an issue in the Jetty 10 instrumentation introduced in dd-trace-java v1.19.0. The workaround is to set -Ddd.integration.jetty.enabled=false (system property) or DD_INTEGRATION_JETTY_ENABLED=false (environment variable). This falls back to the generic servlet instrumentation and should give the same behavior for Jetty 10 as dd-trace-java releases prior to v1.19.0.
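
For anyone applying the workaround, here is a minimal sketch of both options. The agent and application jar paths are placeholders (assumptions, not taken from this thread); only the property and environment variable names come from the comment above. Either option should be enough on its own.

```shell
# Option 1: system property on the JVM command line.
# /path/to/dd-java-agent.jar and /path/to/app.jar are placeholder paths.
java -javaagent:/path/to/dd-java-agent.jar \
     -Ddd.integration.jetty.enabled=false \
     -jar /path/to/app.jar

# Option 2: environment variable, set before the JVM starts.
export DD_INTEGRATION_JETTY_ENABLED=false
java -javaagent:/path/to/dd-java-agent.jar -jar /path/to/app.jar
```

In containerized deployments the environment variable is usually the easier of the two, since it does not require changing the launch command.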

damar-block commented 11 months ago

Hey, is there any update on this issue? It is impacting us as well. (I verified that the DD_INTEGRATION_JETTY_ENABLED workaround works; is there any downside to using it?)