DataDog / dd-trace-java

Datadog APM client for Java
https://docs.datadoghq.com/tracing/languages/java
Apache License 2.0

Unrelated XRay and java-aws-sdk traces #5487

Open miguelaferreira opened 1 year ago

miguelaferreira commented 1 year ago

I'm instrumenting a Micronaut application that runs on Lambda. I'm using the Datadog Lambda extension version 44. I'm downloading the agent from https://dtdg.co/latest-java-tracer and configuring it as a javaagent.

I'm setting the following env vars:

I'm setting the following java properties:

I'm facing 2 issues:

Other observations:

Trace screenshots

### XRay trace

image

### dd-tracer trace

image

### java-aws-sdk trace (from micronaut bootstrap context)

image

amarziali commented 1 year ago

👋 Hi and thanks for having provided a very detailed description of the issue.

This is expected behaviour, since the AWS Java SDK instrumentation uses java-aws-sdk as the service name for most of its outgoing operations. However, this can be changed: a single parameter lets you set the service name of all these spans to DD_SERVICE.

To make it happen, you can try out either

This should cover that use case. Please let us know whether it does, and close the issue if it is solved.

Cheers

Andrea

miguelaferreira commented 1 year ago

Hi @amarziali 👋

Thanks for getting back to me.

What I would expect is not that all the spans have the same service name, just that they get stitched together in the same trace. We see that happening in Ruby, for example, where an application makes a call to AWS (via the SDK) and that call becomes a span (with an aws service name), but that span still gets stitched into the trace.

I'm providing an example from a Ruby on Rails application we have where the trace starts with a rack.request span, does a call to an external service (also RoR app) using faraday client, the external service request is captured in another rack.request span, and then at some point there is a call to DynamoDB (for which the service name is aws), and the span for that call is integrated in the trace.

image

What I'm trying to understand is how we make sure that XRay, Java application, and AWS SDK spans all come together in the same trace. In the case of XRay, as I mentioned before, sometimes the spans are merged with the application spans and sometimes not. When they are not, the application itself does not manage to create spans for the Lambda runtime initialisation, so we are actually missing out on important information.

amarziali commented 1 year ago

Hi,

So please let me step back to the original example you provided. When the Java tracer instruments AWS SDK operations, it propagates the trace context using XRay tracing headers. It injects this information into outgoing operations and, where present, extracts it from inbound operations. Taking SQS as an example, it injects the tracing key when producing messages and extracts that key when receiving messages.

Since this tracing header is propagated and handled whenever a Datadog tracer is used, those spans can belong to the same trace.
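
To make the inject/extract idea concrete, here is a minimal sketch of how an XRay-style tracing header value can be built on the producing side and read back on the consuming side. It only illustrates the documented `Root=...;Parent=...;Sampled=...` header format (the `X-Amzn-Trace-Id` HTTP header, which SQS exposes as the `AWSTraceHeader` message system attribute). The class and method names are hypothetical, and this is not the tracer's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: hypothetical names, not the dd-trace-java implementation.
class XRayTraceHeaderSketch {

    // Header name used on HTTP requests.
    static final String HTTP_HEADER = "X-Amzn-Trace-Id";

    // Producer side: build the header value attached to an outgoing operation.
    static String inject(String rootTraceId, String parentSpanId, boolean sampled) {
        return "Root=" + rootTraceId
                + ";Parent=" + parentSpanId
                + ";Sampled=" + (sampled ? "1" : "0");
    }

    // Consumer side: split the header value into its key/value fields.
    static Map<String, String> extract(String headerValue) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String part : headerValue.split(";")) {
            int eq = part.indexOf('=');
            if (eq > 0) {
                fields.put(part.substring(0, eq).trim(), part.substring(eq + 1).trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        // Example identifiers in the XRay format; the values are placeholders.
        String value = inject("1-5759e988-bd862e3fe1be46a994272793", "53995c3f42cd8ad8", true);
        Map<String, String> context = extract(value);
        System.out.println(HTTP_HEADER + ": " + value);
        System.out.println("Consumer continues trace " + context.get("Root")
                + " under parent " + context.get("Parent"));
    }
}
```

A consumer that finds no such header has no context to continue, so any span it creates starts a new trace.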

There is another point to highlight that is specific to Java. As already noted, the Java tracer uses a dedicated service name:

This explains why the micronaut bootstrap is creating a span with this name. Also this span is created without a tracing context hence floating alone as shown in the image.

To expand on this, please also consider this use case: a Java application receives an SQS message and then stores data in an S3 bucket. This results in:

  1. an sqs.consume span with service name DD_SERVICE
  2. an aws.http span with service name java-aws-sdk for the S3 operation
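
As a rough model of that outcome, the sketch below groups spans by trace ID. The `Span` class, the IDs, and `my-service` (standing in for the DD_SERVICE value) are all made up for illustration. The point is only that spans sharing a trace ID end up in one trace regardless of service name, while a span created without a tracing context gets its own trace.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: a hypothetical span model, not Datadog's data model.
class TraceGroupingSketch {

    static final class Span {
        final String traceId;
        final String operation;
        final String service;

        Span(String traceId, String operation, String service) {
            this.traceId = traceId;
            this.operation = operation;
            this.service = service;
        }
    }

    // Group spans into traces by trace ID; the service name plays no role here.
    static Map<String, List<Span>> groupByTrace(List<Span> spans) {
        Map<String, List<Span>> traces = new LinkedHashMap<>();
        for (Span span : spans) {
            traces.computeIfAbsent(span.traceId, k -> new ArrayList<>()).add(span);
        }
        return traces;
    }

    static List<Span> example() {
        List<Span> spans = new ArrayList<>();
        // The SQS consume and the S3 call share a propagated context.
        spans.add(new Span("trace-1", "sqs.consume", "my-service"));
        spans.add(new Span("trace-1", "aws.http", "java-aws-sdk"));
        // A bootstrap-time SDK call with no tracing context starts its own trace.
        spans.add(new Span("trace-2", "aws.http", "java-aws-sdk"));
        return spans;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, List<Span>> trace : groupByTrace(example()).entrySet()) {
            System.out.println(trace.getKey() + ":");
            for (Span span : trace.getValue()) {
                System.out.println("  " + span.operation + " (service=" + span.service + ")");
            }
        }
    }
}
```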

I then assumed you wanted to change this behaviour, which is why I proposed some options for service name mapping. But you are probably fine with it as it is. Otherwise, you can consider options like DD_SERVICE_MAPPING (https://docs.datadoghq.com/tracing/trace_collection/library_config/java/).
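
For illustration, here is a small sketch of what a service mapping does conceptually: a comma-separated list of `from:to` pairs renames matching service names and leaves the others untouched. The parsing code and the example value `java-aws-sdk:my-service` are assumptions made for this sketch, not the tracer's implementation; the linked documentation describes the exact syntax the option accepts.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: hypothetical helper, not the dd-trace-java implementation.
class ServiceMappingSketch {

    // Parse a value such as "java-aws-sdk:my-service, postgres:main-postgres-db".
    static Map<String, String> parseMapping(String raw) {
        Map<String, String> mapping = new LinkedHashMap<>();
        if (raw == null) {
            return mapping;
        }
        for (String pair : raw.split(",")) {
            int colon = pair.indexOf(':');
            if (colon > 0) {
                mapping.put(pair.substring(0, colon).trim(), pair.substring(colon + 1).trim());
            }
        }
        return mapping;
    }

    // Rename the service if a mapping exists, otherwise keep the original name.
    static String apply(Map<String, String> mapping, String service) {
        return mapping.getOrDefault(service, service);
    }

    public static void main(String[] args) {
        // Fall back to an example value when the variable is not set.
        String raw = System.getenv("DD_SERVICE_MAPPING");
        if (raw == null) {
            raw = "java-aws-sdk:my-service";
        }
        Map<String, String> mapping = parseMapping(raw);
        System.out.println("java-aws-sdk -> " + apply(mapping, "java-aws-sdk"));
        System.out.println("other-service -> " + apply(mapping, "other-service"));
    }
}
```

Note that a mapping like this only changes the service name shown on a span; it does not change which trace the span belongs to.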

miguelaferreira commented 1 year ago

Let me try to explain what I don't follow.

> This explains why the micronaut bootstrap is creating a span with this name. Also this span is created without a tracing context hence floating alone as shown in the image.

Why would this be floating alone and not be part of the XRay trace that shows the function initialisation? Doesn't the micronaut bootstrap context execute during that initialisation?