Closed Samjin closed 1 month ago
The router would only originate a trace Id when it doesn't get one from the headers. Do you have tracing propagation setup or any exporters enabled? Can you share more of your configuration?
Put another way, the thing you're asking for should definitely already work. The router is meant to play nice with existing distributed tracing headers, including passing them to subgraphs as fit.
@BrynCooke Anything to add here?
The router would only originate a trace Id when it doesn't get one from the headers. Do you have tracing propagation setup or any exporters enabled? Can you share more of your configuration?
I tried it and left comment here. https://github.com/apollographql/router/pull/4823#issuecomment-2085656172
datadog:
enabled: true
enable_span_mapping: true
batch_processor:
...
endpoint: "xx"
propagation:
datadog: true
request:
header_name: trace-id
I think it trimmed off the . Our downstream received value like -
from UUIDtraceid=28e9fb547e465690f288cdb96194d119
which failed their UUID validation.
Update: i'll take another look tmr to observe the change.
I still get a different trace_id than trace-id header used above. The log is from subgraph_service
.
And, subgraph would receive this error(invalid UUID) from trace-id header. So somewhere overwrite trace-id header.
Also the following trace ids are mutual exclusive in router span attrs.
router:
attributes:
trace_id: true
trace-id:
request_header: trace-id
In debug mode, I get following error
"message": "cannot generate custom trace_id: invalid digit found in string",
"target": "apollo_router::plugins::telemetry",
looks like trace_id is recreated because it couldn't parse UUID header.
What is your subgraph server implemented in and what telemetry mechanism (SDK, format, etc.) are you using in it?
@Samjin I think that this PR will likely fix things for you: https://github.com/apollographql/router/pull/5071 We are targeting this for the next release.
I'm using latest dev, both custom and apollo-router events are logged with trace_id
, which is now using original UUID but without dashes, but the id is only added in deprecate
spans mode. trace_id
doesn't appear inspec_compliant
logs.
That said, even if it is successfully added in spec_compliant
mode, it's still a different trace id than the UUID that up/downstream services use. So I'm not completely sure how removing -
from UIUD would solve this issue.
I think to fix this temporarily, the easy way maybe just allow user to have a req header selector to create custom attribute for all logs, then they can just disable the default trace_id. Doing this way may require user to add their own trace id to their custom events.
What is your subgraph server implemented in and what telemetry mechanism (SDK, format, etc.) are you using in it?
They use opentracing in SDK for datadog. It looks like trace context use trace-id https://github.com/opentracing/specification/blob/master/rfc/trace_identifiers.md#trace-context-http-headers
But still Im not completely sure they are related, it'd be great if anyone know how they work together.
I think we can close this issue. We already allow parsing a trace-id
from any given header but to remain OpenTelemetry compliant, that header must a 16-btye random generated string and not something that includes dashes. We are even now striping dashes from any header value if we see them to propagate the correct value.
If you have clients passing in trace ids with dashes those are not valid traces id and you should fix those at the source
Now that being said, it would be helpful if you could include some company specific ids in you logs as standard attributes and we have documented that separately here: https://github.com/apollographql/router/issues/5212
To give a summary, these are mentioned in the thread above.
trace_id
and span_id
to user's custom event in spec_compliant
mode like deprecated
mode does. That said, #1 is still a concern for us as we can't easily update infra requirements. If https://github.com/apollographql/router/issues/5212 can provide alternative solution, then yes. I'm fine closing this.
@Samjin If the JSON log formatter had the option to toggle specific attributes, would that work? https://github.com/apollographql/router/issues/5212#issuecomment-2148120150
I believe this was fixed in #5802. Please reopen if this is not the case.
Is your feature request related to a problem? Please describe.
There is an existing infrastructure that utilizes a dedicated request header for trace ID. In Router, it'd make sense to use that as the value of OTEL
trace_id
for all logging, metrics, and span tags. This could reduce multiple trace ids labeled everywhere and also helps to correlate request logs from apollo-router, otherwise user may have to lookup 2 different trace ids to find a request logs.i.e.
As you can see above, the trace_id is not the one our infrastructure use, but generated by apollo router.
i.e. we use
trace_id
for Splunk and Datadog in current Node gateway, and would like to continue to use same names in Router. Ifdisplay_trace_id: true
is enabled from logging, there maybe be two trace_ids that could potentially overwrite each other if located at same level(I have not check if this is the case today but maybe possible?).Describe the solution you'd like
Have an option to override default
trace_id
value from a request header's UUID like value. This should update default trace_id value everywhere, including span attributes, logging, metrics etc.trace id header will be created in router_service if it doesn't exist, this created header should be able to propagate as trace_id as well