Open davidfowl opened 1 year ago
@ReubenBond we're assuming you're doing this next, per backlog review
please also update dashboard to display it meaningfully
If we invent new activity tags we should follow the https://opentelemetry.io/docs/specs/otel/trace/semantic_conventions/.
cc @noahfalk
Perhaps server.address
is what it should be adding (from https://opentelemetry.io/docs/specs/otel/trace/semantic_conventions/span-general/#serveraddress).
Wouldn't that overwrite the real address?
It's described as representing the "logical server hostname" and doesn't seem to be populated at all right now.
@noahfalk is there a way to enrich the outgoing activity without creating a new one?
It's described as representing the "logical server hostname"
I think there are different ways you could interpret 'logical', but OTel's interpretation is request.RequestUri.Host
doesn't seem to be populated at all right now.
It is populated if folks use OpenTelemetry's instrumentation: https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry.Instrumentation.Http/Implementation/HttpHandlerDiagnosticListener.cs#L189
Eventually we probably want to absorb that into HttpClient so I'd be cautious about doing anything that conflicts.
@noahfalk is there a way to enrich the outgoing activity without creating a new one?
Sure.
Activity.Current.SetTag("MyCoolTag", "ReallyImportantData");
This assumes that your service discovery code is running while HttpClient's activity is the current one. I'm not sure where it actually runs?
If the Host
header is that, is that enough, or does it have to be the Uri.Host
value?
It is populated if folks use OpenTelemetry's instrumentation: https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry.Instrumentation.Http/Implementation/HttpHandlerDiagnosticListener.cs#L189
Should we be updating our ServiceDefaults project template to use this then?
If the Host header is that, is that enough, or does it have to be the Uri.Host value?
Oh with your latest change, will this just work?
With this change we get "net.host.name" in the traces.
Nope, I was wrong, this is the server span, not the client span. We want to see the outbound activity with this.
yea, so I finally re-read this and we need to update our service defaults. I'll get on that.
@davidfowl @ReubenBond can you update here, is the work left all on David?
Setting the host name seems to break everything (even on ACA) and we turned it off by default so AFAIK, nothing has changed here. We have no trace of what the original service name was once service discovery has run
@ReubenBond I updated this issue to only be about the outgoing activity, that is the high order bit. Since the host name plan was a bust, I suggest we attempt to do this finding a way to mutate the outgoing activity (if possible).
@noahfalk After a cursory glance, I can't see an obvious way to get in between here and add extra state to the outgoing activity. Worst case scenarios it looks like we would need to do some kinda of otel specific trace enrichment? Is that right?
I can't see an obvious way to get in between here ...
You mean the point in time where your code runs doesn't overlap the time period where the Activity you want to modify is active? Various options:
Its possible there are more scenario specific options but at this point I only have a very cursory understanding of how service discovery is hooked up. Above are the generic options that could be used for any two components.
An ActivityListener can hook the ActivityStarted callback and do arbitrary modification to an Activity. You would have to stash the state somewhere earlier so that you can retrieve it, likely in an AsyncLocal.
The static wire up of this bugs me. I wouldn't want this in library code. It's it ironic that we have to set state on an async local to propagate that state to another async local 😄.
Similar to the ActivityListener you could also use an OTel specific processor. Downside is it takes an explicit dependency on OTel library, upside is it runs with predictable ordering relative to whatever inspection/modification of the Activity OTel is going to do.
This might be OK for now. We can build a ServiceDiscovery.Otel package that could wire this up. It's a little unfortunate we can't do this out of the box yet but by .NET 9, we should hopefully have something more convenient.
Engineer a way for the component creating the data to flow it to the HttpClient handler component that is creating the Activity. This is similar in principle to the enrichment for metrics, except now it is enrichment for Activity.
The code here is pretty much the outermost HttpMessageHandler, AFAIK DiagnosticsHandler is the inner most handler. There's no real easy way to run another handler at that layer. If we could control the creation of it, we could write up it manually in the right order so that we could run while the activity was active. This is much like how the incoming ASP.NET Core activity works.
@noahfalk I think the cleanest solution would be to expose the DiagnosticHandler and allow it to be added to the pipeline earlier so that other delegating handlers can see the outgoing activity. We'd also need a local way to disable the inner SocketsHttpHandler creating one... I'll need to think about it some more.
I experimented with an otel processor but I also need to expose an async local that stores the state from the service discovery system... which is not ideal.
An alternative solution is a way to add a callback to a HttpRequestMessage
that DiaganosticHandler
runs. It would be responsible for enriching the activity. This approach is what was added for metrics in .NET 8.
@JamesNK did some magic here in the dashboard specifically to reverse engineer server names from the outgoing address:
@ReubenBond do you want to punt this until later or do you want to produce a ServiceDiscovery.Otel pacakge that adds data to the outgoing span using enrichment?
@noahfalk @tarekgh is this possible in .NET 9?
No. Request to support adding tags to HttpClient activities: https://github.com/dotnet/runtime/issues/96263
When looking at traces, it's hard to back track to the original information when trying to track down a potential problem:
This is because HTTP client is the one creating the activity and the service discovery system's information is after resolution of the underlying address information. We should enrich the outgoing telemetry context with tags that say what the service name was for a particular request.