dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.09k stars 4.7k forks source link

`.NET 9.0` Native Trace Instrumentation Support for HttpClient as per OTel specification #93019

Open vishweshbankwar opened 1 year ago

vishweshbankwar commented 1 year ago

Current State

Currently, users depend on the instrumentation library provided by the OpenTelemetry .NET repository to enable the enrichment of activities created during HttpClient requests, following the OpenTelemetry specification. This instrumentation library enriches activities by subscribing to DiagnosticSource events generated by HttpClient. However, this approach has certain limitation and performance overhead of Diagnostic Source listeners and reflection.

Feature Request

We are requesting the addition of native instrumentation support for HttpClient in .NET 9.0. This enhancement will complement the out-of-the-box OTel metrics instrumentation introduced in .NET 8.0.

Design Considerations

When designing this feature, we need to consider the following aspects:

  1. Support for Enrichment and Filtering of Spans: Ensure that the native instrumentation supports the enrichment and filtering of HttpClient spans.

  2. Custom Propagators: Address the differences in API between propagators in OpenTelemetry and the DistributedContextPropagator. Determine how custom propagators should be handled in this context.

ghost commented 1 year ago

Tagging subscribers to this area: @dotnet/ncl See info in area-owners.md if you want to be subscribed.

Issue Details
#### Current State Currently, users depend on the instrumentation library provided by the [OpenTelemetry .NET repository](https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Instrumentation.Http) to enable the enrichment of activities created during HttpClient requests, following the OpenTelemetry [specification](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md#http-client). This instrumentation library enriches activities by subscribing to DiagnosticSource events generated by HttpClient. However, this approach has certain [limitation](https://github.com/open-telemetry/opentelemetry-dotnet-contrib/issues/2012). #### Feature Request We are requesting the addition of native instrumentation support for HttpClient in `.NET 9.0`. This enhancement will complement the out-of-the-box OTel metrics instrumentation introduced in `.NET 8.0`. #### Design Considerations When designing this feature, we need to consider the following aspects: 1. **Support for Enrichment and Filtering of Spans:** Ensure that the native instrumentation supports the [enrichment](https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Instrumentation.Http#enrich-httpclient-api) and [filtering](https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Instrumentation.Http#filter-httpclient-api) of HttpClient spans. 2. **Custom Propagators:** Address the differences in API between [propagators](https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry.Api/Context/Propagation/TextMapPropagator.cs) in OpenTelemetry and the `DistributedContextPropagator`. Determine how custom propagators should be handled in this context.
Author: vishweshbankwar
Assignees: -
Labels: `area-System.Net.Http`
Milestone: -
vishweshbankwar commented 1 year ago

@tarekgh @cijothomas FYI

cijothomas commented 1 year ago

However, this approach has certain https://github.com/open-telemetry/opentelemetry-dotnet-contrib/issues/2012.

And the performance overhead of ds listeners/reflection etc.

ghost commented 12 months ago

Tagging subscribers to this area: @tarekgh, @tommcdon, @pjanotti See info in area-owners.md if you want to be subscribed.

Issue Details
#### Current State Currently, users depend on the instrumentation library provided by the [OpenTelemetry .NET repository](https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Instrumentation.Http) to enable the enrichment of activities created during HttpClient requests, following the OpenTelemetry [specification](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md#http-client). This instrumentation library enriches activities by subscribing to DiagnosticSource events generated by HttpClient. However, this approach has certain [limitation](https://github.com/open-telemetry/opentelemetry-dotnet-contrib/issues/2012). #### Feature Request We are requesting the addition of native instrumentation support for HttpClient in `.NET 9.0`. This enhancement will complement the out-of-the-box OTel metrics instrumentation introduced in `.NET 8.0`. #### Design Considerations When designing this feature, we need to consider the following aspects: 1. **Support for Enrichment and Filtering of Spans:** Ensure that the native instrumentation supports the [enrichment](https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Instrumentation.Http#enrich-httpclient-api) and [filtering](https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Instrumentation.Http#filter-httpclient-api) of HttpClient spans. 2. **Custom Propagators:** Address the differences in API between [propagators](https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry.Api/Context/Propagation/TextMapPropagator.cs) in OpenTelemetry and the `DistributedContextPropagator`. Determine how custom propagators should be handled in this context.
Author: vishweshbankwar
Assignees: -
Labels: `area-System.Diagnostics.Tracing`, `area-System.Net.Http`, `untriaged`
Milestone: -
antonfirsov commented 12 months ago

Moved this to area-System.Diagnostics.Tracing since it seems to touch higher level distributed tracing concepts.

/cc @noahfalk

ghost commented 8 months ago

Tagging subscribers to this area: @dotnet/ncl See info in area-owners.md if you want to be subscribed.

Issue Details
#### Current State Currently, users depend on the instrumentation library provided by the [OpenTelemetry .NET repository](https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Instrumentation.Http) to enable the enrichment of activities created during HttpClient requests, following the OpenTelemetry [specification](https://github.com/open-telemetry/semantic-conventions/blob/main/docs/http/http-spans.md#http-client). This instrumentation library enriches activities by subscribing to DiagnosticSource events generated by HttpClient. However, this approach has certain [limitation](https://github.com/open-telemetry/opentelemetry-dotnet-contrib/issues/2012) and performance overhead of Diagnostic Source listeners and reflection. #### Feature Request We are requesting the addition of native instrumentation support for HttpClient in `.NET 9.0`. This enhancement will complement the out-of-the-box OTel metrics instrumentation introduced in `.NET 8.0`. #### Design Considerations When designing this feature, we need to consider the following aspects: 1. **Support for Enrichment and Filtering of Spans:** Ensure that the native instrumentation supports the [enrichment](https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Instrumentation.Http#enrich-httpclient-api) and [filtering](https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Instrumentation.Http#filter-httpclient-api) of HttpClient spans. 2. **Custom Propagators:** Address the differences in API between [propagators](https://github.com/open-telemetry/opentelemetry-dotnet/blob/main/src/OpenTelemetry.Api/Context/Propagation/TextMapPropagator.cs) in OpenTelemetry and the `DistributedContextPropagator`. Determine how custom propagators should be handled in this context.
Author: vishweshbankwar
Assignees: -
Labels: `enhancement`, `area-System.Net.Http`
Milestone: 9.0.0
tarekgh commented 8 months ago

Talked offline with @noahfalk and changed it back to http as the work will be in http.

antonfirsov commented 5 months ago

@vishweshbankwar a question regarding the timing of tag injection and enrichment:

opentelemetry-dotnet's HttpHandlerDiagnosticListener sets some of the tags at System.Net.Http.HttpRequestOut.Start, wile others are being set at System.Net.Http.HttpRequestOut.Stop. Similarly, enrichment is possible both at the start (EnrichWithHttpRequestMessage) and at the end (EnrichWithHttpResponseMessage & EnrichWithException) of the request. If in HttpClients instrumentation we could emit everything at the end instead, that would allow us to design a simpler, more compact tracing enrichment API which would be more consistent with our current Metrics enrichment offering. I wonder if there is a practical necessity for knowing the tag values at System.Net.Http.HttpRequestOut.Start? Are you aware of use-cases which actually depend on this?

vishweshbankwar commented 5 months ago

@vishweshbankwar a question regarding the timing of tag injection and enrichment:

opentelemetry-dotnet's HttpHandlerDiagnosticListener sets some of the tags at System.Net.Http.HttpRequestOut.Start, wile others are being set at System.Net.Http.HttpRequestOut.Stop. Similarly, enrichment is possible both at the start (EnrichWithHttpRequestMessage) and at the end (EnrichWithHttpResponseMessage & EnrichWithException) of the request. If in HttpClients instrumentation we could emit everything at the end instead, that would allow us to design a simpler, more compact tracing enrichment API which would be more consistent with our current Metrics enrichment offering. I wonder if there is a practical necessity for knowing the tag values at System.Net.Http.HttpRequestOut.Start? Are you aware of use-cases which actually depend on this?

In the current implementation which is based on the diagnostic source callbacks, there was no way to allow enrich via single API (due to different callbacks with different information). If that constraint is not valid for instrumentation within the library, it makes sense to have a simpler API.

antonfirsov commented 3 months ago

Support for standard tags has been implemented in #104251. We still need to add enrichment, and investigate the API differences for custom propagators. Hopefully this will happen in .NET 10.

julealgon commented 3 months ago

I wonder if there is a practical necessity for knowing the tag values at System.Net.Http.HttpRequestOut.Start? Are you aware of use-cases which actually depend on this?

@antonfirsov doesn't sampling take place at the beginning and thus would be able to take those initial tags into account? Or does it happen even before that and thus this would make no difference?

cijothomas commented 3 months ago

I wonder if there is a practical necessity for knowing the tag values at System.Net.Http.HttpRequestOut.Start? Are you aware of use-cases which actually depend on this?

@antonfirsov doesn't sampling take place at the beginning and thus would be able to take those initial tags into account? Or does it happen even before that and thus this would make no difference?

The initial tags are available for Samplers, so yes they are useful for sampler. I believe the question was - "are there practical examples of samplers that would make use of these additional tags?".

Given this is a Client span, which usually has a parent span, the sampling decision typically follows that of the parent's decision. So compared to Asp.Net Core (server spans), initial tags have less use case for HTTP Client.

julealgon commented 3 months ago

@cijothomas I'd imagine this could be quite useful for a custom sampler implementation, where the author may add custom tags via enrichment, and then leverage them in their own sampling logic.

Some stuff that immediately comes to my mind here is in a multi-tenant system (like ours) where there we could add the tenant_id attribute via enrichment, and then have a custom sampler that conditionally samples based on that value. Imagine for example we want to investigate a specific problem in one tenant and we turn on sampling just for that tenant: this would be achievable this way.

If all enrichment happened at the end only, this would be impossible to achieve.

Somewhat related:

cijothomas commented 3 months ago

@julealgon You are right. I only meant to highlight that, this is less valuable than getting the initial tags for asp.net core.

(for the tenant_id example - won't you want to sample the asp.net server span as well? Changes in httpclient won't help there, right?)

julealgon commented 3 months ago

@cijothomas

@julealgon You are right. I only meant to highlight that, this is less valuable than getting the initial tags for asp.net core.

Ah that makes sense. Agreed.

(for the tenant_id example - won't you want to sample the asp.net server span as well? Changes in httpclient won't help there, right?)

Not all flows start from AspNetCore in our case, but you are mostly right: usually the httpclient spans will have another parent anyways that we could enrich and check for the tenant.

As another example that would better fit the httpclient spans, we have another scenario in our system where some of our vendor-related HTTP calls are based on specific OEMs, which we have an enum for. We could enrich such httpclient spans with the specific OEM ID and then filter based on that later on a sampler. This is another real scenario that we have today and I do intend to allow something like this because our current way of "tracing" those OEM specific calls is completely custom. In this case, even though the parent span would have a specific tenant, for a single tenant, there might be many different such OEM integrations, which we might want to track/debug separately (say, have different sampling percentages based on the OEM to diagnose a specific problem without incurring extra overhead on the observability tool quotas/generating extra expenses for us.

antonfirsov commented 3 months ago

The initial tags are available for Samplers

There was a lot of back-and-forth whether we should do this or not, see discussion under https://github.com/dotnet/runtime/pull/104251#discussion_r1663610456.

In the end, we decided not to do it, for perf reasons. Even if we add support, IMO it should be opt-in. Please open a separate issue if this is important for you, with detailed explanation of the use-case so we can discuss and triage.

julealgon commented 3 months ago

Thanks for the update @antonfirsov . I would rather the team opted to just provide the 2 mechanisms out of the gate than limit it like that. For instance, give the consumer the option to enrich at the beginning or at the end, and make it clear that there is a penalty when doing it in the beginning, at least for now, until the allocation issues can be resolved/mitigated.

Hopefully this will be reconsidered in the future. I'll open a request if it comes to a point where we need one of the samplers I mentioned above since we don't have them quite yet and we may be able to get away with a processor to discard the spans like the comment mentioned in the comment thread you linked.

antonfirsov commented 3 months ago

to just provide the 2 mechanisms out of the gate than limit it like that

That needs an API, which complicates things given that the bar is very high in BCL when it comes to defining API-s. Rule of thumb: if we think that there are maybe some users who will use it, we're not gonna do it. If there are users who definitely need it, we will maybe do it -- this depends on the number users knocking on our gates, which can be quantified eg. by the number of upvotes on the issue that asks for a particular knob.