Closed CodeBlanch closed 3 months ago
Tagging subscribers to this area: @tarekgh, @tommcdon, @pjanotti See info in area-owners.md if you want to be subscribed.
@CodeBlanch isn't there a way to achieve this without basically reimplementing abstractions that the OpenTelemetry packages have already implemented?
Tagging subscribers to this area: @dotnet/area-system-diagnostics-activity See info in area-owners.md if you want to be subscribed.
@julealgon Anything is possible! This issue is really to collect ideas. But my gut is no. For this out-of-proc monitoring scenario there is no OTel available. Some external utility (think dotnet-monitor
) is run to extract telemetry from a running process which doesn't have the OTel SDK hooked up. We also don't want to use the profiler to inject modify/code so this works for AoT-compiled apps where that isn't possible.
@julealgon Anything is possible! This issue is really to collect ideas. But my gut is no. For this out-of-proc monitoring scenario there is no OTel available. Some external utility (think
dotnet-monitor
) is run to extract telemetry from a running process which doesn't have the OTel SDK hooked up. We also don't want to use the profiler to inject modify/code so this works for AoT-compiled apps where that isn't possible.
What if we inverted the problem then?
If there are native .NET scenarios that require sampling, maybe the core System.Diagnostics
libraries should contain the sampling abstractions (like they contain Activity
, Meter
, and other "OTEL-compatible" diagnostics classes) and the OpenTelemetry SDK could just leverage those instead of implementing them from scratch.
Otherwise, this will end up in a ton of potential duplication between .NET and OTEL.
Sure, the easy way out would be to just duplicate the concepts, but that feels like a bad solution long-term to me.
@julealgon Anything is possible! This issue is really to collect ideas. But my gut is no. For this out-of-proc monitoring scenario there is no OTel available. Some external utility (think
dotnet-monitor
) is run to extract telemetry from a running process which doesn't have the OTel SDK hooked up. We also don't want to use the profiler to inject modify/code so this works for AoT-compiled apps where that isn't possible.What if we inverted the problem then?
If there are native .NET scenarios that require sampling, maybe the core
System.Diagnostics
libraries should contain the sampling abstractions (like they containActivity
,Meter
, and other "OTEL-compatible" diagnostics classes) and the OpenTelemetry SDK could just leverage those instead of implementing them from scratch.Otherwise, this will end up in a ton of potential duplication between .NET and OTEL.
Sure, the easy way out would be to just duplicate the concepts, but that feels like a bad solution long-term to me.
This is essentially what we are talking about:
Whether to deprecate the sampling options in OTEL or have that map through to the runtime API is up for debate - what this probably maps to in the OTel Specs is the ParentBasedSampler with a fixed rate sampler for Root spans without a parent.
- Add an API or possibly use the existing hook in ASP.NET core when a request comes in and picking the traceid/Recorded properties, to be able to specify a sampling algorithm.
Just keep in mind this should work just as "natively" outside of ASP.NET Core, like cmdline apps, workers, etc.
It's all based around https://github.com/dotnet/runtime/blob/5535e31a712343a63f5d7d796cd874e563e5ac14/src/libraries/System.Diagnostics.DiagnosticSource/src/System/Diagnostics/ActivityListener.cs#L48C58-L48C62, so is not ASP.NET specific, but that's one of the main consumers.
dotnet/dotnet-api-docs#9978
We have DiagnosticSourceEventSource which enables listening to
Activity
\ traces out of process. The current design when there is a listener is to create allActivity
\ spans and write everything to theEventSource
. We feel for out of process monitoring of traces this will create a lot of overhead in the process being monitored.The request here is to allow the listener to set some kind of sampling algorithm.
Here is the OTel spec for samplers: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/trace/sdk.md#built-in-samplers
What we have today is essentially
AlwaysOn
.One suggestion would be to allow the listener to set a ratio. If ratio is supplied we could sample according to these rules:
Always creating the root span is important for log correlation but we only need to create children if the root is sampled. This is how the OTel .NET SDK works today.
/cc @noahfalk @tarekgh @samsp-msft @rajkumar-rangaraj @cijothomas