dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.04k stars 4.68k forks source link

Support W3C Baggage propagation without Activity #103174

Open lmolkova opened 3 months ago

lmolkova commented 3 months ago

Related to https://github.com/dotnet/runtime/issues/45496

W3C Baggage defines a propagation format for arbitrary application-specific properties. Despite being driven by the observability community, it's not related to W3C Trace-Context and could work without distributed tracing.

In .NET however, baggage API is part of the System.Diagnostics.Activity which allows to get/set baggage. It implies that users have to have distributed tracing enabled in order to propagate baggage.

To support W3C Baggage fully, .NET would need to define an API similar to OTel Baggage:

It could be done in a backward compatible way such that Activity.Baggage would work over NewBaggage, but it would be great to consider an alternative approach of sunsetting Activity.Baggage or at least methods allowing to add/modify it.

dotnet-policy-service[bot] commented 3 months ago

Tagging subscribers to this area: @dotnet/area-system-diagnostics-activity See info in area-owners.md if you want to be subscribed.

julealgon commented 3 months ago

In .NET however, baggage API is part of the System.Diagnostics.Activity which allows to get/set baggage. It implies that users have to have distributed tracing enabled in order to propagate baggage.

Where is that implication coming from? Isn't the entire System.Diagnostics set of types created in a way that is 100% agnostic? Activity itself has been in the framework for years and years before OTEL was even introduced.

lmolkova commented 3 months ago

In order to add/get/propagate a baggage, someone needs to create an Activity first. Activity is a distributed tracing primitive when used by OTel or without OTel.

There is nothing otel-specific in this issue description, so I'm not sure I understand your concern @julealgon.

julealgon commented 3 months ago

@lmolkova could you elaborate a bit on why this is needed and to which use cases it would apply?

lmolkova commented 3 months ago

service-a:

NewBaggage.Current.Add("foo", "bar");
DistributeContextPropagator.Current.InjectBaggage(NewBaggage.Current, httpRequest, ...);

service-b

var baggage = DistributeContextPropagator.Current.ExtractBaggage(NewBaggage.Current, httpRequest, ...);
NewBaggage.Current = new NewBaggage(baggage)

var foo = NewBaggage.Current.Get("foo");
...
// do anything with it.

Someone can use baggage to augment their telemetry (e.g. metrics or logs), to route requests in their application, or do absolutely anything with it.

julealgon commented 3 months ago

Thanks for the example @lmolkova , that helps.

All that static .Current provider pattern mess looks awful to me though... I wish we could achieve the same with properly injected services instead.

tarekgh commented 3 months ago

CC @noahfalk @samsp-msft

cijothomas commented 3 months ago

All that static .Current provider pattern mess looks awful to me though... I wish we could achieve the same with properly injected services instead.

Not sure what is the alternative here? .Current is static, but backed by asynclocal....

julealgon commented 3 months ago

All that static .Current provider pattern mess looks awful to me though... I wish we could achieve the same with properly injected services instead.

Not sure what is the alternative here? .Current is static, but backed by asynclocal....

To be fair I don't know either, was just a bit disappointed by the fact it's just a random static thing decoupled from everything else.

Wouldn't it make sense to somehow couple this with the Assembly or the Process, instead of them being "orphaned" classes/properties? What about the IHostEnvironment abstraction from Microsoft.Extension.Hosting, wouldn't that be a good fit?

My sentiment when seeing all this static provider-based stuff is that I'm working on a lower-level language.

cijothomas commented 3 months ago

static provider-based stuff

I am not sure if I understood the question well. What is "provider" here? Sure, OpenTelemetry has notion of various providers, but DiagnosticSource has no such concept of "providers".

julealgon commented 3 months ago

static provider-based stuff

I am not sure if I understood the question well. What is "provider" here? Sure, OpenTelemetry has notion of various providers, but DiagnosticSource has no such concept of "providers".

What I meant by "provider" is the pattern that allows you to assign and retrieve an implementation in a static fashion, which is usually modelled with this .Current method.

I was basically referring to this old system:

Provider pattern is one of the most interesting features that Microsoft introduced in .NET 2. A lot of features including membership providers, roles providers, profile providers, health monitor event providers, site map providers, and more had the same design concept. These features are not changeable but extendable, and that is the beauty of the provider pattern.

When DI became "a thing", this "style" was largely abandoned, but there are still a few usages of the pattern in modern code.

I'd personally rather see something around the hosting abstractions than this.

cijothomas commented 3 months ago

Thanks for clarifying the provider part!

Activity.Current, Baggage.Current etc. are mostly for implementing https://github.com/open-telemetry/opentelemetry-specification/tree/main/specification/context#get-current-context part. I don't know if there is a better way to implement it apart from current approach of static, backed by async-local...

lmolkova commented 3 months ago

The baggage and propagator need to be available to HttpClient and libraries that work without dependency injection. So some level of static singletons is inevitable for the time being. It would be nice if everything allowed to provide things explicitly via API, but in order for default experience to work without a massive code change, we still need a global configurable default (even if it's a global provider of something).

KevinCathcart commented 3 months ago

It honestly isn't clear what type in .NET is supposed to map to OTel Context. That class is an immutable mapping of key value pairs. It must be possible to capture and restore it (trivial for explicit contexts, requires an API for implicit contexts), and it must provide APIs to retrieve a value from it by a key, and to create a new context with an additional key value pair.

It almost feels like Like Activity is trying to fill this role (in addition to being the representation of a Span), but as I will discuss, that has problems.

If we actually look for a class that fits, the closest we get is ExecutionContext. This can be captured and restored, and is immutable, at least with respect to AsyncLocal values (it is not fully immutable on .NET Framework, but is on modern .NET). Unfortunately, it lacks a public API to directly read a value from it, or create a new one with an additional value. Indeed, the only way to do so would be to capture the current, restore the one to modify, set an AsyncLocal Static field, capture the updated context, and restore the previous one.

The OTel specification requires that Baggage be an immutable collection stored within the context, with an add method that returns a new collection (users would then use appropriate APIs to add it to a context). Obviously helper APIs to create a new context with an additional baggage item is fine too, and/or one that updates the implicit context. It also needs a remove API (.NET currently supports remove via a SetBaggage API, where setting a value of null is same as removing).

However, .NET is non-conforming here. It has no reified baggage type (which is not strictly a violation, APIs directly on the context are also legal). It stores baggage in the activity rather than the context, and the current set baggage method updates the current Activity in place. Additionally trying to use SetBaggage with null to remove baggage only works with respect to baggage on the current activity, not for baggage inherited from a parent activity. .NET also currently fails to provide the required API for removing all baggage from a context (or to be more accurate: creating a new context with all baggage info removed, possibly setting that as the implicit context), which the OTEL spec considers critical to avoid sending potentially sensitive information to an untrusted API.

Besides the obvious issue of being unable to remove baggage from parent activities, and being unable to clear all baggage, the current .NET design means that it is not possible to capture the Context (be it ExecutionContext or Activity), prior to adding baggage, adding the baggage for some operations, and then restoring to a context without that baggage.

dxynnez commented 2 months ago

I want to add that the Otel Baggage is no where close to an AsyncLocal - see https://github.com/open-telemetry/opentelemetry-dotnet/issues/3257