[Proposal] Semantic Conventions for Flag Evaluation Context

federicobond commented 1 year ago

Category of Proposal

Specification

Describe your proposal

When the flag evaluation context was specified, it was decided to leave out any semantic conventions for the context attributes. Recently there have been discussions about introducing a flag definition language and an evaluation wire protocol. Both of these initiatives would benefit in different ways from having such conventions, so it's probably time to revisit this topic.

Semantic conventions have enjoyed much success in the OpenTelemetry project. By specifying a standard meaning and a format for attributes sent through the OTel protocol, they allow data coming from different applications to be queried in similar ways, instead of having to learn the conventions of every project. They also make it easier to migrate from one observability vendor to another.

Why is this important for the evaluation wire protocol?

Because most first-party or third-party tools that implement OF will have to send some context to the feature flag backend, most likely some basic user info to perform segmentation. If different backends interpret context keys differently, then it will be hard to achieve interoperability with the wire protocol, since the software will have to be modified to work with the new vendor if it uses different semantics for the context keys.

Why is this important for the flag definition language?

Because without semantic conventions, our flag definitions (if they reference the context) are going to be coupled to our particular applications and vendors. That makes migration harder and locks us in on both sides: through the way our applications name the keys, and through the way the vendor on the other side interprets them.

My proposal is to work out a minimal set of semantic conventions based on specific, concrete use cases. None of the keys specified in the semantic conventions will be mandatory, but if they are included, the conventions will suggest a specific format and semantics to enable interoperability.
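A minimal sketch of what "optional, but with a suggested format when present" could look like in practice. The key names (`user_email`, `country_code`) and their formats are hypothetical placeholders for illustration; they are not taken from the OpenFeature specification or from any vendor.

```python
import re
from typing import Any, Callable, Dict, List

# Hypothetical convention table: key name -> format check.
# These names and formats are assumptions made for this example only.
CONVENTIONS: Dict[str, Callable[[Any], bool]] = {
    "user_email": lambda v: isinstance(v, str)
    and re.fullmatch(r"[^@\s]+@[^@\s]+", v) is not None,
    "country_code": lambda v: isinstance(v, str)
    and re.fullmatch(r"[A-Z]{2}", v) is not None,
}


def convention_violations(context: Dict[str, Any]) -> List[str]:
    """Return conventional keys that are present but malformed.

    Missing keys are never reported (nothing is mandatory), and keys
    outside the convention table are ignored (applications stay free
    to send their own attributes).
    """
    return sorted(
        key
        for key, is_valid in CONVENTIONS.items()
        if key in context and not is_valid(context[key])
    )


# "country_code" is present but lowercase, so it is reported;
# "plan" is not a conventional key, so it is left alone.
violations = convention_violations({"country_code": "ar", "plan": "pro"})
```

An empty context produces no violations, which reflects the "none of the keys are mandatory" part of the proposal.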

beeme1mr commented 1 year ago

Perhaps we could start by defining the list of specific use cases that could benefit from defined semantics for evaluation context. It would be interesting to see if existing semantic conventions (OTel, Elastic Common Schema) would overlap enough to be useful for this use case. If not, what's missing?

A minor challenge I see if we were to try and leverage OTel's semantic convention is that they use kebab-casing, which isn't ideal when used as a key in an object literal. However, if we went that route, it may be possible to use OTel's Resource Detector to automatically set evaluation context.

I do like the idea of defining recommendations for evaluation context. They could help people better understand how powerful feature flags can become when provided with proper context and lead to more consistently named evaluation context properties.

federicobond commented 1 year ago

Perhaps we could start by defining the list of specific use cases that could benefit from defined semantics for evaluation context.

Sure! That would be the next step. I only have a limited perspective of the use cases though, so input from other users and vendors is super appreciated.

It would be interesting to see if existing semantic conventions (OTel, Elastic Common Schema) would overlap enough to be useful for this use case. If not, what's missing?

I am in favor of using those as inspiration but the feature flag use cases are different enough that I think it would be best to own the semantics of the few attributes that we standardize. End users can always adopt other standard conventions on top of them that they find useful.

A minor challenge I see if we were to try and leverage OTel's semantic convention is that they use kebab-casing, which isn't ideal when used as a key in an object literal.

OTel uses dot-separated snake casing for its keys, which in most languages must be written as a string anyway, so I don't think that would be a problem. See here for an example.

federicobond commented 1 year ago

Another point is that we should discourage sending the kitchen sink as an evaluation context. Required attributes will vary depending on the needs of each project.

Evaluation context is sent synchronously when flags are evaluated server-side, while telemetry data is usually sent async and can be sampled/processed later, so avoiding sending unneeded data is more important in the OpenFeature case.

toddbaert commented 1 year ago

Up to now, we've only had to focus on this sort of interop between components in the same process. In such cases, we've used different approaches to achieve interoperability without any sort of specification overhead. For example, in the OTel hook, we've allowed for a lambda to extract/map arbitrary flag metadata and add it to the span.
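The lambda-based approach can be sketched roughly as follows. This is not the actual OTel hook: the span is modeled as a plain dictionary, and `after_evaluation`, the metadata fields, and the `flag.owner` attribute name are all hypothetical.

```python
from typing import Any, Callable, Dict

# A mapper takes arbitrary flag metadata and returns span attributes.
Mapper = Callable[[Dict[str, Any]], Dict[str, Any]]


def after_evaluation(
    span_attributes: Dict[str, Any],
    flag_metadata: Dict[str, Any],
    mapper: Mapper,
) -> None:
    """Hypothetical hook stage: copy whatever the mapper selects onto the span."""
    span_attributes.update(mapper(flag_metadata))


# The span is a plain dict here; a real hook would write to a tracing span.
span: Dict[str, Any] = {}
metadata = {"owner": "team-checkout", "internal_id": 42}

# The caller decides which metadata fields matter and how they are named.
after_evaluation(span, metadata, lambda m: {"flag.owner": m["owner"]})
```

Because the caller supplies the mapping, no shared naming convention is needed as long as everything runs in the same process.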

I think, however, that if we want to implement a flag evaluation wire protocol, we will need these kinds of conventions. I also think the conventions may need to include more than only evaluation context (details about the subject of the evaluation). We may benefit from the inclusion of data about the flag and the management system as well (again, consider the OTel hook linked above). There could be value in having standard semantics around some of this data as well (e.g. flag last-modified timestamp, associated issue tracking id, flag-owner, management-url).

konradjniemiec commented 1 year ago

Another point is that we should discourage sending the kitchen sink as an evaluation context. Required attributes will vary depending on the needs of each project.

Evaluation context is sent synchronously when flags are evaluated server-side, while telemetry data is usually sent async and can be sampled/processed later, so avoiding sending unneeded data is more important in the OpenFeature case.

When you say server-side, do you mean remote evaluation or local evaluation? I would argue that to maximize the value of a feature flagging system, you should maximize the dimensions you can target without a code change. I think most providers handle this by doing local evaluation on the server and, on the frontend, front-loading the context on an initial fetch so that the context is not sent every time.

federicobond commented 1 year ago

I meant remote evaluation, in which case maximizing dimensions to target can sometimes add quite a bit of network overhead. Some teams may consider that overhead worth it, but what I would like to discourage is teams including unnecessary attributes just because they are specified in the conventions.

The decision to segment users for a feature flag is usually a decision about the future state of the system, so you can plan in advance to include the appropriate context that will allow for that segmentation. In OTel you don't have that luxury: when investigating an issue in production you generally don't know in advance what attributes will be useful in pinpointing the cause, so you want to include as much as possible in the hopes that one of them may offer a clue.
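The trade-off for remote evaluation can be sketched as trimming the context down to the attributes that targeting rules are known to need before it goes over the wire. The attribute names and the `trim_context` helper are hypothetical, and JSON is only an assumed encoding for comparing payload sizes.

```python
import json
from typing import Any, Dict, Set


def trim_context(context: Dict[str, Any], needed_keys: Set[str]) -> Dict[str, Any]:
    """Keep only the attributes that are planned for use in targeting."""
    return {key: value for key, value in context.items() if key in needed_keys}


# Everything the application happens to know about the request.
full_context = {
    "user_id": "u-1",
    "plan": "pro",
    "browser": "firefox",
    "screen_width": 1440,
}

# Attributes the team decided in advance to segment on.
trimmed = trim_context(full_context, {"user_id", "plan"})

# Size of each payload if it were sent as JSON on every evaluation.
full_size = len(json.dumps(full_context))
trimmed_size = len(json.dumps(trimmed))
```

Teams that prefer to target on more dimensions can simply widen the set of needed keys and accept the larger payload.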

Kavindu-Dodan commented 1 year ago

While OTel is a good example of using semantic conventions, another reference that came to my mind is the OpenID Connect standard claims [1]. These claims are standardized and exposed through the ID Token and the UserInfo endpoint. As in this proposal, the claims are meant to inform later decisions.

From the OF spec point of view, I don't see any blocker to introducing an agreed, well-known set of standard attributes useful for flag evaluation. This could be added initially as an appendix entry.

[1] - https://openid.net/specs/openid-connect-core-1_0.html#StandardClaims

jrydberg commented 1 year ago

👍 on having a convention

Wrt kebab case and dot separation: the eval context supports nested structures, and dots are often used to traverse those, so I would shy away from using dots in key names.
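A small sketch of the ambiguity, assuming a hypothetical `resolve` helper that walks a nested context by splitting a path on dots (this is not an OpenFeature API):

```python
from typing import Any, Dict, Optional


def resolve(context: Dict[str, Any], path: str) -> Optional[Any]:
    """Walk a nested context, treating each dot in the path as one level."""
    current: Any = context
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# Nested structure: "user.id" is reached by traversal.
nested = {"user": {"id": "u-1"}}

# Flat structure: the key itself contains a dot.
flat = {"user.id": "u-2"}

from_nested = resolve(nested, "user.id")  # found by walking user -> id
from_flat = resolve(flat, "user.id")      # not found: there is no "user" level
```

The same path string means two different things depending on the shape of the context, so a traversal-based lookup cannot reach a literal dotted key without some extra escaping rule.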

beeme1mr commented 1 year ago

@federicobond would you be willing to open an OFEP for this? There seems to be enough interest in this to move to the next stage.

