open-feature / spec

OpenFeature specification
https://openfeature.dev
Apache License 2.0
639 stars 34 forks

Flag Evaluation Lifecycle: An Extensibility Proposal #25

Closed toddbaert closed 2 years ago

toddbaert commented 2 years ago

There have been a few questions and thoughts around integrations and extensibility, especially around telemetry, visibility, custom parsing, validation, and logging. OpenTelemetry support, for instance, has been a goal from the start. To maintain flexibility and keep our API dependencies as low as possible (preferably none), I propose a model for extensibility that defines a "life-cycle" for feature flag evaluation and exposes life-cycle "hooks" for the purposes of extensibility and customization.

"Hooks" might be classes/functions that implement an SDK-defined interface and perform arbitrary logic at well-defined points during flag evaluation, and which can be "registered" by the first-party feature flag API consumer. For example, an OpenTelemetry implementation might look something like this, in pseudocode:

class OpenTelemetryHook implements OpenFeatureHook {

  private Tracer tracer;

  ...

  before(Context context, FlagId flagId): Context {
    // optionally handle actions to do before the flag value is retrieved in this life-cycle hook, such as initializing traces/logging.
    // this might not be the API we want, we want to consider the risk associated with mutating the context
    context.span = tracer.startSpan(flagId);
    return context; 
  }

  after(Context context, FlagId flagId, FlagValue flagValue): FlagValue {
    // optionally handle actions to do after the flag value is retrieved in this life-cycle hook, such as parsing, validation, etc.
    // this could be a good place for custom parsing of the flag response, validation, etc
    context.span.end();
    return flagValue;
  }

  finally (Context context, FlagId flagId) {
    // optionally do some cleanup task regardless of success/failure of flag evaluation.
  }

  error (Context context, FlagId flagId, Error err) {
    // optionally handle errors with flag resolution in this life-cycle hook.
  }

}

Then, in the OpenFeature API, the first-party application author would register this hook (note: they did not WRITE the OpenTelemetryHook, they are just consuming it and adding it to the flag evaluation life-cycle):

openFeatureClient.registerHook([ new OpenTelemetryHook() ]);

When the flag is evaluated, the API internals might run the defined hooks within the context of that flag evaluation:

class OpenFeatureClient {

  private Provider provider; 

  getFlagValue(FlagId flagId, Context context) {

    // perform "before" life-cycle hooks, merging any context changes they make
    Context mergedContext = this.hooks.reduce((accumulated, hook) => merge(accumulated, hook.before(accumulated, flagId)), context);

    // call the provider abstraction to actually get the feature flag value
    var flagValue = this.provider.getFlagValue(flagId, mergedContext);

    // perform "after" life-cycle hooks, allowing each to transform the value
    flagValue = this.hooks.reduce((accumulated, hook) => hook.after(mergedContext, flagId, accumulated), flagValue);

    return flagValue;
  }
}

I think an optional configuration object, with optional before and after callbacks having the same method signatures, could also be added to feature flag evaluation calls, so that we could implement hooks on a per-flag basis as well:

getFlagValue('my-flag', context, { before: (Context context, FlagId flagId) => { /* do stuff */ } })

Considerations like thread-safety, mutability of objects, sync vs async, and error handling, as well as the overall API shape, should all be weighed carefully here.

justinabrahms commented 2 years ago

I think this makes lots of sense. The only thing that comes to mind that you might have missed here is that, in an implementation where we have a local cache of the rules that are evaluated, having some hook for when the rules update may be interesting. I would certainly want to log that it happened and maybe emit some relevant metrics.

Just thinking through examples of things I might want to use to do extensibility, I didn't come up with much aside from what's listed above.

toddbaert commented 2 years ago

The only thing that comes to mind that you might have missed here is that in a version of implementation where we have a local cache of the rules that are evaluated

Excellent point. One thing I'm not sure of is how much of that concern we want to expose to the 1st-party consumer, and how much should be abstracted by the provider. In some recent discussions, it's been made clear that having control and visibility over fetching locally cached values versus forcing a pull from the server might be useful. Do you have any ideas with regard to how we might want that to look to the application author?

having some hook for when the rules update may be interesting.

I agree this is important. I don't know if it's part of the flag evaluation life-cycle per se. I wonder if it would make sense to also have some events available on the client that callbacks could be registered to. This way, arbitrary logic could be applied when the fetch happens or when rules change, which might be totally independent of a flag evaluation (polling updates from the server, for example).
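To make the event idea concrete, client-level event registration might look something like the sketch below. This is purely illustrative: the event names (`rulesChanged`, `fetch`), the `on`/`emit` methods, and the class shape are assumptions for discussion, not proposed spec API.

```typescript
// Illustrative sketch only: event names and the on/emit API are assumptions,
// not proposed spec surface.
type ClientEvent = 'rulesChanged' | 'fetch';

class EventfulClient {
  private listeners = new Map<ClientEvent, Array<(payload?: unknown) => void>>();

  // Application authors register callbacks for events that are independent
  // of any single flag evaluation (e.g. a polling update changing the rules).
  on(event: ClientEvent, callback: (payload?: unknown) => void): void {
    const existing = this.listeners.get(event) ?? [];
    this.listeners.set(event, [...existing, callback]);
  }

  // Invoked by the SDK internals when the event occurs.
  emit(event: ClientEvent, payload?: unknown): void {
    for (const cb of this.listeners.get(event) ?? []) {
      cb(payload);
    }
  }
}
```

An application could then do something like `client.on('rulesChanged', () => metrics.increment('flag_rules_updated'))`, where `metrics` stands in for whatever metrics client the application already uses.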

patricioe commented 2 years ago

I like this.

One addition I would suggest is a hook for when the SDK is ready to operate. This usually means (depending on the provider) that the flag definitions are all loaded or parsed. I say loaded or parsed because different providers may operate differently.

Some of our customers like to log or report when the SDK is "ready" (that is what we called it at Split), or when loading times out (in which case they want to report that too).

InTheCloudDan commented 2 years ago

Excellent point. One thing I'm not sure of is how much of that concern we want to expose to the 1st party consumer, and how much should be abstracted by the provider. It seems like in some recent discussions, it's been made clear that having control and visibility over fetching locally cached values versus forcing a pull from the server might be useful. Do you have any ideas with regards how we might want that to look to the application author?

Control and visibility over fetching flags: in what context are we referring, the provider or the consumer?

I generally think flag evaluation, including the type of delivery, fits more in a provider configuration. Consumers want to be sure their flags are evaluating correctly, but the underlying "how" of that evaluation would be left to the providers.

moredip commented 2 years ago

Do we want to allow these hooks to modify the flag evaluation pipeline, or not? In other words, are they hooks which purely observe the flag evaluation without having any effect on it, or are they middleware that could potentially affect the evaluation of the flag as they are called?

IMHO it'd be a lot simpler to start with the pure observer hook semantics, at least for now, just because nailing down the semantics for a middleware approach would be pretty tricky. We could come back at a later stage and introduce a distinct middleware-style extension point, if it ends up being needed.
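The distinction could be sketched as two different interface shapes (all names here are hypothetical, not proposed spec API): an observer hook returns nothing and sees read-only inputs, while a middleware-style hook may return transformed values.

```typescript
// Hypothetical interfaces contrasting the two semantics; neither is spec API.
type EvalContext = Record<string, unknown>;

// Observer-style: can look, cannot touch. Void return types and a read-only
// context mean hooks cannot influence the evaluation.
interface ObserverHook {
  before(context: Readonly<EvalContext>, flagId: string): void;
  after(context: Readonly<EvalContext>, flagId: string, value: unknown): void;
}

// Middleware-style: each hook may replace the context and the value, which
// requires nailing down ordering and merge semantics.
interface MiddlewareHook {
  before(context: EvalContext, flagId: string): EvalContext;
  after(context: EvalContext, flagId: string, value: unknown): unknown;
}

// A trivial observer implementation that just records what it saw.
class RecordingHook implements ObserverHook {
  calls: string[] = [];
  before(_context: Readonly<EvalContext>, flagId: string): void {
    this.calls.push(`before:${flagId}`);
  }
  after(_context: Readonly<EvalContext>, flagId: string, _value: unknown): void {
    this.calls.push(`after:${flagId}`);
  }
}
```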

toddbaert commented 2 years ago

@moredip I think manipulating the context (altering properties before evaluation, as well as storing state therein, such as OTel spans) would be required for even basic use cases. Manipulating the response however, might not be required.

When you ask about "pure observer" hooks, do you consider alterations to the context to be a violation of that?

Also, in the event that a hook throws, in your model I assume that wouldn't impact flag evaluation. Is that right?

moredip commented 2 years ago

ah yeah, I guess you're right that you'd need to have somewhere to store that state. The Otel span is a great example.

I was starting to say that I still think we should prohibit modifying the direct inputs and outputs (i.e. the Context coming in, and the evaluated value coming out), but now that I think about it I agree with you that there'd likely be use cases where we want to use a before hook to inject additional context into a flagging decision. So now I'm re-considering my "pure observer" stance 😆

I don't think I have a well-formed opinion on what the semantics should be for a hook throwing an error. My gut feeling is that it shouldn't cause the evaluation to fail, but I'm not really sure, TBH.
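The "a throwing hook shouldn't fail evaluation" option could be sketched as below; `evaluateWithHooks` and its shape are illustrative only, not proposed spec API.

```typescript
// Illustrative only: errors thrown by hooks are caught and logged, so a
// misbehaving hook cannot break flag evaluation for the application.
type AfterHook = (flagId: string, value: unknown) => void;

function evaluateWithHooks(
  flagId: string,
  resolveFlag: (flagId: string) => unknown, // stands in for the provider call
  hooks: AfterHook[],
): unknown {
  const value = resolveFlag(flagId);
  for (const hook of hooks) {
    try {
      hook(flagId, value);
    } catch (err) {
      // Swallow (or log) the error; remaining hooks still run and the
      // evaluated value is still returned to the caller.
      console.error(`hook failed for flag "${flagId}":`, err);
    }
  }
  return value;
}
```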

s-sen commented 2 years ago

Should the error function be a part of the OpenTelemetry hook? It could be part of a separate diagnostics hook, as described here, since client developers typically should not have to be exposed to telemetry details (which should be handled by the provider) but need the diagnostics data for troubleshooting.

toddbaert commented 2 years ago

Should the error function be a part of OpenTelemetry hook?

OTel has semantics around errors. We have a basic implementation here, which adds standard OTel error information in the error() handler: https://github.com/open-feature/sdk-research/blob/main/packages/openfeature-extra/src/lib/hooks/open-telemetry-hook.ts#L52

Since this deals specifically with OTel semantics, I think the OTel hook is the appropriate place.

since client developers typically should not have be exposed to the telemetry details

This doesn't expose developers to telemetry details any more than the other telemetry functions do. That said, as you say, a separate debugging/logging hook is a good idea, and that seems different from the OTel error data, which is about getting those errors into the telemetry system.

toddbaert commented 2 years ago

I'm going to close this since it's captured in the current spec. Feel free to open PRs or further discussions if needed.