open-feature / go-sdk-contrib

Community maintained OpenFeature Providers and Hooks for Go
https://openfeature.dev
Apache License 2.0

Concurrent flag evaluation bursts the back-end #588

Open natenho opened 6 days ago

natenho commented 6 days ago

Hello, consider the following program:

package main

import (
    "context"
    "fmt"
    "sync"

    gofeatureflag "github.com/open-feature/go-sdk-contrib/providers/go-feature-flag/pkg"
    "github.com/open-feature/go-sdk/openfeature"
)

var client *openfeature.Client

func main() {
    options := gofeatureflag.ProviderOptions{
        Endpoint: "http://localhost:1031",
    }

    provider, _ := gofeatureflag.NewProvider(options)

    openfeature.SetProvider(provider)

    // Create a new client
    client = openfeature.NewClient("app")

    wg := &sync.WaitGroup{}
    wg.Add(10)

    // Evaluate the feature flag from multiple goroutines
    for i := 0; i < 10; i++ {
        go func() {
            defer wg.Done() // without this, wg.Wait() would block forever
            deepInTheCode()
        }()
    }

    wg.Wait()
}

func deepInTheCode() {
    // Evaluate the feature flag
    v2Enabled, _ := client.BooleanValue(
        context.Background(), "v2_enabled", false, openfeature.NewEvaluationContext("foo", nil),
    )
    // Use the returned flag value
    if v2Enabled {
        fmt.Println("v2 is enabled")
    }
}

We know that the GO Feature Flag provider caches the flag result.

However, during the first request, if there are concurrent jobs getting the flag value, we get the following effect:

You can see multiple unnecessary concurrent requests coming from the application. Now imagine having thousands of goroutine workers. [screenshot of concurrent requests hitting the back-end]

I propose leveraging https://pkg.go.dev/golang.org/x/sync/singleflight here.

You may argue that if the context is the same, the flag should be retrieved in advance. But think of applications that would need a big refactoring to make that happen, or of parts of an application that are not close to each other yet need the same flag at the same moment.

Using a single flight group would be an elegant way to prevent this from happening anywhere in the application, by using a unique key based on the evaluation context, just like the cache does.

The question is in which layer we should apply this logic:

1) Per-provider approach. The provider is responsible for cache warm-up and must prevent bursts like this itself. In our example, we would add the code to the go-feature-flag provider, reusing the same cache key strategy for the flight key.

2) OFREP provider approach. The single flight would always happen in the OFREP provider.

3) Client approach (provider-independent). We could add the single flight directly to the client, thus avoiding bursts on any kind of back-end provider.

For any of these approaches we might want to move the cache key strategy into a func on the FlattenedContext struct and use it both as the cache key and as the single flight key.

thomaspoignant commented 6 days ago

If I understand correctly, you are talking about concurrent evaluations happening at the same time with the same context and the same flag. And what you want to avoid is making all the calls to the remote system (or even local evaluations, for other types of providers) in parallel, instead using singleflight to do it once for everyone?

Based on all your options, I am not sure where it makes the most sense to do it. Since the GO Feature Flag provider is just a wrapper of the OFREP provider, I am curious to hear the points of view of @Kavindu-Dodan, @lukas-reining or @toddbaert on this.

I am also wondering whether this should be part of the SDK, or whether it is up to each provider to decide.

natenho commented 6 days ago

I think the OFREP provider approach would be the best one because it does not interfere with provider specifics. The trade-off is that providers remain responsible for managing their own back-end communication strategies. However, considering that the decision to manage caches is currently made within each provider, we can continue following this approach and impose the fewest possible constraints on provider implementations regarding back-end requests.

toddbaert commented 4 days ago

Could we perhaps build a "wrapper" provider that does this and takes another provider in its constructor? That way we would have some composability: a SingleFlightProvider which then calls whatever "real" provider it wraps?

thomaspoignant commented 1 day ago

Could we perhaps build a "wrapper" provider that does this and takes another provider in its constructor? That way we would have some composability: a SingleFlightProvider which then calls whatever "real" provider it wraps?

@toddbaert This is definitely something I can do in the GO Feature Flag wrapper, but for OFREP, if this is something we see often, could it maybe be a configuration option of the provider directly?