open-feature / flagd

A feature flag daemon with a Unix philosophy
https://openfeature.dev
Apache License 2.0
501 stars 60 forks

AWS Lambda Support for Flagd #236

Closed james-milligan closed 1 year ago

james-milligan commented 1 year ago

Overview

AWS Lambda is becoming an increasingly important component in modern applications. It allows developers to focus solely on their code by offloading operations work to AWS. When creating a Lambda-based app, developers still need to be able to use flagd in a similar fashion to how they would use it in Kubernetes.

Considerations

Requirements

Links

james-milligan commented 1 year ago

AWS Lambda functions are designed for short-lived processes, with a hard-coded 15 minute maximum timeout for a single execution. As a result they are (currently) an inappropriate deployment platform for flagd: flagd has no shutdown conditions, such as a lack of requests or a graceful timeout window, so flagd-in-Lambda containers will always terminate with a timeout error. To deploy flagd behind a load balancer in Lambda we would need to wrap the evaluator in a new AWS Lambda wrapper, creating a small Go binary which would pull its configuration and flag configurations from an S3 bucket. The downside is that every request to the Lambda triggers 1 + n reads from the S3 bucket, likely increasing the cost of deployment dramatically over alternatives.

// Hypothetical Lambda handler wrapping the flagd evaluator; the S3 helper
// functions and the request/response types are illustrative placeholders.
func HandleRequest(ctx context.Context, req FlagEvaluationRequest) (FlagEvaluationResponse, error) {
    config := getConfigFromS3(ctx) // read 1

    state := []states{}
    for _, bucketKey := range config.flagConfigurations { // n reads
        s := getFlagConfigurationsFromS3(ctx, bucketKey)
        state = append(state, s)
    }
    evaluator.SetState(state)
    value, variant, reason := evaluator.ResolveBooleanValue(req.reqID, req.flagKey, req.Context)
    return FlagEvaluationResponse{value, variant, reason}, nil
}

func main() {
    lambda.Start(HandleRequest)
}

For each request we will need to read the config (which provides the keys for the flag configurations in the bucket), then read each of those keys to set the evaluator's state before evaluation. Each request will also incur a cold start. One benefit of this approach is that flag state is read at request time, so no configuration change events are required to keep responses from going stale.

AWS ECS is a container orchestration service that allows flagd instances to run for extended periods and scale with request rates. With this deployment method we won't be scaffolding around the 15 minute timeout window, meaning we can have a far more favourable deployment pattern. There is also native support for Docker images, meaning a developer will not need to first push their image to ECR. ECS will also send SIGTERM events to flagd (which we currently listen for to initiate a graceful shutdown).

For flagd to be deployed to ECS, the following code changes will need to be introduced:

james-milligan commented 1 year ago

To conclude my comment above, I believe the correct deployment model for AWS is as follows:

  1. An optional API gateway (either private or public) for handling the requests made to flagd
  2. An elastic load balancer pointing to the ECS deployment
  3. flagd containers running within ECS, using an S3 configuration object to control AWSSyncs
  4. An optional SQS queue to pass configuration change events to the containers running in ECS

james-milligan commented 1 year ago

Another potential deployment method could be AWS Fargate; the benefit of using it would be a lower barrier to entry when configuring a deployment.

beeme1mr commented 1 year ago

This is out of scope for flagd. The flagd flag configuration could be used in a PaaS environment but not flagd itself.