aws-powertools / powertools-lambda-python

A developer toolkit to implement Serverless best practices and increase developer velocity.
https://docs.powertools.aws.dev/lambda/python/latest/
MIT No Attribution

RFC: parameter retrieval utility #94

Closed nmoutschen closed 4 years ago

nmoutschen commented 4 years ago

Key information

Summary

Add a utility to facilitate retrieval of parameters from the SSM Parameter Store or Secrets Manager. This would support caching parameter values, avoiding a retrieval at every invocation.

Motivation

Many Lambda users use either the Parameter Store or Secrets Manager to store things such as feature flags, third-party API keys, etc. In a serverless context, a simple approach is to either fetch at every invocation (which might be costly or run into API limits) or fetch at initialisation (meaning no control over expiration and refresh). This utility would simplify that experience for customers.

Proposal

This utility should provide a very simple interface for customers who don't want to deep-dive into how it works. To prevent hitting the throughput limit for the Parameter Store, it should have a default cache duration in the single-digit seconds (e.g. 5).

Basic usage

# For SSM Parameter
from aws_lambda_powertools.utilities import get_parameter
# For Secrets Manager
from aws_lambda_powertools.utilities import get_secret

def handler(event, context):
    param = get_parameter("my-parameter")
    secret = get_secret("my-secret")

Changing the default cache duration

from aws_lambda_powertools.utilities import get_parameter

def handler(event, context):
    # Only refresh after 300 seconds
    param = get_parameter("my-parameter", max_age=300)

Convert from a specific format

from aws_lambda_powertools.utilities import get_parameter

def handler(event, context):
    # Transform into a dict from a json string
    param = get_parameter("my-parameter", format="json")

    # Transform into bytes from base64
    param = get_parameter("my-parameter", format="binary")

Retrieve multiple parameters from a path

from aws_lambda_powertools.utilities import get_parameters

def handler(event, context):
    params = get_parameters("/param/path")
    # Access the item using a param.name notation
    print(params.Subparam)

    # Other modifications are supported
    params = get_parameters("/param/path", format="json")
    print(params.Subparam["key"])

    # Supports recursive fetching
    params = get_parameters("/param/path", recursive=true)
    print(params.Subparam.Subsubparam)

Drawbacks

  • This would add a dependency on boto3. Many functions probably use it in some form, but the Powertools don't require it at the moment.
  • Many problems around parameters can be solved using environment variables, so the usefulness is limited to cases where the value could change on short notice.

Rationale and alternatives

  • What other designs have been considered? Why not them? Replicating the ssm-cache-python (https://github.com/alexcasalboni/ssm-cache-python) feature set; however, this might be too feature-rich for this use case.
  • What is the impact of not doing this? Users who want to retrieve dynamic parameters will have to think about the expiration logic if they don't want to risk getting throttled at scale.

Unresolved questions

Optional, stash area for topics that need further development e.g. TBD

heitorlessa commented 4 years ago

Thanks for creating the RFC, Nicolas :)

Two questions for you:

  1. Do we want to support multiple parameters? E.g. fetch all parameters by path.

Why - Helps those fetching more than one parameter with a single call to SSM.

  2. Do we want to have a decorator for that too? e.g. @fetch/get_parameter

Why - Mostly syntactic sugar. It'd be a no-brainer given we have a decorator factory, though I haven't put much thought into naming conventions.


nmoutschen commented 4 years ago
  1. Do we want to support multiple parameters? E.g. fetch all parameters by path.

I lack data here to know how frequently this is used.

In the context of an API key parameter or feature flags, this could be done through a single parameter. I'm also thinking that only supporting a single parameter would encourage, by design, the use of a single parameter, because of the Parameter Store API limits (https://docs.aws.amazon.com/general/latest/gr/ssm.html#limits_ssm).

The default throughput limit is 40 RPS, which means a maximum concurrency of 200, assuming a 5-second cache duration (40 RPS × 5 s = 200 executions, each refreshing once per cache window). Users could also encounter unexpected behaviour or a higher number of API calls because of how GetParametersByPath (https://docs.aws.amazon.com/systems-manager/latest/APIReference/API_GetParametersByPath.html) works: if the service reaches an internal limit while processing the results, it stops the operation and returns the matching values up to that point, along with a NextToken.

  2. Do we want to have a decorator for that too? e.g. @fetch/get_parameter

How would developers access the parameter value within the function? I'm not sure how to design this so it looks convenient from a developer perspective.

heitorlessa commented 4 years ago

That makes sense on 1 - let's begin with a single parameter. If we hear otherwise from customers, we can always extend it afterwards.

On 2, I was thinking of either injecting into the context, or introducing a shared state object within Powertools that can be retrieved anywhere in their code (default dict + Borg).

There are pros and cons to both, but if we want to do that, then I think the latter gives us the flexibility to support customers who are using Chalice, as Michael expressed on the other RFC.
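
For readers unfamiliar with the Borg pattern mentioned above, a minimal sketch of such a shared state object could look like this (SharedState is a hypothetical name, not part of any proposal):

class SharedState:
    # Borg pattern: every instance shares one attribute dictionary,
    # so state set anywhere is visible everywhere
    _shared_state: dict = {}

    def __init__(self):
        self.__dict__ = self._shared_state

# A decorator could store a fetched parameter...
state = SharedState()
state.my_parameter = "value"

# ...and user code elsewhere reads it back from a fresh instance
assert SharedState().my_parameter == "value"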


nmoutschen commented 4 years ago

I agree on the shared state, there's potential usefulness there beyond the parameter store.

heitorlessa commented 4 years ago

cc @bahrmichael @Nr18 @keithrozario @michaelbrewer - Would love to get your inputs on this as we scope this new utility

heitorlessa commented 4 years ago

As we will depend on boto3 and possibly other libraries as we grow the Powertools utilities, it's worth considering whether customers would be fine with having another PyPI package just for utilities,

e.g.

from aws_lambda_powertools_utilities import get_parameter, validate, boto_tuner, requests_tuner...

UPDATE: An alternative could be bringing a few kilobytes into Powertools with new utilities, but in order to use some of them there would be a conscious, explicit decision to bring in extra dependencies, e.g. pip install aws-lambda-powertools[boto]
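
For illustration, declaring such an optional extra is a one-liner in packaging metadata; a minimal setup.py sketch, with the package and extra names as floated above rather than final:

from setuptools import setup

setup(
    name="aws-lambda-powertools",
    # boto3 is only installed when users opt in via
    # `pip install aws-lambda-powertools[boto]`
    extras_require={"boto": ["boto3"]},
)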

UPDATE 2: Well, the X-Ray SDK brings in botocore (47M) anyway, so if we were to suggest boto3 as an extra dependency for certain utilities, that would add 1M extra, which is negligible in comparison. Creating a separate lib to account for the extra 1M isn't worth the operational cost as a whole - docs, changelog, another package, etc.

This means we can do:

from aws_lambda_powertools.utilities import ...

Nr18 commented 4 years ago

@heitorlessa in the project that I am currently working on, we are investigating whether we could use aws_lambda_powertools, and seeing this makes me happy because we wrote a small method that looks like the get_secret utility. So let me give my 2 cents on the choices we made, as input for this.

We have the following:

import json

import boto3
from botocore.exceptions import ClientError

client = boto3.client("secretsmanager")

def get_secret(secret_name: str) -> dict:
    if not secret_name:
        raise ValueError("Invalid secret_name given!")

    try:
        get_secret_value_response = client.get_secret_value(SecretId=secret_name)
    except ClientError as e:
        if e.response["Error"]["Code"] == "DecryptionFailureException":
            raise e
        elif e.response["Error"]["Code"] == "InternalServiceErrorException":
            raise e
        elif e.response["Error"]["Code"] == "InvalidParameterException":
            raise e
        elif e.response["Error"]["Code"] == "InvalidRequestException":
            raise e
        elif e.response["Error"]["Code"] == "ResourceNotFoundException":
            raise e
        # Re-raise anything we didn't anticipate rather than falling through to None
        raise e
    else:
        if "SecretString" not in get_secret_value_response:
            raise ValueError("Expected to find a SecretString!")

        return json.loads(get_secret_value_response["SecretString"])

It's not that fancy (I believe it's largely the sample code that Secrets Manager supplies), but the reason for us to put it in a separate file and method was so that we could use it like this:

import os

def handler(event: dict, context: LambdaContext) -> None:
    secret = get_secret(os.getenv("MY_SECRET", ""))

# The following is a snippet from our unit tests
from unittest.mock import MagicMock

from src.my_lambda.function import index

def test_handler(monkeypatch) -> None:
    monkeypatch.setenv("MY_SECRET", "my-secret-key")
    mock_get_secret = MagicMock(return_value={})
    monkeypatch.setattr(index, "get_secret", mock_get_secret)
    index.handler({}, LambdaContext())  # call the handler on the imported module
    mock_get_secret.assert_called_with("my-secret-key")

Previously we used client.get_secret_value(SecretId=secret_name) directly, but that requires you to use moto (which is slow) or the stubber (which makes your tests really complex, especially when you have multiple clients for different AWS services).

We only implemented the SecretString value; in our use case it will always be a JSON payload, but that does not have to be the case, so you might want to consider doing something like:

from aws_lambda_powertools.utilities import get_secret, get_json_secret, get_binary_secret
# or
from aws_lambda_powertools.utilities import get_secret

get_secret("MySecret").json
get_secret("MySecret").text
get_secret("MySecret").binary

Because you typically have one Lambda function doing one thing, I don't necessarily see the need to fetch multiple secrets, but there could be a use case where a Lambda is invoked and needs credentials to read something from one API, and then a different set of credentials to write to another API. (I think this is typically needed when you use a scheduled event rule to keep something in sync, for example.)

For the Parameter Store, however, I do see the need to fetch all values recursively. Let's say you have a Parameter Store structure like the following:

/Services/[ApplicationId]/Name
/Services/[ApplicationId]/Version
/Services/[ApplicationId]/Endpoint
/Services/[ApplicationId]/Dependencies/[OtherApplicationId]

So when you run get_parameters("/Services", recursive=True) what would you expect back? A list of dicts for each ApplicationId?

From a usage perspective it would be nice to be able to do something like:

application_id = "foo"
application_params = get_parameters(f"/Services/{application_id}", recursive=True)
print(application_params.Name)
print(application_params.get("Name", "No name found"))
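
One way to support both access styles shown above is a small dict subclass; a sketch (ParameterGroup is a hypothetical name, and the nested wrapping anticipates the discussion below):

class ParameterGroup(dict):
    # Allows params.Name alongside params["Name"] and params.get("Name", default)
    def __getattr__(self, item):
        try:
            value = self[item]
        except KeyError:
            raise AttributeError(item)
        # Wrap nested dicts so params.Dependencies.OtherAppId also works
        return ParameterGroup(value) if isinstance(value, dict) else value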
nmoutschen commented 4 years ago

Hey @Nr18 ! Thanks a lot for the (very detailed) feedback here. 😄

On the return type, I have questions if someone is doing something like this:

value = get_parameter("some-parameter")
print(value.json["a"])
print(value.json["b"])

In this case, should we cache the output of json.loads() or re-compute it on each access? Since we're already caching and people will call this over and over, there might be some benefit to caching the transformed results. We could also add an argument to get_parameter to specify the format, e.g.:

get_parameter("some-parameter", format="text") # Default, returns a string
get_parameter("some-parameter", format="json") # json.loads() the value
get_parameter("some-parameter", format="binary") # base64.b64decode() the value

Agreed on get_parameters. However, with the properties (params.Name), how would it look for nested values? Here's how I'm thinking about it:

params.Dependencies.OtherAppId

I quite like using params.Name instead of params["Name"], as we could then use the format="json" from before with get_parameters too.

nmoutschen commented 4 years ago

Another question that popped up in the PR (ping @jplock): should we decrypt data or not? I'm tempted to say yes for ease of use, but I'm worried about the security implications of having something that decrypts data without explicit action.

E.g. developers could forget that this is sensitive information and accidentally leak it because they didn't explicitly decrypt it themselves.
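
One middle ground, consistent with the decrypt kwarg that shows up in the PR snippet later in this thread, would be an explicit opt-in flag (sketch only):

param = get_parameter("my-parameter")                 # raw value, never decrypted implicitly
secret = get_parameter("my-parameter", decrypt=True)  # conscious, explicit decryption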

keithrozario commented 4 years ago

I did something very similar to this a few months back: https://github.com/keithrozario/lambda-cache. It looks something like this:

from lambda_cache import ssm

@ssm.cache(parameter='/production/app/var', max_age_in_seconds=300)
def handler(event, context):
    var = getattr(context, 'var')
    response = do_something(var)
    return response

I used a decorator with parameters to inject the parameter/secret into the context (because, unlike event, context is the same regardless of what triggers the function). But I'm not sure it's 100% the best way to do this. It does allow us to do interesting things, though, like decorator stacking and even multiple-parameter caching.
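
For instance, the stacking idea could look like the sketch below, assuming the same lambda-cache API as above (each decorator injects its parameter under the last segment of its path; do_something is placeholder pseudo-code):

from lambda_cache import ssm

@ssm.cache(parameter='/production/app/var1', max_age_in_seconds=300)
@ssm.cache(parameter='/production/app/var2', max_age_in_seconds=60)
def handler(event, context):
    var1 = getattr(context, 'var1')
    var2 = getattr(context, 'var2')
    return do_something(var1, var2)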

My thoughts:

Nr18 commented 4 years ago

Not sure if this should be part of this RFC or whether it's more of a project-wide discussion, but since Python is moving toward a more type-hinted language with every release, it would help to provide typing that supports the developer both in the actual functions and, maybe more importantly, when writing unit tests.

If a decorator changes the context object, it would be great if your IDE could help you figure out what it does (I'm no expert in this area), instead of having to read through samples that are potentially outdated, if they exist at all.

Love the caching idea for a secret/parameter, by the way; it saves a few calls and would improve invocation times.

nmoutschen commented 4 years ago

I'm not a huge fan of using the Lambda context for this. This has a few issues:


On making this a generic caching system: that could be a good idea! It'd drastically increase the scope and potentially help many more use cases, but I have a few concerns about expanding the scope too much. People could use DynamoDB or a relational DB to store parameters and want to retrieve them. However, when thinking about S3, some people might want to pull hundreds of MBs into /tmp, which I feel is out of scope for this.

We could make a generic parameter retrieval system that accepts parameter store providers. This way, people could write their own providers if they want, and we could supply ones for common cases (e.g. SSM Parameter Store, Secrets Manager, DynamoDB, etc.). I'd keep it to things that we can pull from boto3, though, to avoid adding dependencies on Postgres/MySQL/etc. libraries. A sketch of the idea follows.
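
A hedged sketch of that provider idea, with the base class owning caching and subclasses only implementing the raw fetch; the DynamoDBProvider below, including its "id"/"value" attribute names, is purely hypothetical:

from abc import ABC, abstractmethod
from datetime import datetime, timedelta

class BaseProvider(ABC):
    # Owns caching; concrete providers only implement the raw fetch
    def __init__(self):
        self._cache = {}  # name -> (value, expiry time)

    def get(self, name: str, max_age: int = 5, **kwargs) -> str:
        entry = self._cache.get(name)
        if entry and entry[1] > datetime.now():
            return entry[0]
        value = self._get(name, **kwargs)
        self._cache[name] = (value, datetime.now() + timedelta(seconds=max_age))
        return value

    @abstractmethod
    def _get(self, name: str, **kwargs) -> str:
        ...

class DynamoDBProvider(BaseProvider):
    # Hypothetical user-defined provider backed by a DynamoDB table
    def __init__(self, table_name: str):
        super().__init__()
        import boto3
        self.table = boto3.resource("dynamodb").Table(table_name)

    def _get(self, name: str, **kwargs) -> str:
        return self.table.get_item(Key={"id": name})["Item"]["value"]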


By the way, on boto3: we are already pulling in botocore indirectly through the X-Ray SDK. boto3 doesn't add much on top of botocore, so I think it's fine to have a direct dependency.

Nr18 commented 4 years ago

@nmoutschen I thought about it and realized that we typically use SQS in front of the Lambda function, and we have some plumbing for that (which might be a good candidate for the Powertools 💪; if you agree, I can try to write an RFC for it). It looks something like this:

import os

# SQS is our own helper class wrapping message handling and deletion
sqs = SQS(os.getenv("EVENT_SOURCE"))

def handler(event: dict, context: LambdaContext) -> None:
    # Run the given callback method for each message
    sqs.process_messages(event["Records"], sqs_callback)

def sqs_callback(message: str, attributes: dict) -> bool:
    # get_secret() or get_parameter()
    # Process your message here
    return True  # False or an exception would not delete the specific message

The reason we built this is that we had a lot of duplicate code for handling the messages and making sure they get deleted, which made the function's tests somewhat complex and harder to maintain. With the above, you typically test that, when the handler is called, the records are passed to the sqs.process_messages method, and you only need to test the sqs_callback method, which contains your actual business logic.

That solution would not work if the secrets came in via the context. And since you might get 10 messages, you also need caching at least within the same invocation; but when you get 10 messages at the same time from a queue, you probably have a lot of invocations, so caching across warm starts would make sense too.

I like the idea of the generic parameter retrieval system, as long as the Parameter Store and Secrets Manager providers are included so you don't need to write your own.

Regarding boto3 being included: I have not seen a clear use case where I don't include it. The Powertools might not use it, but the business logic in the function typically does, so I am definitely fine with pulling it in by default, especially when you consider botocore (47M) vs boto3 (1M). (We always include boto3 in a Lambda layer anyway.)

nmoutschen commented 4 years ago

@Nr18 By the way, for SQS, there's already an RFC open but your input would be much appreciated there! 😄 https://github.com/awslabs/aws-lambda-powertools-python/issues/92

nmoutschen commented 4 years ago

Following the discussion here, I've done a few things in the PR:

The implementation for a specific provider is fairly straightforward, and much of the caching/transformation logic is handled by the BaseProvider. For example, for the SSM Parameter Store (SSMProvider().get()):

    def _get(self, name: str, **kwargs) -> str:

        # Load kwargs
        decrypt = kwargs.get("decrypt", False)

        return self.client.get_parameter(Name=name, WithDecryption=decrypt)["Parameter"]["Value"]
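
For illustration, based on the snippet above, calling the provider could look like this (hedged; the exact public signatures are defined in the PR):

provider = SSMProvider()
value = provider.get("my-parameter", decrypt=True)  # caching/transforms handled by BaseProvider
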
alexcasalboni commented 4 years ago

@nmoutschen great discussion and the PR looks great!

Have you considered adding an "automatic retry" decorator as well? I implemented that here: https://github.com/alexcasalboni/ssm-cache-python/blob/master/ssm_cache/cache.py#L143

The idea was to simplify the invalidation of parameters that might change at run-time without forcing a short cache TTL, based on a specific exception/error. So your cache is valid forever, until you get the expected error.

Something like this:

from ssm_cache import SSMParameter
from my_db_lib import Client, InvalidCredentials  # pseudo-code

param = SSMParameter('my_db_password')
my_db_client = Client(password=param.value)

# this callback is called when InvalidCredentials is raised
# it will just re-initialize the db client with the new password
def on_error_callback():
    global my_db_client  # rebind the module-level client, not a local variable
    my_db_client = Client(password=param.value)

@param.refresh_on_error(InvalidCredentials, on_error_callback)
def read_record(is_retry=False):
    return my_db_client.read_record()

def lambda_handler(event, context):
    return {
        'record': read_record(),
    }

I know that for this specific use case we'd rather recommend using AWS Secrets Manager to handle db host/password & automatic rotation, but there are other cases where you want to be able to automatically re-fetch a parameter and retry (instead of a lot of try-catch and manual invalidation).

nmoutschen commented 4 years ago

Closing as we released this feature in 1.3.0.