aws-powertools / powertools-lambda-typescript

Powertools is a developer toolkit to implement Serverless best practices and increase developer velocity.
https://docs.powertools.aws.dev/lambda/typescript/latest/
MIT No Attribution
1.58k stars 139 forks source link

Bug: Subsequent Requests without a Idempotency Key Return the Same Result as the First Request #1501

Closed brianhyder closed 1 year ago

brianhyder commented 1 year ago

Expected Behaviour

When an idempotency key is not provided as part of the request and the throwOnNoIdempotencyKey configuration attribute is false, the idempotency package should proceed as if it is a unique request regardless of how many times the request is sent.

Current Behaviour

When sending a request for the first time, with no idempotency key, the call succeeds and returns the proper status code along with the payload. When sending a subsequent request, with no idempotency key, the response is the same as the response of the first request.

A cache key is still being generated but because there is no idempotency key, it is not unique.

Code snippet

const config = new IdempotencyConfig({
   hashFunction: 'md5',
   useLocalCache: false,
   expiresAfterSeconds: 3600,
   throwOnNoIdempotencyKey: false,
   eventKeyJmesPath: 'headers.idempotency-key'
});
const persistenceStore = new DynamoDBPersistenceLayer({
   tableName: 'idempotency_store',
   keyAttr: 'key'
});
makeHandlerIdempotent({ config, persistenceStore });

Steps to Reproduce

  1. Create a request where the middleware is configured as above, no idempotency key, and the response is a randomly generated value.
  2. Send the request and notate the response
  3. Verify that the record was persisted in the dynamo table
  4. Send the same request as in step 1 and note the response

The response of the second request is the same as the first response when it should be a different value since the endpoint echos random values.

Possible Solution

The issue seems to be here. When no idempotency key is passed, the value of data is null. Therefore the hash always gets calculated from the raw string #null.

private getHashedIdempotencyKey(data: Record<string, unknown>): string {
    if (this.eventKeyJmesPath) {
      data = search(data, this.eventKeyJmesPath);
    }

    if (BasePersistenceLayer.isMissingIdempotencyKey(data)) {
      if (this.throwOnNoIdempotencyKey) {
        throw new Error('No data found to create a hashed idempotency_key');
      }
      console.warn(
        `No value found for idempotency_key. jmespath: ${this.eventKeyJmesPath}`
      );
    }

    return `${this.idempotencyKeyPrefix}#${this.generateHash(
      JSON.stringify(data)
    )}`;
  }

Powertools for AWS Lambda (TypeScript) version

latest

AWS Lambda function runtime

18.x

Packaging format used

npm

Execution logs

No response

dreamorosi commented 1 year ago

Hi @brianhyder, the throwOnNoIdempotencyKey is a key that you can pass to the config object to tell the utility to throw an error if the specified idempotency key is not present in the request - which is what's happening in your example, copying the description of the parameter from the Python docs:

Raise exception if no idempotency key was found in the request

If I'm understanding correctly the flow that you are describing then both requests are generating the same hash because they are both using null as input which will result in the same hash (i.e. not random). With that in mind, the utility is correctly returning the same response since the input and hash are the same.

The idea behind that parameter is to halt execution in case the specified idempotency key is not present in the request, either because the input event didn't have the expected shape or because the JMESPath expression used is wrong/not compatible with that event. The default behavior would be to simply log a warning and continue the execution since the throwOnNoIdempotencyKey defaults to false.

The parameter is normally set to false because by default the whole event/payload is used to generate the hash. Whenever using JMESPath expressions you're telling the utility to use a subset of the payload, in that case it'd make more sense to enable the throwOnNoIdempotencyKey to avoid the same situation that you just encountered.

To sum up, I'll have to double check before confirming it 100% (will do so shortly), but I believe this is working as intended since your JMESPath generates an empty subset and you also told the utility to not throw an error when that happens.

dreamorosi commented 1 year ago

To expound a bit more on the above, whenever you are using the middleware or otherwise making the whole Lambda handler idempotent then you most probably always want the full operation to be idempotent and also stop execution if the utility isn't able to generate an hash.

In other cases, for instance if you are processing a number of records (i.e. SQS messages), then the idempotent operation would be scoped to each message. When doing so you might want to process the messages in batch (i.e. for payload of messages) and also apply a JMESPath expression or be dealing with records of different shapes/formats.

In those cases the throwOnNoIdempotencyKey option allows you to tell the utility whether you want to stop everything if a message doesn't have a key (and so it's processing is no longer idempotent) or instead continue processing because your use case might allow for this type of occurrences.

brianhyder commented 1 year ago

The use case I have is: an HTTP endpoint where the idempotent key is optional. The caller can choose to provide it or not. In this case, the middleware doesn't allow for normal execution to continue because it generates the same hash even though the requests themselves are meant to be unique.

If the code provides the correct behavior, I think this leaves me with a few of options:

  1. Make the idempotent key a requirement for the endpoints that use it
  2. Upstream from the lambda (API Gateway or authorizer), create a unique idempotent key if one is not provided by the caller
  3. Wrap the idempotency middleware such that I can override the before and after parts to essentially not execute when the idempotency key is not provided.

For issues like this where I'm not sure if it is a bug or not, is there a preferred way to submit those?

dreamorosi commented 1 year ago

To answer your second question: generally speaking - especially at this stage issues are fine whenever you suspect that there's a bug or something is not working as intended. If you are unsure or you think it'd be more of a question/discussion around how to best use a feature or similar you could open a Discussion.

Either way, it's a two-way door decision: we can convert issues into discussions and vice versa, so it's fine really.

Regarding your main question about how to model the payload and apply idempotency, let me loop in some of my teammates and get back to you.

am29d commented 1 year ago

Hey Brian, thanks for brining up this discussion. I see your use case with optional idempotency key that you want to pass to your API consumers. In the current design we intend to provide the functionality for use cases where the idempotency key is required, this is also the case in python implementation.

I completely agree with your statement that once you configure throwOnNoIdempotencyKey to false you have a clear intention that missing key is ok, thus it's a no-op for the idempotency functionality and there should be no information in the persistence store.

We will add the functionality to ignore the request and process it without idempotency.

brianhyder commented 1 year ago

Awesome, thank you all for the discussion and the help.

heitorlessa commented 1 year ago

Chiming in from Python Powertools to say thank you @brianhyder uncovering this up. We're coordinating a fix across all implementations as we speak now.

dreamorosi commented 1 year ago

Just for the record, I initially temporarily removed the bug label because the current implementation was in line with the reference one.

However after discussing this internally your question made us rethink the flow, thank you for taking the time to open the issue and helping to improve the utility for everyone!

heitorlessa commented 1 year ago

@dreamorosi @am29d and @jeromevdl @mriccia - we've just added a sequence diagram to explain this behaviour, feel free to copy that commit as-is before you release.

Python is making a release after lunch

image
github-actions[bot] commented 1 year ago

⚠️ COMMENT VISIBILITY WARNING ⚠️

Comments on closed issues are hard for our team to see. If you need more assistance, please either tag a team member or open a new issue that references this one. If you wish to keep having a conversation with other community members under this issue feel free to do so.