Closed walmsles closed 3 years ago
Great catch, indeed an oversight. Would you mind sending this fix ;)?
I was planning to include - bring your own serializer - as part of this release but ran out of time
On Sun, 22 Aug 2021 at 07:42, walmsles @.***> wrote:
I was curious about the idempotent implementation and the general property of REST API bodies being un-ordered lists of JSON attributes. A REST API should be resilient to consumers and pragmatic in its approach to reading data supplied by clients calling them.
I created a simple JSON REST API using serverless and Powertools and implemented the Idempotency utility as follows:
import jsonimport osfrom datetime import datetimefrom aws_lambda_powertools import Logger, Tracerfrom aws_lambda_powertools.utilities.idempotency import DynamoDBPersistenceLayer, idempotent, IdempotencyConfig logger = Logger()tracer = Tracer() table_name = os.environ['TABLE']persistence_layer = DynamoDBPersistenceLayer(table_name=table_name)config = IdempotencyConfig(event_key_jmespath="body") @@.**@.(persistence_store=persistence_layer, config=config)def handler(event, context): body = { "message": f"Data Created: {datetime.now()}", "input": event['body'] }
logger.info('Processing API call - creating the data now!') logger.info(event['body']) response = { "statusCode": 200, "body": json.dumps(body) } return response
I deployed this to AWS and called the API with input as follows:
Test 1
{ "data": "test message 1", "more_data": "more data 1" }
Test 2 Note JSON attributes are submitted in different order which is OK for a REST API.
{ "more_data": "more data 1", "data": "test message 1" }
My expectation is that both these payloads will result in the exact same idempotent response since the payloads are essentially the same so far as REST APi conventions go.
The result I actually get is as follows:
Test 1 Result from first call to the API
{ "message": "Data Created: 2021-08-22 05:14:01.524104", "input": "{\n \"data\": \"test message 1\",\n \"more_data\": \"more data 1\"\n}" }
Test 2 Note: This is a different response than the first call to the API - I was expecting the same idempotent response.
{ "message": "Data Created: 2021-08-22 05:14:42.554149", "input": "{\n \"more_data\": \"more data 1\",\n \"data\": \"test message 1\"\n}" }
What were you trying to accomplish? Implement Idempotency over a simple API using the entire API Body as the key for recognising Idempotency. Expected Behavior
When calling my API with Test 1 payload and test 2 payload I am expecting the exact same idempotent response. Current Behavior
Instead I am getting 2 different responses since the order of attributes plays a role in the idempotency key Hashing algorithm used by this utility. Possible Solution
Suggest looking at using the sort_keys=True option for json.dumps when generating the Hash Steps to Reproduce (for bugs)
- Create lambda API using code similar to code in this report.
- send payload: {"data":"test message 1","more_data":"more data 1"}
- call API again with same inputs but attribute order changed: {"more_data":"more data 1","data":"test message 1"}
- Expect same idempotent response from step 4 as in step 3
Environment
- Powertools version used: 1.20.0
- Packaging format (Layers, PyPi):
- AWS Lambda function runtime: Python 3.8
- Debugging logs
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/awslabs/aws-lambda-powertools-python/issues/638, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAZPQBCP5QKEPKD6SKJ7GALT6CE3PANCNFSM5CSS5JKQ .
@heitorlessa Let me have a go - The test for this one are slightly scary but let me take a look :-)
Not to worry, I know it can be intimidating as it's the most complex part of Powertools - Just got a test for that while I wait for my flight ;) Fix coming soon
Waiting for CI checks then releasing it ;)
That's how a test for that looks like
def test_idempotent_data_sorting():
# Scenario to validate same data in different order hashes to the same idempotency key
data_one = {"data": "test message 1", "more_data": "more data 1"}
data_two = {"more_data": "more data 1", "data": "test message 1"}
# Assertion will happen in MockPersistenceLayer
persistence_layer = MockPersistenceLayer("test-func#" + hashlib.md5(json.dumps(data_one).encode()).hexdigest())
# GIVEN
@idempotent_function(data_keyword_argument="payload", persistence_store=persistence_layer)
def dummy(payload):
return {"message": "hello"}
# WHEN
dummy(payload=data_two)
@heitorlessa all the MockPersistenceLayer setups need to include 'sort_keys=True' for all the existing tests as well. I had progressed to making the existing tests fail Will wait for your fix to come through
Fixed it - it was actually needed in conftest too as there were plenty of json.dumps
for dummy hash idempotency keys. Turns out I forgot how long I had to walk to find my gate ;D
For the MockPersistenceLayer, I didn't include the sort_keys up there deliberately, so I can pass payload=data_two
down there to ensure they're sorted in the implementation
Available in PyPi now as 1.20.1 - Lambda Layers might take up to 5 minutes to complete ;)
Thank you so so much again @walmsles !
I was curious about the idempotent implementation and the general property of REST API bodies being un-ordered lists of JSON attributes. A REST API should be resilient to consumers and pragmatic in its approach to reading data supplied by clients calling them.
I created a simple JSON REST API using serverless and Powertools and implemented the Idempotency utility as follows:
I deployed this to AWS and called the API with input as follows:
Test 1
Test 2 Note JSON attributes are submitted in different order which is OK for a REST API.
My expectation is that both these payloads will result in the exact same idempotent response since the payloads are essentially the same so far as REST APi conventions go.
The result I actually get is as follows:
Test 1 Result from first call to the API
Test 2 Note: This is a different response than the first call to the API - I was expecting the same idempotent response.
What were you trying to accomplish? Implement Idempotency over a simple API using the entire API Body as the key for recognising Idempotency.
Expected Behavior
When calling my API with Test 1 payload and test 2 payload I am expecting the exact same idempotent response.
Current Behavior
Instead I am getting 2 different responses since the order of attributes plays a role in the idempotency key Hashing algorithm used by this utility.
Possible Solution
Suggest looking at using the sort_keys=True option for json.dumps when generating the Hash
Steps to Reproduce (for bugs)
Environment