irods / irods_client_aws_lambda_s3


handle SQS batch size greater than 1 #12

Open trel opened 4 years ago

trel commented 4 years ago

The Lambda could parse and then iterate over N messages in a single batch from SQS.

This would reduce Lambda instantiations and cost.
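
A minimal sketch of that parse-and-iterate step, assuming each SQS message body is the JSON text of an S3 event notification (if the notification arrives wrapped, for example by SNS, an extra unwrapping step would be needed). The helper name `iter_s3_events` is hypothetical and not part of the current lambda:

```python
import json
from urllib.parse import unquote_plus


def iter_s3_events(sqs_event):
    """Yield (message_id, event_name, bucket, key) for every S3 record in an SQS batch.

    Hypothetical helper: assumes each SQS message body is a JSON-encoded
    S3 event notification containing its own 'Records' list.
    """
    for message in sqs_event.get("Records", []):
        body = json.loads(message["body"])
        for s3_record in body.get("Records", []):
            yield (
                message["messageId"],
                s3_record["eventName"],
                s3_record["s3"]["bucket"]["name"],
                # Object keys are URL-encoded in S3 event notifications.
                unquote_plus(s3_record["s3"]["object"]["key"]),
            )
```

Keeping the SQS `messageId` next to each S3 record makes it possible to tie a failed registration back to the message that carried it.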

trel commented 4 years ago

From https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html:

A large batch size can improve efficiency for workloads that are very fast or have a lot of overhead. However, if your function returns an error, all items in the batch return to the queue.

This suggests the Lambda must either 1) be able to 'unwind' any successful registrations when an error occurs within the batch, or 2) not 'unwind': return the real error, let the items return to the queue, and then not return an error on the next attempt for registrations that have already made it into the catalog (aka 'CATALOG_ALREADY_HAS_ITEM_BY_THAT_NAME').

With option 2, what prevents the issue that caused the first failure from also failing the next batch? Or maybe that's fine... all the well-behaved items will already have been registered?
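
Option 2 could be sketched roughly as below. This is an illustration under assumptions, not the lambda's current code: `AlreadyRegistered` is a local stand-in for the client error corresponding to 'CATALOG_ALREADY_HAS_ITEM_BY_THAT_NAME', and `register` is a hypothetical callable that registers one item.

```python
class AlreadyRegistered(Exception):
    """Hypothetical stand-in for the 'CATALOG_ALREADY_HAS_ITEM_BY_THAT_NAME' error."""


def register_batch(items, register):
    """Attempt every item, treating 'already registered' as success.

    The first real error is re-raised only after all items have been tried,
    so the whole batch returns to the queue while the well-behaved items
    are already in the catalog.
    """
    first_error = None
    for item in items:
        try:
            register(item)
        except AlreadyRegistered:
            # Registered by an earlier attempt at this batch: not an error.
            continue
        except Exception as exc:
            if first_error is None:
                first_error = exc
    if first_error is not None:
        raise first_error
```

Under this sketch a persistently failing item would still fail every retry, so the queue's redrive settings would decide how long it keeps coming back.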

trel commented 3 years ago

Alternatively, we embrace the lack of support for batch sizes greater than 1...


If the length of 'Records' is greater than 1:

This will prevent the current state of:

trel commented 3 years ago

Supporting arbitrary batch sizes will require the iRODS server to handle multiple 'register' and 'unregister' data object operations atomically (a new API endpoint). The python-irodsclient will then need to learn the new server API call...

trel commented 1 year ago

Since 4.3.0, the iRODS server accepts atomic_apply_database_operations... but we haven't tested it with the Lambda yet.

However...

A new wrinkle... AWS now supports batchItemFailures...

Since Nov 2021... https://aws.amazon.com/about-aws/whats-new/2021/11/aws-lambda-partial-batch-response-sqs-event-source/

The official docs... https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html#services-sqs-batchfailurereporting

And a writeup from the community... https://betterprogramming.pub/sqs-batch-processing-with-reporting-batch-item-failures-6c405c852401
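
A rough sketch of what a partial batch response could look like for this lambda. The response shape (`batchItemFailures` with `itemIdentifier` set to the SQS `messageId`) follows the AWS docs linked above, and it only takes effect when `ReportBatchItemFailures` is enabled on the event source mapping. The names `handle_batch` and `process_message` are hypothetical:

```python
def handle_batch(event, process_message):
    """Process each SQS message and report only the failed ones back to Lambda.

    'process_message' is a hypothetical callable that registers or
    unregisters the object described by one SQS message and raises on error.
    """
    failures = []
    for message in event.get("Records", []):
        try:
            process_message(message)
        except Exception:
            # Only this message returns to the queue; the rest are deleted.
            failures.append({"itemIdentifier": message["messageId"]})
    # An empty list tells Lambda the whole batch succeeded.
    return {"batchItemFailures": failures}
```

With this shape, successful messages do not return to the queue, so neither 'unwinding' nor an atomic multi-object operation would be needed just to support batch sizes greater than 1.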