aws-samples / amazon-textract-serverless-large-scale-document-processing

Process documents at scale using Amazon Textract
Apache License 2.0
328 stars, 165 forks

docproc does not return a JSON-serializable object, so the Lambda fails and retries #14

Open vincentclaes opened 4 years ago

vincentclaes commented 4 years ago

Description of changes:

docproc.py does not return a JSON-serializable object, so the Lambda raises an error at the end of each invocation. The messages are delivered to the next SQS queue successfully, but because the invocation ends with an error, the batch is retried. As a result, the docproc.py Lambda processes the same documents over and over again. I believe the retry count on the DynamoDB stream event for docproc is set to 10000.

Once I added a JSON-serializable return value, all of my documents were processed as expected, instead of only the first 200 to 300 documents being retried over and over again.
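For illustration, here is a minimal sketch of the kind of return value described above. It is not the actual diff from this pull request: the handler body, the `build_result` helper, and the shape of the returned dict are assumptions made for the example. The only point it shows is that the handler ends by returning something `json.dumps` accepts.

```python
import json


def build_result(processed_count):
    # Hypothetical helper (not from the repository): returns a plain dict
    # made only of types that json.dumps can serialize.
    return {"statusCode": 200, "body": json.dumps({"processed": processed_count})}


def lambda_handler(event, context):
    # Assumed event shape: a DynamoDB stream batch with a "Records" list.
    records = event.get("Records", [])
    for record in records:
        # Placeholder for the real work, e.g. forwarding the document
        # to the next SQS queue.
        pass

    result = build_result(len(records))
    # Raises TypeError here if the result is not JSON-serializable,
    # which makes the problem visible in local tests.
    json.dumps(result)
    return result
```

Calling `lambda_handler({"Records": [{}, {}]}, None)` locally returns a dict whose `body` decodes to `{"processed": 2}`. Returning `None` or any other serializable value would work just as well; the dict is only one possible choice.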

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.