aws-samples / amazon-rekognition-serverless-large-scale-image-and-video-processing


Large scale image and video processing with Amazon Rekognition

This reference architecture shows how you can extract insights from images and videos at scale using Amazon Rekognition. The key attributes of the reference architecture are described below.

Architecture

The architecture below shows the core components.

Image pipeline (uses the sync APIs of Amazon Rekognition)

  1. The process starts when a message is sent to an Amazon SQS queue to analyze an image.
  2. A Lambda function is invoked synchronously with an event that contains the queue message.
  3. The Lambda function then calls Amazon Rekognition and stores the results in one or more datastores, for example DynamoDB, S3, or Elasticsearch.

You control the throughput of the pipeline by controlling the SQS batch size and the Lambda concurrency.
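The three steps above can be sketched as a Lambda handler like the following. This is a minimal illustration, not the repo's actual source: the message body fields (`bucketName`, `objectName`) and the DynamoDB table name are assumptions.

```python
# Sketch of the image pipeline Lambda handler. The message-body field names
# and the "ImageAnalysis" table name are illustrative assumptions, not
# necessarily those used in this repository.
import json


def parse_s3_location(message_body):
    """Extract the bucket and object key from an SQS message body (JSON)."""
    task = json.loads(message_body)
    return task["bucketName"], task["objectName"]


def handler(event, context):
    """Process a batch of SQS messages, one image per message."""
    import boto3  # imported lazily so the helper above works without AWS

    rekognition = boto3.client("rekognition")
    table = boto3.resource("dynamodb").Table("ImageAnalysis")  # assumed name

    for record in event["Records"]:
        bucket, key = parse_s3_location(record["body"])
        # Sync Rekognition call: results are returned directly in the response.
        response = rekognition.detect_labels(
            Image={"S3Object": {"Bucket": bucket, "Name": key}},
            MaxLabels=10,
        )
        table.put_item(Item={
            "objectKey": key,
            "labels": [label["Name"] for label in response["Labels"]],
        })
```

With this shape, raising the SQS batch size increases the number of `Records` per invocation, and the Lambda reserved-concurrency setting caps how many invocations run in parallel.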

Video pipeline (uses the async APIs of Amazon Rekognition)

  1. The process starts when a message is sent to an SQS queue to analyze a video.
  2. A job scheduler Lambda function runs at a certain frequency, for example every 5 minutes, and polls for messages in the SQS queue.
  3. For each message in the queue, it submits an Amazon Rekognition job to process the video, and it continues submitting these jobs until it reaches the limit of concurrent jobs for your AWS account.
  4. When Amazon Rekognition finishes processing a video, it sends a completion notification to an SNS topic.
  5. SNS then triggers the job scheduler Lambda function to start the next set of Amazon Rekognition jobs.
  6. SNS also sends a message to an SQS queue, which is then processed by a Lambda function that gets the results from Amazon Rekognition and stores them in a relevant datastore, for example DynamoDB, S3, or Elasticsearch.

Your pipeline runs at the maximum throughput allowed by the limits on your account. If needed, you can request a higher limit for concurrent jobs, and the pipeline automatically adapts to the new limit.
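The scheduler logic in steps 2–3 can be sketched as below. This is a simplified assumption-laden sketch: the real repo tracks in-progress jobs itself, while here the count is passed in, and the message fields and limit value are illustrative.

```python
# Sketch of the video pipeline job scheduler. Message-body field names and
# the default concurrency limit are illustrative assumptions.
import json


def jobs_to_submit(concurrent_limit, in_progress):
    """How many new Rekognition jobs the scheduler may start right now."""
    return max(0, concurrent_limit - in_progress)


def schedule_jobs(queue_url, in_progress, concurrent_limit=20):
    """Drain the queue, starting async jobs until the account limit is hit."""
    import boto3  # imported lazily so the helper above works without AWS

    sqs = boto3.client("sqs")
    rekognition = boto3.client("rekognition")

    for _ in range(jobs_to_submit(concurrent_limit, in_progress)):
        received = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1)
        if "Messages" not in received:
            break  # queue is empty
        message = received["Messages"][0]
        task = json.loads(message["Body"])
        # Async Rekognition call: completion is reported via SNS (step 4).
        rekognition.start_label_detection(
            Video={"S3Object": {"Bucket": task["bucketName"],
                                "Name": task["objectName"]}},
            NotificationChannel={"SNSTopicArn": task["snsTopicArn"],
                                 "RoleArn": task["roleArn"]},
        )
        sqs.delete_message(QueueUrl=queue_url,
                           ReceiptHandle=message["ReceiptHandle"])
```

Because `jobs_to_submit` is computed from the configured limit, raising the account's concurrent-jobs limit is enough for the scheduler to start submitting more work.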

Image and video processing workflow

The architecture below shows the overall workflow and a few additional components, used in addition to the core architecture described above, to process incoming images/videos as well as a large backfill of existing ones.

Process incoming images/videos workflow

  1. An image or video is uploaded to an Amazon S3 bucket. This triggers a Lambda function, which writes a task to process the image/video to DynamoDB.
  2. Using DynamoDB Streams, a Lambda function is triggered, which writes to the SQS queue of one of the pipelines.
  3. Images/videos are processed as described above by the "Image pipeline" or the "Video pipeline".
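Step 1 can be sketched as an S3-triggered Lambda that writes one task per uploaded object, routing by file extension. The extension sets, pipeline labels, and table name below are assumptions for illustration, not the repo's actual configuration.

```python
# Sketch of the S3-triggered task writer (step 1). Extension sets, pipeline
# labels, and the "Tasks" table name are illustrative assumptions.
import os

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}
VIDEO_EXTS = {".mp4", ".mov", ".avi"}


def build_task(bucket, key):
    """Build the DynamoDB task item for a newly uploaded object."""
    ext = os.path.splitext(key)[1].lower()
    if ext in IMAGE_EXTS:
        pipeline = "image"
    elif ext in VIDEO_EXTS:
        pipeline = "video"
    else:
        pipeline = "unsupported"
    return {"bucketName": bucket, "objectName": key, "pipeline": pipeline}


def handler(event, context):
    """S3-triggered Lambda: write one task per uploaded object."""
    import boto3  # imported lazily so build_task works without AWS

    table = boto3.resource("dynamodb").Table("Tasks")  # assumed table name
    for record in event["Records"]:
        s3 = record["s3"]
        table.put_item(Item=build_task(s3["bucket"]["name"],
                                       s3["object"]["key"]))
```

A separate Lambda subscribed to the table's DynamoDB stream (step 2) would then read the `pipeline` attribute of each new item and forward the task to the matching SQS queue.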

Large backfill of existing images/videos workflow

  1. Images/videos already exist in an Amazon S3 bucket.
  2. We create a CSV file, or use Amazon S3 inventory, to generate a list of the images/videos that need to be processed.
  3. We create and start an Amazon S3 batch operations job, which invokes a Lambda function for each object in the list.
  4. The Lambda function writes a task to process each image/video to DynamoDB.
  5. Using DynamoDB Streams, a Lambda function is triggered, which writes to the SQS queue of one of the pipelines.
  6. Images/videos are processed as described above by the "Image pipeline" or the "Video pipeline".
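Steps 3–4 can be sketched as a Lambda written against the S3 Batch Operations invocation contract, where each event carries a list of tasks and the function returns a per-task result. The DynamoDB table name is an assumption; the event/response shape follows the S3 Batch Operations Lambda interface.

```python
# Sketch of the Lambda invoked by an S3 Batch Operations job (steps 3-4).
# The "Tasks" table name is an illustrative assumption.


def bucket_from_arn(bucket_arn):
    """arn:aws:s3:::my-bucket -> my-bucket"""
    return bucket_arn.split(":::")[-1]


def handler(event, context):
    """Write one DynamoDB task per object listed in the batch job manifest."""
    import boto3  # imported lazily so bucket_from_arn works without AWS

    table = boto3.resource("dynamodb").Table("Tasks")  # assumed table name
    results = []
    for task in event["tasks"]:
        bucket = bucket_from_arn(task["s3BucketArn"])
        table.put_item(Item={"bucketName": bucket,
                             "objectName": task["s3Key"]})
        results.append({"taskId": task["taskId"],
                        "resultCode": "Succeeded",
                        "resultString": "task queued"})

    # Response shape required by S3 Batch Operations for Lambda jobs.
    return {
        "invocationSchemaVersion": "1.0",
        "treatMissingKeysAs": "PermanentFailure",
        "invocationId": event["invocationId"],
        "results": results,
    }
```

From here the flow converges with the incoming-object workflow: the DynamoDB stream fans the tasks out to the image or video pipeline queue.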

Prerequisites

Setup

Deployment

Test incoming images/videos

Test existing backfill images/videos

Source code

Modify source code and update deployed stack

Cost

Delete stack

License

This library is licensed under the MIT-0 License. See the LICENSE file.