Amazon Textract's advanced extraction features go beyond simple OCR to recover structure from documents, including tables, key-value pairs (such as those on forms), and other tricky cases like multi-column text.
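To give a feel for what "structure" means here: a Textract form analysis returns a flat list of `Blocks`, where `KEY_VALUE_SET` blocks link keys to values (and both to their `WORD` children) through `Relationships`. The sketch below walks that shape to produce a plain dictionary. The sample data is hand-written for illustration and is not real Textract output; the helper names are our own.

```python
from typing import Dict, List


def _text_of(block: dict, by_id: Dict[str, dict]) -> str:
    """Join the WORD children of a block into one string."""
    words: List[str] = []
    for rel in block.get("Relationships", []):
        if rel["Type"] != "CHILD":
            continue
        for child_id in rel["Ids"]:
            child = by_id[child_id]
            if child["BlockType"] == "WORD":
                words.append(child["Text"])
    return " ".join(words)


def key_value_pairs(blocks: List[dict]) -> Dict[str, str]:
    """Map each detected form key to the text of its linked value block(s)."""
    by_id = {b["Id"]: b for b in blocks}
    pairs: Dict[str, str] = {}
    for block in blocks:
        if block["BlockType"] != "KEY_VALUE_SET":
            continue
        if "KEY" not in block.get("EntityTypes", []):
            continue
        value_text = ""
        for rel in block.get("Relationships", []):
            if rel["Type"] == "VALUE":
                value_text = " ".join(
                    _text_of(by_id[value_id], by_id) for value_id in rel["Ids"]
                )
        pairs[_text_of(block, by_id)] = value_text
    return pairs


# Hand-written sample in the same shape as a forms analysis response.
SAMPLE_BLOCKS = [
    {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                       {"Type": "CHILD", "Ids": ["w1"]}]},
    {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["w2"]}]},
    {"Id": "w1", "BlockType": "WORD", "Text": "Total:"},
    {"Id": "w2", "BlockType": "WORD", "Text": "12.50"},
]
```

Calling `key_value_pairs(SAMPLE_BLOCKS)` on this sample gives a single pair mapping `"Total:"` to `"12.50"`.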
However, many practical applications need to combine this technology with use-case-specific logic - such as:
This solution demonstrates how Textract can be integrated with:
...on a simple example use-case: extracting vendor, date, and total amount from receipt images.
The design is modular, to show how this pre- and post-processing can be easily customized for different applications.
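As a rough illustration of the kind of post-processing this implies, the sketch below pulls vendor, date, and total out of a list of text lines. It assumes very simple heuristics (first non-empty line is the vendor, first date-like token is the date, last amount on a "total" line is the total) and is not the logic used in this repository's postprocessing component; all names are hypothetical.

```python
import re
from typing import List, Optional, TypedDict


class ReceiptFields(TypedDict):
    vendor: Optional[str]
    date: Optional[str]
    total: Optional[str]


# Amounts like 4.95 or 4,95; dates like 12/03/2021 or 1-2-21.
AMOUNT = re.compile(r"\d+[.,]\d{2}")
DATE = re.compile(r"\b\d{1,2}[/.-]\d{1,2}[/.-]\d{2,4}\b")


def extract_receipt_fields(lines: List[str]) -> ReceiptFields:
    """Apply naive heuristics to OCR'd receipt lines (top to bottom)."""
    # Assumption: the vendor name is the first non-empty line.
    vendor = next((ln.strip() for ln in lines if ln.strip()), None)
    date: Optional[str] = None
    total: Optional[str] = None
    for line in lines:
        if date is None:
            match = DATE.search(line)
            if match:
                date = match.group(0)
        lowered = line.lower()
        if "total" in lowered and "subtotal" not in lowered:
            amounts = AMOUNT.findall(line)
            if amounts:
                # Keep the last matching line, since totals sit near the bottom.
                total = amounts[-1]
    return {"vendor": vendor, "date": date, "total": total}


SAMPLE_LINES = [
    "ACME Coffee",
    "12/03/2021 10:15",
    "Latte 4.50",
    "Subtotal 4.50",
    "Tax 0.45",
    "TOTAL $4.95",
]
```

Real receipts are much messier than this sample, which is one reason the solution routes low-confidence results to human review.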
This overview diagram is not an exhaustive list of AWS services used in the solution.
The solution orchestrates the core OCR pipeline with AWS Step Functions, rather than direct point-to-point integrations, which gives us a customizable flow that can be visualized graphically (defined in /source/ocr/StateMachine.asl.json).
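For readers new to Step Functions, here is a minimal sketch of what such a flow can look like in Amazon States Language, built as a Python dictionary so it can be checked before being written out as JSON. The state names, the `$.confidence` field, the threshold, and the `${...}` resource placeholders are all assumptions for illustration; the actual definition in this repository will differ.

```python
import json


def build_definition(confidence_threshold: float = 0.8) -> dict:
    """Return an illustrative ASL definition for an OCR flow with human review."""
    return {
        "Comment": "Illustrative OCR flow; names and ARNs are placeholders",
        "StartAt": "Preprocess",
        "States": {
            "Preprocess": {"Type": "Task",
                           "Resource": "${PreprocessFunctionArn}",
                           "Next": "ExtractText"},
            "ExtractText": {"Type": "Task",
                            "Resource": "${TextractFunctionArn}",
                            "Next": "Postprocess"},
            "Postprocess": {"Type": "Task",
                            "Resource": "${PostprocessFunctionArn}",
                            "Next": "CheckConfidence"},
            # Route low-confidence results to a human reviewer.
            "CheckConfidence": {
                "Type": "Choice",
                "Choices": [{"Variable": "$.confidence",
                             "NumericLessThan": confidence_threshold,
                             "Next": "HumanReview"}],
                "Default": "Done",
            },
            "HumanReview": {"Type": "Task",
                            "Resource": "${HumanReviewFunctionArn}",
                            "Next": "Done"},
            "Done": {"Type": "Succeed"},
        },
    }


def undefined_targets(definition: dict) -> set:
    """Return any transition targets that are not defined as states."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        if "Default" in state:
            targets.add(state["Default"])
        for choice in state.get("Choices", []):
            targets.add(choice["Next"])
    return targets - set(states)


definition_json = json.dumps(build_definition(), indent=2)
```

Because each step is a separate state, swapping in a different pre- or post-processing function means changing one `Resource` rather than rewiring services to each other.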
The client application and associated services are built and deployed as an AWS Amplify app, which simplifies setup of standard client-cloud integration patterns (e.g. user sign-up/login, authenticated S3 data upload).
Rather than have our web client poll the state machine for progress updates, we push messages via Amplify PubSub - powered by AWS IoT Core.
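On the backend side, pushing such a message comes down to publishing to an MQTT topic that the client has subscribed to. The sketch below shows one way a Python function could do that through the AWS IoT data plane. The topic layout and payload fields are assumptions, not this solution's actual message format, and the client is passed in so the function can be exercised without AWS credentials.

```python
import json
from typing import Any, List, Protocol, Tuple


class IotDataClient(Protocol):
    """The one method we need from boto3's "iot-data" client."""

    def publish(self, *, topic: str, qos: int, payload: bytes) -> Any: ...


def notify_progress(client: IotDataClient, topic_prefix: str,
                    execution_id: str, status: str) -> str:
    """Publish a progress message and return the topic it was sent to."""
    # Hypothetical topic layout: one topic per state machine execution.
    topic = f"{topic_prefix}/{execution_id}"
    payload = json.dumps({"executionId": execution_id, "status": status})
    client.publish(topic=topic, qos=1, payload=payload.encode("utf-8"))
    return topic


class RecordingClient:
    """Stand-in client that records calls instead of contacting AWS."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, int, bytes]] = []

    def publish(self, *, topic: str, qos: int, payload: bytes) -> None:
        self.calls.append((topic, qos, payload))


# In a real deployment the client would come from boto3, for example:
#   import boto3
#   notify_progress(boto3.client("iot-data"), "ocr/progress", "abc123", "STARTED")
```

The web client then only needs a subscription to the matching topic, which avoids repeated status requests against the state machine.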
The Amplify build settings (in amplify.yml, with some help from the Makefile) define how both the Amplify-native and custom stack components are built and deployed, leaving us with the folder structure you see in this repository:
├── amplify                         [Auto-generated, Amplify-native service config]
├── source
│   ├── ocr                         [Custom, non-Amplify backend service stack]
│   │   ├── human-review            [Human review integration with Amazon A2I]
│   │   ├── postprocessing          [Extract business-level fields from Textract output]
│   │   ├── preprocessing           [Image pre-check/cleanup logic]
│   │   ├── textract-integration    [SFn-Textract integrations]
│   │   ├── ui-notifications        [SFn-IoT push notifications components]
│   │   ├── StateMachine.asl.json   [Processing flow definition]
│   │   └── template.sam.yml        [AWS SAM template for non-Amplify components]
│   └── webui                       [Front-end app (VueJS, BootstrapVue, Amplify)]
├── amplify.yml                     [Overall solution build steps]
└── Makefile                        [Detailed build commands, to simplify amplify.yml]
NOTE: For details on each component, check the READMEs in their subfolders!
If you have:
...then you can go ahead and click the button below, which will fork the repository and deploy the base solution stack(s):
From here, there are just a few extra (but not trivial) manual configuration steps required to complete your setup:
Set the RekognitionModelArn SSM Parameter to reference your Amazon Rekognition model.
Set the DefaultHumanFlowArn SSM Parameter to reference your Amazon A2I human review flow.

Now you should be all set to upload images through the app UI, review low-confidence results through the Amazon A2I UI, and see the results!
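If you prefer to script the SSM Parameter steps above rather than use the console, something like the following could work. It only builds the arguments for the SSM `PutParameter` call; the parameter name shown in the comment is a placeholder, since the real names depend on your deployed stack.

```python
from typing import Any, Dict


def parameter_update(name: str, arn: str) -> Dict[str, Any]:
    """Build keyword arguments for an SSM PutParameter call that stores an ARN."""
    if not arn.startswith("arn:"):
        raise ValueError(f"Expected an ARN, got: {arn!r}")
    return {"Name": name, "Value": arn, "Type": "String", "Overwrite": True}


# With boto3 and AWS credentials available, the call itself would look like:
#   import boto3
#   ssm = boto3.client("ssm")
#   ssm.put_parameter(**parameter_update(
#       "<your RekognitionModelArn parameter name>",   # placeholder
#       "<your model ARN>",                            # placeholder
#   ))
```

`Overwrite=True` is there because the parameters already exist after deployment and only their values need changing.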