Document AI is a term that has become popular over the last 3 years. It refers to machine learning models, tasks, and techniques used to classify, parse, and extract information from documents in digital and print forms, like invoices, receipts, licenses, contracts, and business reports.
This repository contains examples and tutorials on how to get started with Document AI and Transformers. Below you can also find a compendium of available models, tasks, datasets, and other resources.
- Training
- Inference
- Data-processing
- Demos/Spaces
- Community
Popular models are layoutlm.... and Donut, which we will use today to get a first impression of how you can build your own Document AI system using Hugging Face Transformers.
Below you can find a table of the currently available Transformers models that achieve state-of-the-art performance on Document AI tasks.
Document AI includes the following use cases and tasks:
Dataset | Task | Hugging Face Datasets |
---|---|---|
SROIE | document parsing | darentang/sroie |
RVL-CDIP | document classification | rvl_cdip |
XFUND | document parsing | ranpox/xfund |
FUNSD | document parsing | nielsr/funsd |
CORD | information extraction/parsing | naver-clova-ix/cord-v2 |
DocVQA | visual question answering | load manually |
WildReceipt | document parsing | Theivaprakasham/wildreceipt |
TableBank | table detection/layout analysis | load manually |
DocBank | table detection/layout analysis | load manually |
ReadingBank | table detection/layout analysis | load manually |
EATEN | document parsing | load manually |
PubLayNet | table detection/layout analysis | jordanparker6/publaynet |
ICDAR2019_cTDaR | table detection/layout analysis | load manually |
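
As a rough illustration of how the identifiers in the last column can be used, here is a minimal sketch that maps a few rows of the table to their Hub identifiers and loads them with `datasets.load_dataset`. The helper names `HUB_IDS`, `hub_id`, and `load_from_table` are hypothetical and not part of this repository. The sketch assumes the `datasets` library is installed and that the listed identifiers are still available on the Hub. Depending on your installed `datasets` version, datasets that rely on a loading script may need extra arguments or may not load at all.

```python
from typing import Optional

# A few rows copied from the table above (illustrative subset, not the full list).
# None marks datasets listed as "load manually".
HUB_IDS = {
    "SROIE": "darentang/sroie",
    "RVL-CDIP": "rvl_cdip",
    "FUNSD": "nielsr/funsd",
    "DocVQA": None,
    "TableBank": None,
}


def hub_id(name: str) -> Optional[str]:
    """Return the Hub identifier for a dataset name, or None if it has to be loaded manually.

    Raises KeyError for names that are not in HUB_IDS.
    """
    return HUB_IDS[name]


def load_from_table(name: str, **kwargs):
    """Load a dataset from the Hub by its table name (hypothetical helper).

    Extra keyword arguments are passed through to datasets.load_dataset.
    """
    repo = hub_id(name)
    if repo is None:
        raise ValueError(f"{name} has no Hub identifier in the table; load it manually.")
    # Imported lazily so the lookup helpers work without the library installed.
    from datasets import load_dataset

    return load_dataset(repo, **kwargs)
```

Calling `load_from_table("FUNSD")` would download the dataset on first use, so it needs network access. Datasets marked "load manually" raise a `ValueError` instead of attempting a download.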