Two types of documents in one model?

clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

MIT License

5.52k stars, 443 forks

Hello!

We are trying to read data from both electronic invoices (typically well-structured PDF files with good-quality information) and cashier receipts (like the ones in the CORD dataset).

What would be the best approach to handle both types of documents? My current approach is to train 3 models: a document classifier, an invoice parser, and a cashier receipt parser. I would first run the document classifier and then use its result to decide which parser to run next.
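To make the three-model setup concrete, here is a minimal sketch of the routing step. It assumes each trained model is wrapped in a plain Python callable; `route_document`, `classify`, and the `parsers` mapping are hypothetical names for illustration and not part of the Donut API.

```python
from typing import Callable, Dict

# A parser takes an image path and returns the extracted fields as a dict.
Parser = Callable[[str], dict]


def route_document(
    image_path: str,
    classify: Callable[[str], str],
    parsers: Dict[str, Parser],
) -> dict:
    """Run the classifier first, then dispatch to the matching parser."""
    doc_class = classify(image_path)
    if doc_class not in parsers:
        raise ValueError(f"No parser registered for class {doc_class!r}")
    fields = parsers[doc_class](image_path)
    # Keep the predicted class next to the parsed fields for downstream use.
    return {"class": doc_class, "fields": fields}
```

In practice `classify` and each entry in `parsers` would wrap a separate fine-tuned model, so every document costs two forward passes and a classification mistake sends the document to the wrong parser.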

I am wondering whether I could combine everything into one model. Invoices have some additional fields (a due date, for instance), but apart from those, all other fields are the same. Is it possible, for instance, to add a "class" field to the data and then train a single model on all documents?
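As a rough sketch of what I mean by adding a "class" field, the ground truth for both document types could be written to one `metadata.jsonl` file. This assumes the dataset layout described in the Donut README, where each line has a `file_name` and a `ground_truth` JSON string containing a `gt_parse` object. The helper `make_metadata_line` and the field names (`class`, `total`, `due_date`) are hypothetical.

```python
import json


def make_metadata_line(file_name: str, doc_class: str, fields: dict) -> str:
    """Build one metadata.jsonl line with the document class stored as a field."""
    # Put the class first so the model would emit it before the other fields.
    gt_parse = {"class": doc_class, **fields}
    # ground_truth is itself a JSON-encoded string (assumed Donut convention).
    ground_truth = json.dumps({"gt_parse": gt_parse})
    return json.dumps({"file_name": file_name, "ground_truth": ground_truth})


# Invoices carry the extra due_date field; receipts simply omit it.
lines = [
    make_metadata_line("invoice_001.png", "invoice",
                       {"total": "120.00", "due_date": "2023-02-15"}),
    make_metadata_line("receipt_001.png", "receipt", {"total": "8.40"}),
]
```

With this layout a single model would predict the class and the fields in one output sequence, and fields that do not apply to a document type are left out of its ground truth.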

Two types of documents in one model? #256