clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
https://arxiv.org/abs/2111.15664
MIT License
5.52k stars 443 forks

custom json schema - ASAP #290

Open xdevfaheem opened 4 months ago

xdevfaheem commented 4 months ago

Is it possible to train the model to generate structured output with a custom JSON schema? Please help me ASAP.

xdevfaheem commented 4 months ago

@felixvor

felixvor commented 4 months ago

It would be very interesting to see how a complicated JSON structure impacts model performance, but in short: yes, it is possible. You can fine-tune the model to generate pretty much any text you want!

During pre-training, the model only learns to generate OCR text strings from images (no JSON at all). The example notebooks then use the pre-trained weights to fine-tune the model on various output schemas (including JSON) for classification, entity extraction, and question answering. I would recommend following the convention of converting your JSON into an HTML-like structure, converting from JSON before calling the model and back to JSON afterwards. All of that is covered in the examples as well.
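To make the HTML-like convention concrete, here is a minimal sketch of such a round trip. It is not the repository's own conversion code: the function names `json_to_tokens` and `tokens_to_json`, the `<s_key>` tag format, and the `<sep/>` list separator are assumptions chosen for illustration, and the sample receipt is made up. It also simplifies a few things: every leaf value comes back as a string, a one-element list comes back as a single item, and leaf text must not itself contain tags.

```python
import re
from typing import Any

# Assumed tag format: <s_key> opens a field, </s_key> closes it,
# and <sep/> separates the items of a list.
_TAG = re.compile(r"<s_(\w+)>|</s_(\w+)>|<sep/>")
_OPEN = re.compile(r"<s_(\w+)>")


def json_to_tokens(obj: Any) -> str:
    """Flatten a JSON-like object into an HTML-like target string."""
    if isinstance(obj, dict):
        return "".join(
            f"<s_{key}>{json_to_tokens(value)}</s_{key}>"
            for key, value in obj.items()
        )
    if isinstance(obj, list):
        return "<sep/>".join(json_to_tokens(item) for item in obj)
    return str(obj)


def _split_top_level(text: str) -> list:
    """Split on <sep/> markers that are not nested inside any field."""
    parts, depth, start = [], 0, 0
    for match in _TAG.finditer(text):
        if match.group(1):
            depth += 1
        elif match.group(2):
            depth -= 1
        elif depth == 0:  # a <sep/> at the top level
            parts.append(text[start:match.start()])
            start = match.end()
    parts.append(text[start:])
    return parts


def _parse_fields(text: str) -> dict:
    """Parse a run of <s_key>...</s_key> fields into a dict."""
    fields, pos = {}, 0
    while pos < len(text):
        opener = _OPEN.match(text, pos)
        if opener is None:
            raise ValueError(f"expected an opening tag at position {pos}")
        key, depth = opener.group(1), 0
        for match in _TAG.finditer(text, pos):
            if match.group(1):
                depth += 1
            elif match.group(2):
                depth -= 1
                if depth == 0:  # this closes the field opened at pos
                    if match.group(2) != key:
                        raise ValueError(f"mismatched closing tag for {key!r}")
                    fields[key] = tokens_to_json(text[opener.end():match.start()])
                    pos = match.end()
                    break
        else:
            raise ValueError(f"unclosed tag for {key!r}")
    return fields


def tokens_to_json(text: str) -> Any:
    """Parse the HTML-like string back into dicts, lists, and strings."""
    parts = _split_top_level(text)
    if len(parts) > 1:
        return [tokens_to_json(part) for part in parts]
    return _parse_fields(text) if _OPEN.match(text) else text


# Made-up example of a custom schema with a nested list.
sample = {
    "menu": [
        {"name": "Latte", "price": "4.50"},
        {"name": "Bagel", "price": "3.00"},
    ],
    "total": "7.50",
}
target = json_to_tokens(sample)
restored = tokens_to_json(target)
```

In a setup like this, `target` would be the text the model is fine-tuned to generate for an image, and `tokens_to_json` would be applied to the generated text at inference time to recover the structured output.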

Good luck with your experiments, keep us posted about your results!