clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
https://arxiv.org/abs/2111.15664
MIT License
5.75k stars, 466 forks

Donut Custom Model Training #149

Closed by rajsaraiya009 1 year ago

rajsaraiya009 commented 1 year ago

Hello, I trained and fine-tuned the model on my own dataset, and I want to extract fields such as address and date. However, the output does not contain the entity names: instead of an "address" entity, I get a single "text_sequence" key, as shown below. Any help would be appreciated, thanks. Output: {'predictions': [{'text_sequence': '18,SS 22/2 DANAGER AYA 4400 SELANGOR '}]}

YAOYI626 commented 1 year ago

What does your ground truth look like? Is it in the same format as the output you're showing?
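For reference, the Donut README describes the training annotations as a metadata.jsonl file in which each line has a "file_name" and a "ground_truth" entry, and "ground_truth" is itself a JSON string holding a "gt_parse" object. A minimal sketch of building one such line follows; the field names (address, date), their values, and the image file name are made up for illustration and are not taken from this issue.

```python
import json

# Hypothetical fields to extract; names and values are illustrative only.
gt_parse = {"address": "18, Example Street, Selangor", "date": "01/01/2020"}

# "ground_truth" is a JSON string (not a nested object) that wraps "gt_parse".
record = {
    "file_name": "receipt_0001.jpg",  # hypothetical image file name
    "ground_truth": json.dumps({"gt_parse": gt_parse}),
}

# One line of metadata.jsonl.
line = json.dumps(record)
print(line)

# Round trip: decode the outer record, then the inner ground-truth string.
parsed = json.loads(json.loads(line)["ground_truth"])["gt_parse"]
```

If the keys inside "gt_parse" are the entity names you expect in the predictions, the ground truth itself is probably not the cause of the missing field names.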

kushal-h commented 1 year ago

Hi,

I want to train this on my own dataset. How did you train the model? Could you point me to the documentation you followed?

Thanks in advance.

rajsaraiya009 commented 1 year ago

@YAOYI626 I solved the issue. At inference time you need to provide the name of the dataset the model was trained on; then it will identify the entities. Thanks for the response.
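To illustrate why the dataset name matters: as far as I can tell from the repository's scripts, the decoder prompt is a start token of the form <s_{task_name}>, where the task name defaults to the dataset folder name, and the generated tag sequence is then converted to JSON. The sketch below is a simplified stand-in for that behavior, not Donut's actual code. The helpers task_prompt and parse_fields are hypothetical, and the parser handles only flat <s_key>value</s_key> pairs.

```python
import os
import re


def task_prompt(dataset_name_or_path: str) -> str:
    """Build the start token from the dataset folder name (assumed convention)."""
    task_name = os.path.basename(os.path.normpath(dataset_name_or_path))
    return f"<s_{task_name}>"


def parse_fields(sequence: str) -> dict:
    """Parse flat <s_key>value</s_key> pairs; simplified, no nesting or lists."""
    fields = {
        key: value.strip()
        for key, value in re.findall(r"<s_(\w+)>(.*?)</s_\1>", sequence)
    }
    # With no recognizable field tags, fall back to returning the raw text.
    return fields if fields else {"text_sequence": sequence}


# Hypothetical dataset path; the prompt must match the token used in training.
print(task_prompt("./dataset/my_receipts"))

# A tagged sequence yields named fields.
print(parse_fields("<s_address>18, SS 22/2</s_address><s_date>01/01/2020</s_date>"))

# An untagged sequence yields only "text_sequence", as in the original report.
print(parse_fields("18,SS 22/2 DANAGER AYA 4400 SELANGOR "))
```

Under this reading, a prompt that does not match the start token seen during fine-tuning leads the model to emit plain text without field tags, which would explain the "text_sequence" output.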

rajsaraiya009 commented 1 year ago

@kushal-h You can read the README file, where all the steps are clearly described, or visit the links below: https://towardsdatascience.com/ocr-free-document-understanding-with-donut-1acfbdf099be https://towardsdatascience.com/fine-tuning-ocr-free-donut-model-for-invoice-recognition-46e22dc5cff1

Note: I created my own dataset; the links above will help you understand how to train the model.