clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
https://arxiv.org/abs/2111.15664
MIT License

Model Consistently Mispredicting Specific Character in Invoice Number #260

Open Codedrainer opened 11 months ago


Hello,

I’ve encountered an issue with the model's predictions on invoice numbers. My dataset consists of 800 training images, 100 testing images, and 100 validation images. The model has been trained successfully, yielding the following accuracy scores:

Total number of samples: 100
Tree Edit Distance (TED) based accuracy score: 0.9476799242424243
F1 accuracy score: 0.5213032581453634

Despite the promising TED accuracy of about 94.8%, a closer look at the predictions revealed a persistent error. The model is meant to parse documents containing a 15-character alphanumeric invoice number, but it consistently mispredicts the third character from the end of that number, reading a '2' as a '1'. This error appeared in 94 of the 100 test images.
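
A minimal sketch of the kind of per-position check described above, assuming the ground-truth and predicted invoice numbers are available as plain strings. The function name `position_error_counts` and the sample invoice numbers are hypothetical and made up for illustration; they are not taken from the dataset or from the Donut codebase.

```python
from collections import Counter

def position_error_counts(pairs):
    """Count character mismatches per position, indexed from the end of the string.

    `pairs` is an iterable of (ground_truth, prediction) strings.
    Pairs of different length are counted under "length_mismatch",
    because positions cannot be compared one-to-one in that case.
    """
    counts = Counter()
    for truth, pred in pairs:
        if len(truth) != len(pred):
            counts["length_mismatch"] += 1
            continue
        for i, (t, p) in enumerate(zip(truth, pred)):
            if t != p:
                # Negative index: -3 means "third character from the end".
                counts[i - len(truth)] += 1
    return counts

# Hypothetical 15-character invoice numbers (illustrative only).
pairs = [
    ("INV2023000A0245", "INV2023000A0145"),
    ("INV2023000B7210", "INV2023000B7110"),
    ("INV2023000C9999", "INV2023000C9999"),
]
print(position_error_counts(pairs))
```

A count concentrated on a single position, as in this toy input, is the pattern reported here for the third character from the end.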

This misprediction is critical: a single incorrect character makes the entire invoice number wrong, which calls into question whether the model can be used for this task.
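
To illustrate why one wrong character can coexist with a high similarity-style score, here is a small sketch. It is not the Donut evaluation code: it uses a plain character-level Levenshtein distance as a simplified stand-in for an edit-distance-based score, and the invoice numbers are hypothetical.

```python
def levenshtein(a, b):
    """Character-level edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def similarity(truth, pred):
    """1.0 for identical strings, decreasing with the normalized edit distance."""
    longest = max(len(truth), len(pred))
    return 1.0 if longest == 0 else 1.0 - levenshtein(truth, pred) / longest

# Hypothetical 15-character invoice number with one wrong character.
truth = "INV2023000A0245"
pred = "INV2023000A0145"

print(f"similarity:  {similarity(truth, pred):.4f}")
print(f"exact match: {truth == pred}")
```

With one substitution in 15 characters, the similarity stays above 0.93 while the exact-match check fails, so a field-level exact-match rate is the more relevant number for an identifier such as an invoice number.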

I am seeking guidance or recommendations to improve the model's precision in predicting this specific character within the invoice number. Any assistance or suggestions would be immensely appreciated.

Thank you.