clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
https://arxiv.org/abs/2111.15664
MIT License
5.75k stars, 466 forks

Why am I not able to achieve the same results when using the CORD training code provided by you? #210

Closed Masterchenyong closed 1 year ago

Masterchenyong commented 1 year ago

python train.py --config config/train_cord.yaml \
    --pretrained_model_name_or_path "/home/kas/chenyong/HuggingFaceModels/others/donut-base" \
    --dataset_name_or_paths '["naver-clova-ix/cord-v2"]' \
    --exp_version "test_experiment"

The minimum loss value: 10.5


dscrlc commented 10 months ago

Have you solved this problem? I have the same problem. Logs here:

Answer: TT20,000120,00020,000100,00080,000
Normed ED: 0.9756309834638817
Prediction:
Answer: PKT TELOR/PERKEDEL26,000TERONG12,000PARU23,000SBL GR ATI/AMPLA20,000NESTLE 330 ML8,00089,0008,90097,900100,0002,1005.00
Normed ED: 0.9425587467362925

loss=11, running on Windows 10 with distributed_backend=gloo and train_batch_sizes=4. I don't know where the problem lies or why the loss value is so high.
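For anyone reading the `Normed ED` values in the logs above: as far as I understand, this is a normalized edit distance between the predicted string and the ground-truth answer, so values near 1.0 mean the prediction shares almost nothing with the answer. Below is a minimal sketch, assuming the metric is the Levenshtein distance divided by the length of the longer string. The function names `edit_distance` and `normed_ed` are my own for illustration, not taken from the repository.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    # prev[j] is the distance between the previous prefix of `a` and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def normed_ed(pred: str, answer: str) -> float:
    """Edit distance scaled to [0, 1] by the longer string's length (assumed definition)."""
    longest = max(len(pred), len(answer))
    if longest == 0:
        return 0.0  # two empty strings are identical
    return edit_distance(pred, answer) / longest
```

Under this definition, 0.0 is a perfect match and 1.0 means every character has to change, so scores of 0.94 to 0.98 suggest the model output is essentially unrelated to the answer, which fits a loss that stays around 10 to 11.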