Hello,
I get surprisingly low results when running a test on cord-v2 to reproduce the paper's results. I use the following command line:

python3 test.py --pretrained_model_name_or_path "naver-clova-ix/donut-base-finetuned-cord-v2" --dataset "naver-clova-ix/cord-v2" --split "test"

and I get the following results:

Total number of samples: 100, Tree Edit Distance (TED) based accuracy score: 0.17636126902335467, F1 accuracy score: 0.1259655377302436

This is far from the expected 90% TED and 84% F1 score. I haven't changed anything in the code, and I run my tests on a single V100 GPU. Have I missed something?

Thanks in advance