Dear authors,
Thank you so much for your great contribution to the community. I'm trying to fine-tune the model to adapt it to another language (Vietnamese), but I'm finding it hard to get the training to converge. Specifically, I've processed a subset of 2,500 images from the pubtable dataset, and the output at epoch 5 looks like this:
Epoch(val) [5][2489]
0_[table_master_dataset]_word_acc: 0.0000
0_[table_master_dataset]_word_acc_ignore_case: 0.0000
0_[table_master_dataset]_word_acc_ignore_case_symbol: 0.0000
0_[table_master_dataset]_char_recall: 0.1952
0_[table_master_dataset]_char_precision: 0.0682
0_[table_master_dataset]_1-N.E.D: 0.3024
0_[table_master_dataset]_TEDS: 0.0550
0_[table_master_dataset]_Time_eval: 1399.9359
Is it normal for word_acc to be 0.0000 at this early an epoch? Could you also suggest roughly how many images are needed to fine-tune the network, and whether we should initialize from the pretrained weights to get a decent result?
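For reference, this is roughly how I was planning to initialize training from the released checkpoint. It is only a sketch: the base config name and checkpoint path below are placeholders, not the repository's actual file names.

```python
# MMOCR/MMCV-style fine-tuning config fragment (sketch, paths are placeholders).
_base_ = ['./table_master_base_config.py']  # hypothetical base config file

# Initialize from the released pretrained weights instead of training from scratch.
load_from = 'checkpoints/table_master_pretrained.pth'  # placeholder checkpoint path

# Use a smaller learning rate for fine-tuning on a small (2,500-image) subset.
optimizer = dict(type='Adam', lr=1e-4)
```

Please let me know if this is the intended way to resume from your weights, or if a different procedure is recommended.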