roatienza / deep-text-recognition-benchmark

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)
Apache License 2.0

About the difference between the number of training iters in the paper and this Repo #3

Closed superPangpang closed 3 years ago

superPangpang commented 3 years ago

Thanks for your great work and source code! Table 2 in the paper lists 300 for training, but the source code runs 300,000 iterations. The data augmentations in the code are quite thorough, so I think a longer training schedule is necessary. Which one did you use in your experiments? Also, have you run similar experiments showing roughly how many iterations are needed before performance stabilizes under such a strong augmentation setting? I look forward to your reply!

roatienza commented 3 years ago

300K is the correct one. Table 2 was meant to say 300K iterations. 300K is the CLOVA AI training protocol for their STR benchmark. In my experience, training beyond 300K brings little improvement.
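For readers wondering how 300K iterations relates to epochs, here is a quick back-of-the-envelope conversion. The batch size and dataset size below are illustrative assumptions (a batch size of 192 and roughly 14.4M synthetic samples from MJSynth + SynthText are commonly cited for this benchmark, but your setup may differ):

```python
def iters_to_epochs(num_iter: int, batch_size: int, dataset_size: int) -> float:
    """Convert optimizer iterations to (fractional) passes over the dataset."""
    return num_iter * batch_size / dataset_size

# Assumed values: 300K iterations, batch size 192,
# ~14.4M synthetic training samples (MJSynth + SynthText).
print(iters_to_epochs(300_000, 192, 14_400_000))  # -> 4.0
```

So under these assumptions, 300K iterations corresponds to only a few passes over the synthetic training set, not 300 epochs.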

superPangpang commented 3 years ago

Thanks for your reply!

roatienza commented 3 years ago

Thanks. Closing...