FudanVI / benchmarking-chinese-text-recognition

This repository contains datasets and baselines for benchmarking Chinese text recognition.
MIT License

Some curiosity about the baseline training configuration. #18

Open icecream-Tnak opened 2 years ago

icecream-Tnak commented 2 years ago

Thank you for your contribution to the community. Creating a Chinese text recognition benchmark is critical to the advancement of the field.

I would like to know more about the training configuration of some of the baseline models for the "scene" sub-dataset, such as the number of epochs, batch size, lr, weight_decay, max_length, and grad_clip.

In particular, I noticed that the TransOCR results reported in the paper (arXiv:2112.15093) differ from the results on GitHub. Were the better results obtained with different hyperparameters after the paper was submitted to arXiv?

Thank you again for your outstanding work. I hope you can get back to me, because I could not find a specific configuration file, and the default parameters in TransOCR/main.py, such as epoch = 1000, imply a training run longer than my current hardware can support.