wenyu1009 / RTSRN

MIT License
16 stars 2 forks source link

Has the language model been pre-trained? #7

Open TriplePool opened 6 months ago

TriplePool commented 6 months ago

Thank you for your excellent work. I notice that the language model used in your code is randomly initialized, and I cannot reproduce the reported single-stage results with the setting from the README:

```
CUDA_VISIBLE_DEVICES=0 python3 main.py --arch="rtsrn" --test_model="CRNN" --batch_size=48 --STN --sr_share --gradient --use_distill --stu_iter=1 --vis_dir='test' --mask --triple_clues --text_focus --lca
```

My results:

```
{'accuracy_avg': 0.5236999999999999, 'acc_list': {'easy': 0.6467, 'medium': 0.5372, 'hard': 0.3872, 'epoch': 441}, 'psnr_avg': 21.116845333333334, 'ssim_avg': 0.7732589999999999, 'epoch': 441}
```

Is the performance drop due to not using a pre-trained language model? If so, could you please provide the pre-trained language model weights?

wenyu1009 commented 6 months ago

Hi, our work is based on C3-STISR, but its authors released only the core code, without a `.pth` checkpoint for the language model. Therefore, our language model is trained from scratch with random initialization.
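For anyone checking their own setup: a minimal sketch of the distinction being discussed, i.e. keeping the language model randomly initialized (as in the released code) versus loading a pre-trained checkpoint if one were available. `TinyLM` and `lm_pretrained.pth` are purely illustrative names, not part of the RTSRN or C3-STISR codebase.

```python
# Hypothetical sketch: random init vs. loading a pre-trained checkpoint.
# TinyLM and the checkpoint path are illustrative stand-ins only.
import torch
import torch.nn as nn

class TinyLM(nn.Module):
    """Stand-in for a character-level language model."""
    def __init__(self, vocab=37, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, vocab)

    def forward(self, x):
        return self.head(self.embed(x))

lm = TinyLM()  # random initialization, as in the released code

ckpt_path = "lm_pretrained.pth"  # hypothetical checkpoint file
try:
    state = torch.load(ckpt_path, map_location="cpu")
    lm.load_state_dict(state)
    print("loaded pre-trained weights")
except FileNotFoundError:
    print("no checkpoint found, keeping random init")
```

With no checkpoint on disk, the model simply keeps its random weights, which matches the behavior described above.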

The drop in reproduced performance likely has two causes:

  1. The code on GitHub has been modified multiple times, and issues may have been introduced during those modifications. We will provide the original code.

  2. Incorrect hyperparameters, or normal run-to-run fluctuation during reproduction. Our training logs are in the log folder on GitHub for reference.