blackprotoss / GSDM

Text Image Inpainting via Global Structure-Guided Diffusion Models (Accepted by AAAI-24)
MIT License

Cannot reproduce the quantitative results for WordAcc #8

Open LonglongaaaGo opened 3 months ago

LonglongaaaGo commented 3 months ago

Hi @blackprotoss, thanks for your awesome work!

I recently tried to reproduce the WordAcc results reported in Table 9 of the paper. When I apply the CRNN directly to the GT test set of TII-ST, Word ACC is only 40% and Char ACC is only 70%. When I instead feed the GT binary text segmentation maps into the CRNN, I get 94.59 Word ACC and 99.11 Char ACC. These results look strange. Could you give me some advice?

Thanks! (CRNN implementation: https://github.com/meijieru/crnn.pytorch)

blackprotoss commented 3 months ago

For each image, we convert the recognition results to lowercase letters for evaluation.
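As a rough illustration of the lowercase step, here is a minimal sketch of a case-insensitive evaluation. It is not the repository's evaluation script: the function names, the optional alphabet filter (digits plus lowercase letters), and the edit-distance-based definition of Char ACC are all assumptions made for this example, and the paper may define the metrics differently.

```python
import string

# Assumption for this sketch: a 36-character alphabet (digits + lowercase letters).
ALPHABET = set(string.digits + string.ascii_lowercase)


def normalize(text, keep_only_alphabet=False):
    """Lowercase a string; optionally drop characters outside ALPHABET."""
    text = text.lower()
    if keep_only_alphabet:
        text = "".join(ch for ch in text if ch in ALPHABET)
    return text


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def evaluate(predictions, labels, keep_only_alphabet=False):
    """Return (word_acc, char_acc) in percent after lowercase normalization.

    Char ACC here is an assumed definition: the share of ground-truth
    characters not lost to edit-distance errors (errors capped per word).
    """
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0, 0.0

    correct_words = 0
    total_chars = 0
    total_errors = 0
    for pred, gt in zip(predictions, labels):
        pred = normalize(pred, keep_only_alphabet)
        gt = normalize(gt, keep_only_alphabet)
        correct_words += int(pred == gt)
        total_chars += len(gt)
        total_errors += min(edit_distance(pred, gt), len(gt))

    word_acc = 100.0 * correct_words / len(labels)
    char_acc = 100.0 * (total_chars - total_errors) / total_chars if total_chars else 0.0
    return word_acc, char_acc
```

Calling `evaluate(predictions, labels)` with and without `keep_only_alphabet=True` shows how much of a gap comes from case or punctuation in the labels rather than from the recognizer itself.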

LonglongaaaGo commented 3 months ago

Hi @blackprotoss, thanks for the reply. Yes, we already apply that lowercase conversion, but the performance is still poor. The same holds for the CRNN, MORAN, and ASTER models. We also tried TrOCR-B and TrOCR-L: TrOCR-B scores 8% lower than reported in the paper, while TrOCR-L matches the performance reported in the paper.

Do I need to retrain these recognition models? Thanks for any suggestions!