Closed: long8v closed this 2 months ago
@wangsu-google-language - could you please check the package versions?
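For comparing environments, a short script like the following could print the installed versions of the packages mentioned in this thread. It is only a sketch: `report_versions` is a hypothetical helper name, and the package list (Pillow, torch, numpy) is taken from the discussion here, not from the paper's actual setup.

```python
from importlib import metadata


def report_versions(packages=("Pillow", "torch", "numpy")):
    """Return a dict mapping each distribution name to its installed version.

    Packages that are not installed are reported as "not installed"
    instead of raising, so the output can be pasted into an issue as-is.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Print in requirements.txt style for easy comparison.
    for name, version in report_versions().items():
        print(f"{name}=={version}")
```

Posting this output from both environments would make it easier to see which dependency differs.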
I found a mistake in my code. After fixing it, I can reproduce Table 12, but only without the template "A photo depicts ".
(DSG report)
(my result in PIL 8.4.0)
Thanks!
Hey there, thank you for the great work! It really inspires my own. I tried to reproduce Table 12, specifically the CLIPScore results.
I found that in the CLIPScore repo, different package versions (such as Pillow 8.4 vs 9.4, torch 1.7 vs 2.0, or numpy 1.20.0 or higher) return different scores, which in turn give different correlation values. Also, clipscore prepends the prefix "A photo depicts" to each caption. However, I found that the TIFAv1 CLIPScore corresponds to the result computed without any prefix. When I reproduce with TIFA160, it returns slightly different values (DSG report) 0.276 / 0.191. It would be really helpful if you could share the package dependencies you used for the paper and whether you used the prefix when calculating CLIPScore. Thanks!
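To make the prefix question concrete, here is a minimal sketch of the two variants being compared. It assumes the standard CLIPScore definition, 2.5 * max(cos(image, text), 0), and uses plain Python lists in place of real CLIP embeddings. The helper names `with_prefix`, `cosine`, and `clipscore` are hypothetical and are not taken from the clipscore or DSG code.

```python
import math

W = 2.5  # rescaling weight in the CLIPScore definition


def with_prefix(caption, prefix="A photo depicts"):
    """Prepend the prompt prefix; an empty prefix leaves the caption as-is."""
    if not prefix:
        return caption
    if not prefix.endswith(" "):
        prefix += " "
    return prefix + caption


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def clipscore(image_emb, text_emb, w=W):
    """CLIPScore for one image/text embedding pair (negatives clipped to 0)."""
    return w * max(cosine(image_emb, text_emb), 0.0)


# The two text inputs that would be encoded in each setting:
caption = "a dog on a beach"
print(with_prefix(caption))             # prefixed variant
print(with_prefix(caption, prefix=""))  # no-prefix variant
```

In a real run the text embedding would come from encoding either string with CLIP, so the two settings generally produce different scores and therefore different correlations.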