zhixiongz / CLIP4CMR

A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval

Different Results than reported #1

Closed shivangibithel closed 2 years ago

shivangibithel commented 2 years ago

Hey @zhixiongz

I like your work and tried to reproduce the numbers on the NUS-WIDE dataset for the different losses. In line 169 of main.py you use `betas` as a variable, but it is never initialized anywhere. I kept the default beta values for the Adam optimizer to generate results. My results are different from, and lower than, the values reported in Table 2.

https://github.com/zhixiongz/CLIP4CMR/blob/ca75cfccc3486263b0d5f116cf546798cf31f572/main.py#L169
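For context, this is how I worked around the missing variable: passing the Adam defaults explicitly. A minimal sketch (the `torch.nn.Linear` model and learning rate here are illustrative stand-ins, not the repo's actual values):

```python
import torch

# Hypothetical stand-in for the CLIP4CMR model; any nn.Module works here.
model = torch.nn.Linear(512, 10)

# Adam's documented defaults are betas=(0.9, 0.999). Since `betas` is never
# defined in main.py, supplying the defaults explicitly reproduces what I ran.
betas = (0.9, 0.999)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, betas=betas)
```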

Please update the code with the values you used to generate the numbers in Table 2.

Thanks and regards,
Shivangi

zhixiongz commented 2 years ago

Hey @shivangibithel

Thanks a lot for pointing this out; we have updated the default values in main.py. The results reported in Table 2 on the NUS-WIDE dataset use only 1000 test samples, following previous baseline methods; I guess you probably used the full 2000 test samples in test.pkl. Since mAP is evaluated over the entire test set, more test samples lead to lower scores. We use the common dataset partition provided by Corr-AE.
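To make the pool-size effect concrete, here is a minimal sketch of per-query average precision (the building block of mAP). The relevance lists are toy values, not our data: enlarging the retrieval pool interleaves extra non-relevant items into the ranking, which pushes relevant items to later ranks and lowers AP.

```python
def average_precision(relevance):
    """AP for one query, given a binary relevance list in ranked order."""
    hits = 0
    precisions = []
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precisions.append(hits / rank)  # precision at each relevant rank
    return sum(precisions) / len(precisions) if precisions else 0.0

# Toy illustration: same two relevant items, but the larger pool pushes the
# second hit from rank 3 to rank 4, so AP drops.
small_pool = [1, 0, 1]            # AP = (1/1 + 2/3) / 2
large_pool = [1, 0, 0, 1, 0, 0]   # AP = (1/1 + 2/4) / 2
assert average_precision(large_pool) < average_precision(small_pool)
```

mAP is then the mean of these per-query AP values over all queries, which is why it is sensitive to the size of the test pool being ranked.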

We have updated the NUS-WIDE test data in the cloud storage; please see https://pan.cstcloud.cn/s/JqKbqGfTRs