hellomuffin / exif-as-language

official repo for the paper "EXIF as Language: Learning Cross-Modal Associations Between Images and Camera Metadata"
MIT License

Could you please provide the model of training completion? #4

Closed Admgan closed 1 year ago

Admgan commented 1 year ago

Because of limited lab resources, could you please provide a trained model for testing? I would appreciate it, and I promise to use it only for research.

etaisella commented 1 year ago

+1 :)

hellomuffin commented 1 year ago

Thank you for your interest in our work! I am sorry that I didn't check the repo while on vacation and missed this issue. I have released a version of the pre-trained weights. Let me know if you have further questions.

hellomuffin commented 1 year ago

Hi, we recently found that the checkpoint we uploaded is not the full model (it was trained for 48k steps, whereas the full model described in the paper was trained for ~73k steps), so its performance is a bit worse than the numbers reported in the paper. Sadly, the original full model and data were accidentally auto-cleaned by the cluster after a long period without access. To make up for this, we have temporarily uploaded another full model, trained for 75k steps on another random 1.5M sample of the YFCC100M dataset. Qualitatively, its performance is extremely similar to that of the original full model; quantitatively, there is a small difference, perhaps due to variance in the training data.

Specifically, the performance of this version of the full model on Columbia and DSO is as follows:

Columbia: mAP 0.93, cIoU 0.88
DSO: mAP 0.65, cIoU 0.80

Thanks again for raising this issue. We are looking into regenerating the deleted data and training the model for the full number of steps to replicate the results reported in the paper. We will get back to you soon. Sorry for the inconvenience.