Hi,
Thanks for releasing the code for this amazing work! After carefully reading the paper and supplementary materials, I have two questions.
I found that you fine-tuned Inception-V3 on VGGSound data in order to calculate the FID and IS scores. Could you please release the fine-tuned checkpoint so that I can use it for comparison?
Which specific CLIP checkpoint did you use to calculate the CLIP R@1 and R@5 metrics?
Thanks in advance for your guidance and clarification!