muzairkhattak / ViFi-CLIP

[CVPR 2023] Official repository of paper titled "Fine-tuned CLIP models are efficient video learners".
https://muzairkhattak.github.io/ViFi-CLIP/
MIT License

Inquiry about the Duration of Fine-tuning CLIP on the Kinetics Dataset #14

Closed: SHIBOYA closed this issue 10 months ago

SHIBOYA commented 11 months ago

I have been following your work closely and am particularly interested in the methodology used in your recent research.

I am writing to ask about fine-tuning the CLIP model on the Kinetics dataset, which was part of your study. Specifically, I would like to know approximately how long this fine-tuning takes.

Additionally, any tips or recommendations you might have based on your experience with this process would be greatly appreciated.

Thank you very much for your time and assistance. I look forward to your response and any insights you can provide.

Best regards!

muzairkhattak commented 10 months ago

Hi @SHIBOYA,

Thank you for showing interest in ViFi-CLIP!

Training ViFi-CLIP B/16 on the K400 (Kinetics-400) dataset takes around 22 hours on 8 A100 GPUs.

Thank you!
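
For readers budgeting compute, the figure quoted above (around 22 hours on 8 A100 GPUs) can be turned into a rough GPU-hour estimate. The sketch below is only back-of-the-envelope arithmetic: the function names are hypothetical, and the assumption that wall-clock time scales linearly with GPU count is not something reported in this thread. In practice, fewer GPUs may also force a smaller batch size or gradient accumulation, which can change both runtime and results.

```python
# Rough compute-budget arithmetic based on the figure quoted in this thread:
# about 22 hours on 8 A100 GPUs for ViFi-CLIP B/16 on K400.
# ASSUMPTION: ideal linear scaling with GPU count (not reported by the authors).

def gpu_hours(wall_clock_hours: float, num_gpus: int) -> float:
    """Total GPU-hours consumed by a run."""
    return wall_clock_hours * num_gpus


def estimated_wall_clock(total_gpu_hours: float, num_gpus: int) -> float:
    """Estimated wall-clock hours on num_gpus, assuming ideal linear scaling."""
    if num_gpus <= 0:
        raise ValueError("num_gpus must be positive")
    return total_gpu_hours / num_gpus


if __name__ == "__main__":
    budget = gpu_hours(22, 8)  # figure quoted in the thread
    print(f"Approximate budget: {budget:.0f} A100 GPU-hours")
    for n in (8, 4, 2, 1):
        hours = estimated_wall_clock(budget, n)
        print(f"{n} GPU(s): ~{hours:.0f} hours (linear-scaling assumption)")
```

Treat these numbers as a lower bound on smaller setups, since real scaling is rarely perfectly linear and data loading for video can become the bottleneck.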