Hello, wonderful project! How can I fine-tune the pre-trained models on downstream video-text retrieval datasets such as MSR-VTT, LSMDC, and MSVD? I see that a script for zero-shot retrieval is provided, but there is no script for fine-tuning on retrieval datasets.