-
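Hello, and thank you for your work and for open-sourcing the code 👍👍👍! I have two questions:
- Were any video-domain multimodal datasets, such as MSR-VTT, MSVD, or VATEX, used at any point during the training of Osprey?
- I see that you used datasets such as COCO and RefCOCO. Do they include MS-COCO? It seems that MS-COCO is a subset of COCO 😀
Many thanks! 💐💐💐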
-
In the paper, "So we assess the models previously trained on MSR-VTT using the MSVD test set" refers to training on the entire MSRVTT dataset and testing the model on the MSVD test set (670…
-
Hello,
Would you mind sharing your pre-trained model for those who would like to try your system without going through training?
Thanks,
Moustafa
-
Hello, wonderful project! How can I finetune the pre-trained models on downstream video-text retrieval datasets like MSR-VTT, LSMDC, and MSVD? I notice that the script for zero-shot retrie…
-
I guess it's the regularization hyperparameter of $Loss_{SPARSE}$, *i.e.* the $\lambda$ in your paper. In the appendix, it seems that for MSR-VTT the model performs best when $\lambda = 5$. But why …
-
Congrats! Nice work on zero-shot captioning.
The paper reports zero-shot video captioning results on MSR-VTT, Activity-Net, etc. But in this repo, I couldn't find codes…
-
Could you share a link to the MSRVTT traininglabel.json file used in your code?
-
Hello, on the home page of this repo you mention extracting video features with a finetuned CLIP, finetuned in the CLIP4CLIP manner. Could you share this finetuned CLIP checkpoint?
Thanks!
-
||link|
|----|---|
|paper| [HowTo100M: Learning a Text-Video Embedding by Watching Hundred Million Narrated Video Clips](https://openaccess.thecvf.com/content_ICCV_2019/papers/Miech_HowTo100M_Learni…