-
Hello,
Thanks for your great work!
We'd like to run zero-shot evaluation on the MSRVTT QA task. However, after following the README below (setting up zero-shot evaluation and preparing the dataset), we still encounter th…
-
Could you share the commands for running without building a Docker image? We are running on a single-GPU server.
-
Thank you for your great open-source code. I am excited about the outstanding zero-shot performance on video-text retrieval. Could you share the inference code for video-text retrieval on MSRVTT? Thank…
-
Hi, amazing work.
When you have the time, could you release the config file for finetuning on MSRVTT with Xretrieval.py? Thanks.
-
Hi, I want to reproduce the results on the MSRVTT dataset by training the model from scratch. Before training from scratch, I reproduced the MSRVTT results using the officially released checkpoint (C…
-
Why is this line commented out? https://github.com/PKU-YuanGroup/Video-LLaVA/blob/main/llava/eval/video/run_inference_video_qa.py#L122
-
Hi! Thanks for the open-source code!
I wonder if you have conducted zero-shot experiments on MSRVTT or other downstream datasets. I get the following performance on standard text-to-video retrieva…
-
Thank you for your interesting work and for sharing your code!
I'm confused about whether the zero-shot performance on MSRVTT reported [here](https://github.com/OpenGVLab/InternVideo/tree/main/D…
-
When I read many articles about VFM, I often find that methods incorporating the audio modality tend to perform better than those using only video and text. Could you please tell me if the audio modal…