-
Sorry to disturb you. When I reproduce the results on the LSMDC dataset, I get worse results than those in the paper. In the meanP experiment, the meanR is always around 200, rather than around 60. The log i…
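For context, MeanR in retrieval benchmarks is the mean rank of the ground-truth item across all queries, so lower is better. A minimal sketch of how it is typically computed, assuming a square text-to-video similarity matrix with matched pairs on the diagonal (the function name is mine, not from this codebase):

```python
import numpy as np

def mean_rank(sim):
    """MeanR for retrieval: sim[i, j] is the similarity between query i
    and candidate j, and the ground-truth match for query i is candidate i."""
    order = np.argsort(-sim, axis=1)  # candidates sorted best-first per query
    # 1-based rank of the diagonal (ground-truth) candidate in each row.
    ranks = np.where(order == np.arange(len(sim))[:, None])[1] + 1
    return ranks.mean()

# With an identity similarity matrix every query ranks its match first.
mean_rank(np.eye(4))  # -> 1.0
```

A MeanR stuck near N/2 for N test pairs (e.g. ~500 on the 1000-pair LSMDC test split) usually indicates near-random similarities, which points at a loading or preprocessing problem rather than a small tuning gap.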
-
Excuse me, and thank you for your excellent project. I have successfully reproduced it on two datasets, MSRVTT and MSVD.
But when I reproduced it on the LSMDC dataset, the performance was much worse th…
-
So that the model cannot easily be misled when using the original LLaVA dataset.
Meanwhile, it looks like the images are missing.
"id": "vcr-52941",
"image": "vcr1images/lsmdc_30…
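To diagnose which annotation entries point at absent files, one option is to scan the JSON against the local image root. A minimal sketch; `find_missing` and the paths in the usage note are my own illustrations, not part of the repo:

```python
import json
import os

def find_missing(annot_path, image_root):
    """Return the ids of annotation entries whose image file does not
    exist under image_root. Entries without an "image" key are skipped."""
    with open(annot_path) as f:
        entries = json.load(f)
    return [e["id"]
            for e in entries
            if "image" in e
            and not os.path.exists(os.path.join(image_root, e["image"]))]
```

Running it over the annotation file and the directory that should contain `vcr1images/` gives a quick list of ids to re-download or drop.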
-
Hi, great work and thanks for releasing the code. In Table 10 of your InternVideo2 paper, you reported results for finetuned video retrieval (both T2V and V2T) on MSR-VTT, LSMDC, DiDeMo, MSVD, Ac…
-
Hello!
I am a grad student from CMU working on the LSMDC dataset for tasks such as movie description and movie fill-in-the-blank. I found your work really interesting and wanted to run predictions …
-
Dear Authors,
I am trying to reproduce zero-shot performance with the checkpoint [ViCLIP-L-14 InternVid-10M-FLT](https://huggingface.co/OpenGVLab/ViCLIP).
However, the performance is different from …
-
Hello, I want to reproduce your code's results, but many imported packages are missing from the code header, such as youtube_dataloader, youcook_dataloader, msrvtt_dataloader, lsmdc_dataloader, model_kmean…
-
Hello, wonderful project! I wonder how to finetune the pre-trained models on downstream video-text retrieval datasets like MSR-VTT, LSMDC, and MSVD? I notice that the script for zero-shot retrie…
-
In the data folder, steps are missing for
"I just want to caption a couple of videos".
Please add the missing steps when possible.
-
Hi, thank you for sharing the model.
For evaluation, the line suggests using 6144 for the embedding dimension:
python eval.py --eval_msrvtt=1 --eval_youcook=1 --eval_lsmdc=1 --num_thread_reader=8 --embd_di…