-
Hi!
Thanks for your excellent work!
I wonder how to reproduce the results on the STAR and MSRVTT datasets. It seems that they don't have id2word.txt, word2id.txt, or the intermediate results, ca…
-
train_file: 'datasets/annotations_all/msrvtt_caption/train.jsonl'
test_file: 'datasets/annotations_all/msrvtt_caption/test.jsonl'
video_root: "datasets/MSRVTT/data/MSRVTT/videos/all"
I get YouTub…
-
Hello, wonderful project! I wonder how to fine-tune the pre-trained models on downstream video-text retrieval datasets like MSR-VTT, LSMDC, and MSVD. I notice that the script for zero-shot retrie…
-
Hi,
Congratulations on the great work!
Would you mind providing a pointer to where you found the dataset splits for the captioning datasets, as it seems they are not always consistent with the…
-
Thank you for sharing your work. I would sincerely appreciate a clarification about the mask ratio. Following the closed issue "About 'compute_trick_metric'", I set the seed to 42 and the mask ratio to 0.5 for msrvtt, but R1…
-
Hi! I have a problem and need your help. You validate your model on Video QA datasets such as MSVD-QA, which do not have audio clips. How do you deal with this?
-
I now need to validate the performance on the MSRVTT dataset. How can this be implemented? Could you provide a corresponding tutorial?
-
Hello, nice job!
I cannot reproduce the MSRVTT fine-tuned model, and I set each argument as in the [log](https://pjlab-gvm-data.oss-cn-shanghai.aliyuncs.com/internvideo/retrieval/msrvtt/kc4_finetune_1e-32…
-
Thank you for your great open-source code. I am excited about the outstanding zero-shot performance on video-text retrieval. Could you share the inference code for video-text retrieval on MSRVTT? Thank…
-
These are the results I've got on MSRVTT, which are far worse than the paper's results:
There must be something wrong in my test process. Here's how I got them:
1. I've tried to run the text-…