-
I want to use this code base to test the performance of ALPRO on MSRVTT, but I could not find a tutorial on how to process the videos and run text-to-video retrieval.
-
I used an A800 and changed the batch size to 32. The other parameters are consistent with the appendix of the paper. Why can I only achieve 53%?
-
Hi, I found the MSVD-QA JSON files in previous issues, but msvd_qa_answer_list.json seems to be missing.
Could you please provide it? Thanks!
-
When I read many articles about VFM, I often find that methods incorporating the audio modality tend to perform better than those using only video and text. Could you please tell me if the audio modal…
-
Thanks for the great work.
I evaluated the zero-shot performance of the 25M pre-trained checkpoint on the DiDeMo dataset. My command is:
```
export VL_DATA_DIR=/home/renshuhuai/VindLU/
export VL_EXP_DIR…
-
Hi,
I want to train the model on the MSR-VTT dataset. It tells me that I need a pkl file, but I can only find the mp4 and txt files. How can I convert them to a pkl file, or where can I find one?
-
Thanks for your work!
Could you upload the model's pretrained checkpoint file?
I want to use the weights to caption video input.
Thank you.
-
Hello, on the front page of this repo you mention extracting video features with a finetuned CLIP, finetuned in the CLIP4CLIP way. Could you provide this finetuned CLIP checkpoint?
Thanks!
-
Hi! I'm trying to pretrain VindLU using the 5M data. Could you provide the pretraining logs for reference? Thanks!
-
Thanks for sharing your code. Is it normal to get R1=30 with train_titles.py? After running the score fusion, the title matrix does not improve on the video matrix.
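
To make the question concrete, here is a minimal sketch of what I mean by score fusion and R1. It is not the repo's code: I am assuming fusion is a weighted sum of the two text-to-video similarity matrices and that the ground-truth video for query i sits at column i. The names `recall_at_1`, `fuse_scores`, and `alpha`, and the toy matrices, are made up for illustration.

```python
def recall_at_1(sim):
    """Fraction of queries (rows) whose highest-scoring video (column) is the match.

    Assumption: the ground-truth video for query i is at column i.
    """
    hits = 0
    for i, row in enumerate(sim):
        best = max(range(len(row)), key=row.__getitem__)  # argmax over columns
        if best == i:
            hits += 1
    return hits / len(sim)


def fuse_scores(video_sim, title_sim, alpha=0.5):
    """Weighted sum of two similarity matrices (assumed fusion rule, hypothetical weight)."""
    return [
        [alpha * v + (1 - alpha) * t for v, t in zip(v_row, t_row)]
        for v_row, t_row in zip(video_sim, title_sim)
    ]


# Toy 3x3 matrices, invented for illustration only.
video_sim = [[0.9, 0.1, 0.2], [0.3, 0.4, 0.5], [0.2, 0.1, 0.8]]
title_sim = [[0.5, 0.2, 0.1], [0.1, 0.9, 0.2], [0.3, 0.2, 0.4]]

fused = fuse_scores(video_sim, title_sim, alpha=0.5)
print(recall_at_1(video_sim), recall_at_1(title_sim), recall_at_1(fused))
```

In this toy case the fused matrix recovers the query that the video matrix alone gets wrong. In my runs the fused R1 stays the same as the video-only R1.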