-
Thanks for the great work. I have some questions about the BLIP feature extractor interface.
1. In the example code, you wrote
```
# torch.Size([1, 12, 768]), use features_multimodal[:,0,:] for m…
```
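For context, here is a minimal plain-PyTorch sketch (not the BLIP/LAVIS API itself; the tensor is a random stand-in) of what the `features_multimodal[:,0,:]` slice in the comment above does, assuming the features have shape `[batch, seq_len, dim]` as indicated by `torch.Size([1, 12, 768])`:

```python
import torch

# Stand-in for the multimodal features from the example:
# shape [batch=1, seq_len=12, dim=768], i.e. torch.Size([1, 12, 768]).
features_multimodal = torch.randn(1, 12, 768)

# features_multimodal[:, 0, :] keeps the embedding at sequence position 0
# (the [CLS]-token position) for every item in the batch.
cls_embedding = features_multimodal[:, 0, :]
print(cls_embedding.shape)  # torch.Size([1, 768])
```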
-
-
I tried running the [evaluation code](https://github.com/PKU-YuanGroup/Video-LLaVA/blob/main/scripts/v1_5/eval/eval_image_llavabench.sh) on [your model checkpoint](https://huggingface.co/LanguageBind/…
-
Hello, thanks for the great work.
I noticed in [MODEL_ZOO.md](https://github.com/OpenGVLab/InternVideo/blob/main/InternVideo2/multi_modality/MODEL_ZOO.md) that the performance of the InternVideo2-cli…
-
https://github.com/salesforce/LAVIS/blob/59273f651b9bffb193d1b12a235e909e9f826dda/lavis/models/blip2_models/blip2_qformer.py#L242
Hello,
I was going through the code in BLIP-2's repository and I…
-
Hi,
Congratulations on the great success of your wonderful work! I have several questions about ptp regarding the pretraining/finetuning settings described in the paper. The questions are as fo…
-
Hi, guys! Thank you so much for the project. But I have an issue downloading the pretrained models using download_models.sh. I've tried different networks, but it fails every time. Do you have another…
-
Hi, I am trying to reproduce the reported performance on the MSVD dataset (CIDEr of 120.6), but there is a gap. In my experiment, the first evaluation after initialization is poor; the initializ…
-
![screenshot-20240705-114831](https://github.com/whwu95/Cap4Video/assets/43465723/599b43dd-0ebc-4d18-9581-0b0bd07e94c3)
Where can I download MSRVTT_test_website_titles?
-
Thank you for your excellent work! However, I noticed that some JSON files required by your code for the MSRVTT and MSVD datasets are missing. Could you provide them?