-
Why is this line commented out: https://github.com/PKU-YuanGroup/Video-LLaVA/blob/main/llava/eval/video/run_inference_video_qa.py#L122 ?
-
I had some trouble reproducing the InstructBlip model results on the msvd_qa and msrvtt_qa datasets. Could you please tell me what prompt template and hyperparameters were used for these datasets? It wou…
-
Thank you for your excellent work! I'd like to express my gratitude for your efforts in contributing open-source data and models. I encountered a minor issue when loading a dataset from Hugging Fac…
-
Hello Team,
Thank you for your amazing work on this model. I was able to reproduce your remarkable results. I am looking to contribute and develop downstream inference using faiss, but I am running …
-
I read in the readme file that PaliGemma can caption a short video; can anyone guide me on how to do that?
Does it extract every frame of the video? Or does the paligemma tokenizer directly support video…
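For what it's worth, a common approach (an assumption here, not confirmed by the PaliGemma docs) is to sample a fixed number of frames uniformly from the video and caption those, rather than feeding every frame. A minimal sketch of the index computation, with the actual frame decoding left to a library such as OpenCV:

```python
def sample_indices(total_frames: int, num_frames: int = 8) -> list[int]:
    """Evenly spaced frame indices for uniform sampling across a video."""
    return [int(i * total_frames / num_frames) for i in range(num_frames)]

# e.g. for a 100-frame clip, pick 8 frames spread across its length
indices = sample_indices(100, 8)
# each selected frame would then be decoded (e.g. cv2.VideoCapture)
# and passed to the image captioning pipeline individually
```

The sampled frames would each go through the normal image path; whether the tokenizer itself accepts video directly is exactly the open question above.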
-
Excuse me, and thank you for your excellent project. I have successfully reproduced it on two datasets, MSRVTT and MSVD.
But when I reproduced it on the LSMDC dataset, the performance was much worse th…
-
In the paper, does "So we assess the models previously trained on MSR-VTT using the MSVD test set" refer to training on the entire MSR-VTT dataset and testing the model on the test set of MSVD (670…
-
Could you share run instructions that don't require building a Docker image? I want to run on a single-GPU server.
-
Thank you for your great work!
-
Hello, I must say this is good work.
However, in the code you set batch_size=256, but the paper states that it is 128 (maybe the version of the paper I downloaded is wrong? I downloaded it from…