-
I found that the moment_start_frame in the annotations doesn't match the original MP4 video. How do you calculate it?
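For reference, here is how I currently map an annotated start time to a frame index, a minimal sketch that assumes the timestamps are in seconds and uses the local MP4's own FPS (the helper and its signature are mine, not from the repo). If the annotations were produced at a different FPS, the indices would be off by exactly that ratio:

```python
# Minimal sketch (my own code, not from the repo): convert an annotated
# start time in seconds to a frame index using the local MP4's FPS.
import cv2

def moment_start_frame(start_sec: float, video_path: str) -> int:
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)  # FPS of the MP4 actually on disk
    cap.release()
    return round(start_sec * fps)    # e.g. 12.5 s at 30 fps -> frame 375
```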
-
Hi, thanks for the great work! I really love this paper and am very happy to try some VideoQA examples with this model.
In the process, I ran into a few questions. It would be really nice of you to share…
-
Hello, thank you very much for sharing your work! I've run into a couple of problems while trying to reproduce it:
1. Running main.py and videoqa.py gives an error: “ERROR:root:No token f…
-
Thanks for your contribution!
I tried to reproduce your result (zero-shot VideoQA on the MSVD dataset) with the pretrained weights [https://huggingface.co/LanguageBind/Video-LLaVA-7B/tree/main](https://…
-
Thanks for your contribution!
I tried to reproduce your result (zero-shot VideoQA on the MSVD dataset) with the given pretrained weights (EVA-G & LLaVA1.5-VideoChatGPT-Instruct 7B).
But the result …
-
In the stage-3 script, why are these two parameters (model_name_or_path, pretrain_mm_mlp_adapter) present at the same time? The former should already include the latter, right?
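My current understanding, a guess based on the common LLaVA-style training code rather than anything verified in this repo, is that model_name_or_path provides the base checkpoint while pretrain_mm_mlp_adapter separately overrides just the mm_projector weights, roughly like this:

```python
# Hedged sketch of the usual LLaVA-style pattern (my assumption, not
# verified against this repo): projector weights saved after pretraining
# are loaded on top of whatever model_name_or_path provides.
import torch

def load_mm_projector(model, pretrain_mm_mlp_adapter: str) -> None:
    weights = torch.load(pretrain_mm_mlp_adapter, map_location="cpu")
    # keep only the mm_projector entries and strip the key prefix
    projector = {k.split("mm_projector.")[-1]: v
                 for k, v in weights.items() if "mm_projector" in k}
    model.get_model().mm_projector.load_state_dict(projector)
```

If that is right, the two arguments are not redundant: stage 3 can start from a base checkpoint whose projector was trained (and saved) separately.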
-
Hello,
I saw that other issues mentioned the need to submit predictions for all frames during the testing phase. However, I encountered an error after submitting them. Can you help me solve it?
…
-
Dear authors,
Thanks for your interesting tasks. However, I have some issues with the grounded VideoQA task. I noticed that the grounded VideoQA results are a subset of the SOT task. But in the SOT c…
-
Hi, thank you so much for the great work! I have a question about the number of sampled frames. The paper mentions:
> During inference, we uniformly sample 6 frames with center crop.
I am keen to kn…
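For context, this is roughly how I implement that sampling on my side; the 224 crop size and the use of OpenCV are my own assumptions, not the repo's:

```python
# My sketch of "uniformly sample 6 frames with center crop"; crop size
# and OpenCV usage are assumptions, not taken from the repo.
import cv2
import numpy as np

def sample_frames(video_path: str, num_frames: int = 6, crop: int = 224):
    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    # evenly spaced frame indices over the whole clip
    indices = np.linspace(0, total - 1, num_frames).astype(int)
    frames = []
    for idx in indices:
        cap.set(cv2.CAP_PROP_POS_FRAMES, int(idx))
        ok, frame = cap.read()
        if not ok:
            continue
        h, w = frame.shape[:2]
        # resize so the short side equals the crop size, then center-crop
        scale = crop / min(h, w)
        frame = cv2.resize(frame, (round(w * scale), round(h * scale)))
        h, w = frame.shape[:2]
        top, left = (h - crop) // 2, (w - crop) // 2
        frames.append(frame[top:top + crop, left:left + crop])
    cap.release()
    return frames
```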
-
This is what I got trying to fine-tune on long video using stage_3_full_v7b_224_longvid.sh with two V100s (32 GB) and 125 GB of CPU RAM.
ERROR INFO:
You should probably TRAIN this model on a down-stream task to be …