dvlab-research / LLaMA-VID

LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models (ECCV 2024)
Apache License 2.0
693 stars, 43 forks

abnormal outputs for llama-vid-7b-full-224-video-fps-1 ckpt #68

Open YulongBonjour opened 7 months ago

YulongBonjour commented 7 months ago

Dear authors,

Thank you for your contributions to video understanding. I have run into an issue while running your code on a V100: the output from your llama-vid-7b-full-224-video-fps-1 checkpoint is always "222222222222...". Is the checkpoint correct? I have not made any changes to your code, and the issue occurs when launching activitynet.eval.sh. I hope to hear from you soon. Thanks!

yanwei-li commented 6 months ago

Hi, activitynet_eval.py should not print the output. In some cases the model may produce repeated output; you can change the generation parameters, such as temperature or repetition penalty, to avoid this.
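To illustrate what those two parameters do, here is a minimal plain-Python sketch on a toy 4-token vocabulary. It is not code from this repository: the function names and numeric values are hypothetical, and the penalty rule shown is the common CTRL-style one (divide positive logits, multiply negative ones). In Hugging Face `transformers`, the corresponding `generate` arguments are `temperature` (which only takes effect with `do_sample=True`) and `repetition_penalty`; whether the evaluation script forwards them to `generate` is an assumption you would need to check.

```python
import math


def apply_repetition_penalty(logits, generated_ids, penalty):
    """Lower the scores of tokens that were already generated (CTRL-style)."""
    out = list(logits)
    for tok in set(generated_ids):
        # Dividing a positive logit or multiplying a negative one
        # both reduce that token's probability when penalty > 1.
        out[tok] = out[tok] / penalty if out[tok] > 0 else out[tok] * penalty
    return out


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; temperature > 1 flattens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: token 2 dominates and has already been emitted three times.
logits = [1.0, 0.5, 3.0, -1.0]
history = [2, 2, 2]

before = softmax_with_temperature(logits, 1.0)

# Effect of the repetition penalty alone (hypothetical value 1.2).
with_penalty = softmax_with_temperature(
    apply_repetition_penalty(logits, history, 1.2), 1.0
)

# Effect of a higher sampling temperature alone (hypothetical value 1.5).
with_temperature = softmax_with_temperature(logits, 1.5)

print(f"p(token 2) baseline:         {before[2]:.3f}")
print(f"p(token 2) with penalty:     {with_penalty[2]:.3f}")
print(f"p(token 2) with temperature: {with_temperature[2]:.3f}")
```

Both adjustments reduce the probability of the repeated token, which makes a run such as "2222..." less likely during sampling. Note that a temperature below 1 has the opposite effect and sharpens the distribution.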