The performance for video caption seems poor

Hi @Hyu-Zhang, thanks for your interest in our captioning algorithm, and sorry for the inconvenience. This issue seems to be a duplicate of https://github.com/snap-research/Panda-70M/issues/12. I hypothesize that this happens because you are using a different tokenizer or LLM. Did you follow this guideline to prepare the vicuna-7b-v0 weights? Basically, you need to first download the original weights and then apply the delta weights to them. Could you please check that? You can also look at some issues (like this one) in the FastChat repo for reference!
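In case it helps to see what "applying delta weights" means, here is a toy sketch. It assumes the usual convention that the released delta is added, parameter by parameter, to the original weights to recover the final model. The function `apply_delta` and the tiny dictionaries below are hypothetical stand-ins for illustration only; this is not the FastChat script, and real checkpoints hold tensors rather than Python lists.

```python
def apply_delta(base, delta):
    """Toy illustration: target[name] = base[name] + delta[name].

    `base` and `delta` are hypothetical stand-ins for model state dicts,
    mapping parameter names to flat lists of floats.
    """
    # Both checkpoints must describe exactly the same set of parameters.
    if base.keys() != delta.keys():
        mismatched = sorted(base.keys() ^ delta.keys())
        raise KeyError(f"parameter names differ: {mismatched}")

    target = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        # A shape mismatch usually means the wrong base model was downloaded.
        if len(base_values) != len(delta_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        target[name] = [b + d for b, d in zip(base_values, delta_values)]
    return target


# Made-up numbers, chosen only to show the arithmetic.
base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
delta = {"layer.weight": [0.25, -0.5], "layer.bias": [0.25]}
target = apply_delta(base, delta)
print(target)
```

If the delta is skipped, or added to a different base model, the resulting weights no longer match the tokenizer and model the captioner was trained with, which would be consistent with the poor captions reported here.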
