bytedance / SALMONN

SALMONN: Speech Audio Language Music Open Neural Network
https://bytedance.github.io/SALMONN/
Apache License 2.0

video-salmonn vicuna question #86

Open HeChengHui opened 4 days ago

HeChengHui commented 4 days ago

@BriansIDP Thank you for your work.

What is the VRAM requirement to run inference? I am getting an OOM error with lmsys/vicuna-13b, but lmsys/vicuna-7b gives me a size mismatch error.
Or am I using the wrong model?

BriansIDP commented 4 days ago

Thank you for the question. Video-SALMONN was trained with vicuna-13b, so the input dimension of the 7b model does not match the Q-Former output of video-SALMONN, which is what causes the size mismatch error. To fit the 13b model in memory, it would be helpful to try quantization (with a bit of performance loss).
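As a rough illustration of both points, here is a minimal sketch in plain Python. It is not taken from the video-SALMONN code: the helper names `weight_gib` and `check_projection` are hypothetical, the parameter counts are approximate, and it assumes the trained Q-Former projection outputs vectors of the vicuna-13b hidden size (5120). The estimate covers LLM weights only, so activations, the KV cache, and the audio/visual encoders would come on top.

```python
# Approximate figures for the LLaMA-based Vicuna models (assumed, rounded).
PARAMS = {"vicuna-7b": 6.7e9, "vicuna-13b": 13.0e9}
HIDDEN_SIZE = {"vicuna-7b": 4096, "vicuna-13b": 5120}


def weight_gib(n_params: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def check_projection(proj_out_dim: int, llm_name: str) -> bool:
    """True if a projection of width proj_out_dim fits the LLM's embedding width."""
    return proj_out_dim == HIDDEN_SIZE[llm_name]


if __name__ == "__main__":
    # Weights-only footprint of the 13b model at different precisions.
    for bits in (16, 8, 4):
        gib = weight_gib(PARAMS["vicuna-13b"], bits)
        print(f"vicuna-13b at {bits}-bit: ~{gib:.1f} GiB for weights")

    # Assumption: the checkpoint's projection targets the 13b hidden size.
    proj_out_dim = HIDDEN_SIZE["vicuna-13b"]
    for name in HIDDEN_SIZE:
        ok = check_projection(proj_out_dim, name)
        print(f"{name}: {'compatible' if ok else 'size mismatch'}")
```

Under these assumptions, the 13b weights alone need roughly 24 GiB in 16-bit precision, which would explain an OOM on a 24 GB card, while 4-bit quantization brings the weights down to about a quarter of that.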

HeChengHui commented 4 days ago

@BriansIDP Does that mean I can use something like TheBloke/vicuna-13B-v1.5-16K-AWQ just by setting it in the config?