[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
BSD 3-Clause "New" or "Revised" License
Change the number of sampled frames and the query_tokens size #128
Open
AllenFind opened 1 year ago
Hi,
Thanks a lot for your great work!
I am wondering whether we can change the number of sampled frames and the query_tokens size.