dvlab-research / LLaMA-VID

Official Implementation for LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models
Apache License 2.0
622 stars 39 forks

Unable to get results when evaluating on the MSVD-QA benchmark #96

irisgong1020 closed 23 hours ago