DAMO-NLP-SG / Video-LLaMA

[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
BSD 3-Clause "New" or "Revised" License

Prompt #132

Open tobyperrett opened 9 months ago

tobyperrett commented 9 months ago

Hi. What is the full text that the model sees in the demo running on Hugging Face? Does it include any special tags, system messages, etc.? Thanks.