TencentARC / ST-LLM

[ECCV 2024🔥] Official implementation of the paper "ST-LLM: Large Language Models Are Effective Temporal Learners"
Apache License 2.0

HF weights #7

Closed: mutonix closed this issue 2 months ago

mutonix commented 2 months ago

Many thanks for the great contribution! What is the difference between QA_weight and Conversation_weight in the Hugging Face repo?

farewellthree commented 2 months ago

QA_weight is trained on almost the same datasets as VideoChat2. Conversation_weight enables the global-local module and is trained mainly on caption datasets, so it works better for conversation and long video input.

mutonix commented 2 months ago

Got it! Thanks for your reply :)