Closed by mutonix 2 months ago
QA_weight is trained on almost the same datasets as VideoChat2. Conversation_weight enables the global-local module and is trained mainly on caption datasets; it works better for conversation and long video input.
Got it! Thanks for your reply :)
Many thanks for the great contribution! What is the difference between QA_weight and Conversation_weight in the Hugging Face repo?