Bad performance of Charades

RenShuhuai-Andy / TimeChat

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding

BSD 3-Clause "New" or "Revised" License

267 stars 23 forks source link

Hi, thanks for your interest.

Our released ckpt is different from the version used in the paper. The released ckpt was trained after cleaning the code and fixing a minor bug in QuerYD instructions data (some videos have the same start and end timestamps in the raw annotations file, so we only use one timestamp in the revision). In our evaluation, the performance of the released ckpt on YouCook2 is higher than that in the paper, while the performance on Charades-STS & QVHighlight is lower. We also note that the output generated by LLM is different each time, which may cause fluctuations in the evaluation results.

We have uploaded the ckpt used in our paper, please refer to https://huggingface.co/ShuhuaiRen/TimeChat-7b-paper. With this ckpt, I believe you can reproduce the results in our paper.

RenShuhuai-Andy / TimeChat

Bad performance of Charades #14