RenShuhuai-Andy / TimeChat

[CVPR 2024] TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
https://arxiv.org/abs/2312.02051
BSD 3-Clause "New" or "Revised" License

Discussion: Steps for swapping Llama 2 with Llama 3 #37

Closed rahulkrprajapati closed 2 months ago

rahulkrprajapati commented 3 months ago

Hi @RenShuhuai-Andy, great work on the paper. I've been wondering how much the accuracy could improve for real-world use cases. Would swapping Llama 2 for Llama 3 help increase accuracy? And how can I go about running and testing this?

RenShuhuai-Andy commented 3 months ago

Hi, thanks for your interest.

I think replacing Llama 2 with Llama 3 requires redoing the instruction tuning. You can try replacing the model path in https://github.com/RenShuhuai-Andy/TimeChat/blob/master/train_configs/stage2_finetune_time104k_valley72k.yaml#L12, https://github.com/RenShuhuai-Andy/TimeChat/blob/master/train_configs/stage2_finetune_time104k_valley72k.yaml#L54, and https://github.com/RenShuhuai-Andy/TimeChat/blob/master/train_configs/stage2_finetune_time104k_valley72k.yaml#L75 with the model path of Llama 3.
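
For anyone who wants to script the path swap instead of editing the three lines by hand, here is a minimal sketch. It is not part of the TimeChat repo: it does a plain text replacement, so YAML comments and formatting stay untouched, and it writes to a new file so the original config is preserved. The key names in `SAMPLE` and both checkpoint paths are made-up placeholders; substitute whatever actually appears in your copy of the config and wherever your Llama 3 weights live.

```python
from pathlib import Path
from typing import List, Tuple


def swap_model_path(config_text: str, old_path: str, new_path: str) -> Tuple[str, List[int]]:
    """Replace every occurrence of old_path with new_path.

    Returns the updated text and the 1-based numbers of the lines that changed,
    so you can check that exactly the expected lines were touched.
    """
    changed: List[int] = []
    out_lines: List[str] = []
    for lineno, line in enumerate(config_text.splitlines(keepends=True), start=1):
        if old_path in line:
            line = line.replace(old_path, new_path)
            changed.append(lineno)
        out_lines.append(line)
    return "".join(out_lines), changed


def rewrite_config(src: str, dst: str, old_path: str, new_path: str) -> List[int]:
    """Read src, swap the path, and write the result to dst (src is not modified)."""
    text = Path(src).read_text(encoding="utf-8")
    new_text, changed = swap_model_path(text, old_path, new_path)
    Path(dst).write_text(new_text, encoding="utf-8")
    return changed


# Illustrative config fragment; key names and paths are assumptions, not the real file.
SAMPLE = """\
model:
  llama_model: "ckpt/llama-2-7b-chat-hf/"
datasets:
  first:
    tokenizer_name: "ckpt/llama-2-7b-chat-hf/"
  second:
    tokenizer_name: "ckpt/llama-2-7b-chat-hf/"
"""

updated, changed_lines = swap_model_path(
    SAMPLE,
    old_path="ckpt/llama-2-7b-chat-hf/",    # placeholder for the current path
    new_path="ckpt/llama-3-8b-instruct/",   # placeholder for your Llama 3 path
)
print(changed_lines)
```

On the real file you would call something like `rewrite_config("train_configs/stage2_finetune_time104k_valley72k.yaml", "train_configs/stage2_llama3.yaml", old, new)` (the output filename is hypothetical) and confirm the returned line numbers match the three linked above. This only changes the paths; the instruction tuning still has to be rerun with the new config.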