TXH-mercury / VAST

Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
https://arxiv.org/abs/2305.18500
MIT License
243 stars, 17 forks

Can you share the checkpoints of the fine-tuned models? #25

Open wonzin opened 5 months ago

wonzin commented 5 months ago

I have fine-tuned VAST for QA tasks, but I'm not reaching the accuracy reported in your paper. Could you share the checkpoints of the fine-tuned models, particularly for the QA tasks? I would appreciate your help.