TXH-mercury / VAST

Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
https://arxiv.org/abs/2305.18500
MIT License
238 stars, 17 forks

How can I fine-tune a model for a downstream task? #12

Open echo233 opened 8 months ago

echo233 commented 8 months ago

I need to evaluate the model's performance on the MSRVTT dataset. How can this be done? Could you provide a tutorial for it?

Fanzy27 commented 7 months ago

See README.md.
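
For context, results on MSRVTT retrieval are commonly reported as Recall@K over a text-to-video similarity matrix. The following is only a minimal sketch of that metric, assuming the downstream task is text-to-video retrieval and that similarity scores have already been computed. The function name `recall_at_k`, the convention that text i matches video i, and the toy matrix are all hypothetical and are not taken from the VAST code base; the actual fine-tuning and evaluation commands are in the repository's README.md.

```python
def recall_at_k(sim, k):
    """Fraction of text queries whose matching video ranks in the top k.

    sim[i][j] is the similarity between text i and video j.
    Assumption (hypothetical layout): the correct video for text i is video i.
    """
    hits = 0
    for i, row in enumerate(sim):
        # Zero-based rank of the correct video:
        # the number of videos scored strictly higher than it.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return hits / len(sim)


# Toy 3x3 similarity matrix with made-up numbers, for illustration only.
toy_sim = [
    [0.9, 0.1, 0.2],
    [0.3, 0.2, 0.8],
    [0.1, 0.4, 0.7],
]

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"R@{k} = {recall_at_k(toy_sim, k):.3f}")
```

In practice the similarity matrix would come from the fine-tuned model's text and video embeddings on the MSRVTT test split, and K is typically 1, 5, and 10.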