TXH-mercury / VAST

Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
https://arxiv.org/abs/2305.18500
MIT License
241 stars, 17 forks

Nice work! #3

Open jetwu-create opened 1 year ago

jetwu-create commented 1 year ago

This is nice work. When will the code be released?

changeAI commented 1 year ago

Good job! Not just the model.

ymartin-mw commented 1 year ago

Any update on the schedule for releasing the code and model? I would like to get video and text embeddings that live in the same space, for textual search over videos.

abhimanyu891998 commented 1 year ago

Any updates on the code release and model checkpoints? Thanks!

TXH-mercury commented 1 year ago

@abhimanyu891998 @changeAI @ymartin-mw @jetwu-create Thank you for your recognition of our work. The code and model are planned for release in mid-November.

Jxu-Thu commented 11 months ago

Great work! What is the plan for releasing the dataset?

zzchust commented 9 months ago

waiting.

TXH-mercury commented 9 months ago

@zzchust @abhimanyu891998 @Jxu-Thu @changeAI Sorry for the late update! After overcoming all restrictions, VAST has now been released! Please leave a message if you run into any problems when using this codebase. Thank you.