TXH-mercury / VAST

Code and Model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
https://arxiv.org/abs/2305.18500
MIT License
241 stars, 17 forks

Error when labelling my own data with VAST's captioner? #14

Closed: SixGoodX closed this issue 8 months ago

SixGoodX commented 8 months ago

The format of the meta.json is as follows:

8A8360EB4BF5C0AE1BE108F5131E0276