LAION-AI / CLAP

Contrastive Language-Audio Pretraining
https://arxiv.org/abs/2211.06687
Creative Commons Zero v1.0 Universal

How do I tar a dataset to prepare it for finetuning? For a dataset such as ESC-50, how is the tar archive built? After packing with tar -cvf 1.tar fsx, data loading fails with an error about a missing 'flac' attribute #87

Closed: wanghua-lei closed this issue 1 year ago

lukewys commented 1 year ago

Hi, you cannot use the tar command provided by the system. You need to make an order-preserving tar (that is, the corresponding text and json for each sample need to be adjacent to each other in the tar). You can refer to our code to make the tar: https://github.com/LAION-AI/audio-dataset/blob/main/utils/make_tar_utils.py and https://github.com/LAION-AI/audio-dataset/blob/main/utils/make_tar.py
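
For readers who want to see the idea of an order-preserving tar without opening the linked utilities, here is a minimal sketch using only the Python standard library. It is not the code from the audio-dataset repository. The function name make_ordered_tar, the default extensions (.flac and .json), and the policy of skipping incomplete samples are all assumptions made for illustration.

```python
import os
import tarfile
from collections import defaultdict


def make_ordered_tar(src_dir, tar_path, extensions=(".flac", ".json")):
    """Pack files from src_dir so that all files sharing a basename
    (one sample) are stored next to each other in the archive.

    Hypothetical helper: the extension list and the skip policy are
    assumptions, not taken from the LAION utilities.
    """
    # Group file names by basename, e.g. "1-100032-A-0" -> [".flac", ".json"].
    groups = defaultdict(list)
    for name in os.listdir(src_dir):
        stem, ext = os.path.splitext(name)
        if ext in extensions:
            groups[stem].append(name)

    written = []
    with tarfile.open(tar_path, "w") as tar:
        for stem in sorted(groups):
            members = sorted(groups[stem])
            # Assumed policy: skip samples that lack one of the expected files.
            if len(members) != len(extensions):
                continue
            for name in members:
                # arcname drops the directory prefix so only the file name is stored.
                tar.add(os.path.join(src_dir, name), arcname=name)
                written.append(name)
    return written
```

Calling make_ordered_tar("fsx", "1.tar") would write the files of each sample back to back and return the member names in archive order.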

wanghua-lei commented 1 year ago

Thanks for the answer. Using the command tar -cvf 1.tar fsx/* solved the problem.
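
One plausible reason fsx/* helps is that the shell expands the glob in sorted order, so files with the same basename end up adjacent, whereas archiving the directory itself lets tar walk entries in filesystem order. That explanation is an assumption and was not confirmed in this thread. The sketch below checks an existing tar for samples whose files are not stored contiguously. The name find_split_samples is hypothetical, and grouping by basename is an assumed convention for how the loader pairs files.

```python
import os
import tarfile


def find_split_samples(tar_path):
    """Return basenames whose files are not stored contiguously in the tar.

    Hypothetical checker: assumes one sample = all files sharing a basename.
    """
    seen = set()      # basenames encountered so far
    split = []        # basenames that reappear after a different sample
    previous = None
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue  # ignore directory entries such as "fsx/"
            stem = os.path.splitext(os.path.basename(member.name))[0]
            if stem != previous:
                # A basename seen earlier and showing up again means the
                # sample's files were interrupted by another sample.
                if stem in seen and stem not in split:
                    split.append(stem)
                seen.add(stem)
                previous = stem
    return split
```

An empty list from find_split_samples("1.tar") suggests that every sample's files sit together. Any returned basename points at a sample whose audio and metadata were separated.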