LAION-AI / audio-dataset

Audio Dataset for training CLAP and other models

When possible, prefer saving parquet with URLs inside #13

Open rom1504 opened 2 years ago

rom1504 commented 2 years ago

As with image datasets, it's better to first save a URL + metadata file as parquet. That file can be distributed without copyright issues.

Then a tool like img2dataset can handle the download.
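
To make the idea concrete, here is a minimal sketch of building such a URL + metadata file. The column names (`url`, `text`, `license`), the example records, and the output filename are all assumptions for illustration, not a schema this repo has agreed on. Writing the file assumes pandas with a parquet engine such as pyarrow is installed.

```python
from typing import Dict, List

# Hypothetical records: URLs, captions and licenses are placeholders.
SAMPLES: List[Dict[str, str]] = [
    {"url": "https://example.com/audio/dog_bark.flac",
     "text": "a dog barking", "license": "CC-BY-4.0"},
    {"url": "https://example.com/audio/rain.flac",
     "text": "rain on a window", "license": "CC0-1.0"},
]


def build_rows(samples: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep one row per http(s) URL and normalise the metadata columns."""
    rows: List[Dict[str, str]] = []
    seen = set()
    for sample in samples:
        url = sample.get("url", "").strip()
        # Skip entries without a usable URL, and skip duplicates.
        if not url.startswith(("http://", "https://")) or url in seen:
            continue
        seen.add(url)
        rows.append({
            "url": url,
            "text": sample.get("text", ""),
            "license": sample.get("license", ""),
        })
    return rows


def write_parquet(rows: List[Dict[str, str]], path: str) -> None:
    """Write the rows to a parquet file (needs pandas + pyarrow or fastparquet)."""
    import pandas as pd  # imported lazily so build_rows works without it

    pd.DataFrame(rows).to_parquet(path, index=False)


if __name__ == "__main__":
    write_parquet(build_rows(SAMPLES), "audio_urls.parquet")
```

Only the parquet file gets shared; the audio itself stays at the source URLs. A downloader can then read the `url` column. img2dataset, for example, takes `--input_format parquet` and `--url_col`, though whether it fits audio files as-is is not something this thread confirms.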

Let's add that to the README here.