cfoster0 / CLAP

Contrastive Language-Audio Pretraining
BSD 3-Clause "New" or "Revised" License

Add dataloading code #21

Closed cfoster0 closed 3 years ago

cfoster0 commented 3 years ago

The pipeline is to save the data in two places: an lm_dataformat archive for the text, and a directory of .pt PyTorch files for the spectrogram tensors, each with shape [items in file, Mel bins, frames]. So, for a file of 1000 examples, each an 80-dimensional Mel spectrogram that is 400 frames long, the tensor would have shape [1000, 80, 400].
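
A minimal sketch of the spectrogram half of that layout, assuming plain `torch.save` / `torch.load` on a single tensor per file. The helper names (`save_shard`, `load_shard`) and the `shard_00000.pt` file naming are hypothetical, not taken from the repo, and the text half (the lm_dataformat archive) is left out here. The demo uses 10 items rather than 1000 to keep the file small.

```python
import os
import tempfile

import torch

N_MELS = 80     # Mel bins, as in the example above
N_FRAMES = 400  # frames per spectrogram, as in the example above


def save_shard(spectrograms, directory, shard_index):
    """Save one [items in file, Mel bins, frames] tensor as a .pt file.

    The file naming scheme is an assumption made for illustration.
    """
    assert spectrograms.dim() == 3, "expected [items in file, Mel bins, frames]"
    path = os.path.join(directory, f"shard_{shard_index:05d}.pt")
    torch.save(spectrograms, path)
    return path


def load_shard(path):
    """Load a shard and return the [items, Mel bins, frames] tensor."""
    return torch.load(path)


# Demo with dummy data: 10 items instead of 1000.
with tempfile.TemporaryDirectory() as tmp:
    dummy = torch.zeros(10, N_MELS, N_FRAMES)
    shard_path = save_shard(dummy, tmp, shard_index=0)
    loaded = load_shard(shard_path)
    loaded_shape = tuple(loaded.shape)
    roundtrip_ok = torch.equal(loaded, dummy)
```

For scale, a full [1000, 80, 400] tensor holds 32,000,000 values, which is about 128 MB if stored as float32 (the dtype is an assumption; the issue does not specify one).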