
The pipeline is to save the data in two places: an lm_dataformat archive for the text, and a directory of .pt PyTorch files for the spectrogram tensors. Each .pt file holds one tensor of shape [items in file, Mel bins, frames]. So, for a file of 1000 examples, each an 80-dimensional Mel spectrogram that is 400 frames long, the tensor would have shape [1000, 80, 400].
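As a rough illustration of the spectrogram side of this layout, here is a minimal sketch that writes one [items, Mel bins, frames] tensor to a .pt file and reads it back. The function names, the `shard_XXXXX.pt` naming scheme, and the random stand-in data are assumptions made for the example and are not taken from the repository. The text side (the lm_dataformat archive) is left out. The item count is scaled down from 1000 to 4 to keep the example light.

```python
import tempfile
from pathlib import Path

import torch


def save_spectrogram_shard(specs, out_dir, shard_index):
    """Save one [items, mel_bins, frames] tensor as a .pt file.

    The file naming scheme here is hypothetical, chosen only for illustration.
    """
    if specs.dim() != 3:
        raise ValueError(
            f"expected a 3-D tensor [items, mel_bins, frames], got shape {tuple(specs.shape)}"
        )
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"shard_{shard_index:05d}.pt"
    torch.save(specs, path)
    return path


def load_spectrogram_shard(path):
    """Load a tensor previously written by save_spectrogram_shard."""
    return torch.load(path)


# Stand-in data: 4 items, 80 Mel bins, 400 frames.
# The issue's own example would use 1000 items, giving [1000, 80, 400].
with tempfile.TemporaryDirectory() as tmp:
    specs = torch.randn(4, 80, 400)
    shard_path = save_spectrogram_shard(specs, tmp, shard_index=0)
    loaded = load_spectrogram_shard(shard_path)
    loaded_shape = tuple(loaded.shape)
    round_trip_ok = torch.equal(specs, loaded)
```

Indexing `loaded[i]` then yields the [Mel bins, frames] spectrogram for item `i` of that file. Pairing it with the matching text entry would depend on how the archive and the .pt files are kept in the same order, which the issue does not specify.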

cfoster0 / CLAP

Add dataloading code #21