The pipeline saves the data in two places: an lm_dataformat archive for the text, and a directory of PyTorch .pt files for the spectrogram tensors. Each .pt file holds one tensor of shape [items in file, Mel bins, frames]. For example, a file of 1000 examples with 80-dimensional Mel spectrograms, each 400 frames long, holds a tensor of shape [1000, 80, 400].
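As a rough illustration of the spectrogram side of this layout, the sketch below stacks per-example spectrograms into one [items in file, Mel bins, frames] tensor and writes it to a .pt file. The function name `save_spectrogram_shard`, the `shard_00000.pt` naming scheme, and the all-zero placeholder data are assumptions made for the example, not part of the actual pipeline. It also assumes every spectrogram in a file has the same number of frames. The text archive side is omitted.

```python
import os
import tempfile

import torch

# Sizes taken from the example in the text.
N_ITEMS, N_MELS, N_FRAMES = 1000, 80, 400


def save_spectrogram_shard(specs, out_dir, shard_index):
    """Stack [n_mels, n_frames] tensors and save them as a single .pt file.

    Hypothetical helper: the name and file naming scheme are illustrative.
    """
    # torch.stack adds a leading dimension: [items, n_mels, n_frames].
    batch = torch.stack(specs)
    path = os.path.join(out_dir, f"shard_{shard_index:05d}.pt")
    torch.save(batch, path)
    return path


# Placeholder data standing in for real Mel spectrograms.
specs = [torch.zeros(N_MELS, N_FRAMES) for _ in range(N_ITEMS)]

with tempfile.TemporaryDirectory() as out_dir:
    shard_path = save_spectrogram_shard(specs, out_dir, shard_index=0)
    loaded = torch.load(shard_path)

print(tuple(loaded.shape))  # (1000, 80, 400)
```

If the clips differ in length, they would need to be padded or cropped to a common frame count before stacking. How the real pipeline handles that is not stated here.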