pytorch / audio

Data manipulation and transformation for audio signal processing, powered by PyTorch
https://pytorch.org/audio
BSD 2-Clause "Simplified" License

torchaudio.compliance.kaldi.fbank #1245

Open qmpzzpmq opened 3 years ago

qmpzzpmq commented 3 years ago

Please support batched Kaldi fbank computation. The documentation says "waveform (Tensor) – Tensor of audio of size (c, n) where c is in the range [0,2)", so right now only single-utterance computation is supported.

mthrok commented 3 years ago

Thanks for the feedback. This is certainly important, and we will try to address it. We are thinking of tweaking torchaudio.compliance.kaldi. We do not have an immediate action plan at the moment, but we will try to come back to this as soon as possible.

Oktai15 commented 3 years ago

@qmpzzpmq you can use torchaudio.transforms.MelSpectrogram as an alternative.

qmpzzpmq commented 3 years ago

@Oktai15 Hi, I am just wondering whether the results of the two are the same. From the descriptions, the results look different.

Oktai15 commented 3 years ago

@qmpzzpmq for example, check this issue: https://github.com/pytorch/audio/issues/157#issuecomment-513872666

qmpzzpmq commented 3 years ago

@Oktai15 Thanks for your example. I will test whether they give the same result, but it looks like there are still some parameters to check.

haha010508 commented 1 year ago

I found that FBank cannot run in async mode. Can anyone fix this? Thanks!