mravanelli / pytorch-kaldi

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.
2.37k stars, 446 forks

Problem with loading rawwav as feature, it costs too much time #212

Closed: liuzz13 closed this issue 4 years ago

liuzz13 commented 4 years ago

Hi, recently I ran some experiments with pytorch-kaldi. I found that loading the data takes too much time when using raw wav as the feature, even more than the training itself. Is there a better dataloader? Or how can I reduce the data loading time?

mravanelli commented 4 years ago

Yes, we are aware of it. That's because the toolkit is designed to read features in Kaldi format rather than waveforms. For the speechbrain project we are developing a completely new dataloader that can read any kind of features (including waveforms) very quickly.

Best,

Mirco

On Sun, 9 Feb 2020 at 08:37, liuzz13 notifications@github.com wrote:

Hi, recently I ran some experiments with pytorch-kaldi. I found that loading the data takes too much time when using raw wav as the feature, even more than the training itself. Is there a better dataloader? Or how can I reduce the data loading time?


lezasantaizi commented 4 years ago

Hi, recently I ran some experiments with pytorch-kaldi. I found that loading the data takes too much time when using raw wav as the feature, even more than the training itself. Is there a better dataloader? Or how can I reduce the data loading time?

Convert the wav files to features.
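
One possible reading of this suggestion is to decode each wav file once, compute frame-level features, and store them on disk so that training only reads the cached features. The sketch below is a minimal illustration using only the Python standard library. It assumes 16-bit mono PCM input, and it uses per-frame log energy purely as a stand-in for real features (in practice these would more likely be Kaldi-format fbank or MFCC features, as mentioned earlier in the thread). The function names `read_wav_mono16`, `log_energy_frames`, `build_cache`, and `load_cache` are hypothetical and are not part of pytorch-kaldi.

```python
import array
import math
import pickle
import sys
import wave
from pathlib import Path


def read_wav_mono16(path):
    """Read a 16-bit mono PCM wav file; return (sample_rate, samples scaled to [-1, 1))."""
    with wave.open(str(path), "rb") as wf:
        if wf.getsampwidth() != 2 or wf.getnchannels() != 1:
            raise ValueError(f"{path}: expected 16-bit mono PCM")
        rate = wf.getframerate()
        pcm = array.array("h")
        pcm.frombytes(wf.readframes(wf.getnframes()))
    if sys.byteorder == "big":
        pcm.byteswap()  # wav data is little-endian
    return rate, [s / 32768.0 for s in pcm]


def log_energy_frames(samples, rate, win_ms=25, hop_ms=10):
    """Placeholder feature (assumption): log of the mean energy per frame."""
    win = rate * win_ms // 1000
    hop = rate * hop_ms // 1000
    feats = []
    for start in range(0, len(samples) - win + 1, hop):
        frame = samples[start:start + win]
        energy = sum(x * x for x in frame) / win
        feats.append(math.log(energy + 1e-10))  # small floor avoids log(0)
    return feats


def build_cache(wav_paths, cache_path):
    """One-off preprocessing: decode every wav once and store its features."""
    cache = {}
    for p in wav_paths:
        rate, samples = read_wav_mono16(p)
        cache[Path(p).stem] = log_energy_frames(samples, rate)
    with open(cache_path, "wb") as f:
        pickle.dump(cache, f)
    return cache


def load_cache(cache_path):
    """At training time: read the precomputed features instead of the wav files."""
    with open(cache_path, "rb") as f:
        return pickle.load(f)
```

With this layout, `build_cache` would be run once before training, and each epoch would call `load_cache` (or index into the loaded dictionary by utterance ID) rather than decoding audio again. The cost of reading and decoding the waveforms is then paid once instead of on every pass over the data.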