jzlianglu / pykaldi2

Yet another speech toolkit based on Kaldi and PyTorch
MIT License

Convolve vs. BlockConvolve for RIR Augmentation #9

Open PCerles opened 4 years ago

PCerles commented 4 years ago

Hi, I have been using your Simulator functionality and found it quite useful. However, the augmented data I'm getting from it has a lot more reverb than I expect. I'm still diagnosing the problem, but is there any reason why this repo uses the equivalent of

FFTbasedConvolveSignals https://github.com/kaldi-asr/kaldi/blob/master/src/feat/signal.cc#L50

as opposed to

FFTbasedBlockConvolveSignals https://github.com/kaldi-asr/kaldi/blob/master/src/feat/signal.cc#L77

Kaldi does reverb using the second one: https://github.com/kaldi-asr/kaldi/blob/master/src/featbin/wav-reverberate.cc#L96. Thanks!

singaxiong commented 4 years ago

Hi, thanks for your interest in the tool! I think the excessive reverb is probably because your RIR file has a long RT60. The convolution itself is very straightforward, whether it is utterance-based or block-based.
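One way to check whether an RIR has a long RT60 is to estimate it from the impulse response itself. The following is a minimal sketch, not code from pykaldi2 or Kaldi: it uses Schroeder backward integration and a line fit between -5 dB and -25 dB of the energy decay curve, extrapolated to 60 dB. The function name `estimate_rt60`, its parameters, and the synthetic RIR are all hypothetical and only for illustration.

```python
import numpy as np


def estimate_rt60(rir, sample_rate, decay_start_db=-5.0, decay_end_db=-25.0):
    """Estimate RT60 in seconds from an impulse response (hypothetical helper)."""
    energy = np.asarray(rir, dtype=np.float64) ** 2
    # Schroeder backward integration: energy remaining from each sample onward.
    edc = np.cumsum(energy[::-1])[::-1]
    edc_db = 10.0 * np.log10(edc / edc[0] + 1e-30)
    # Fit a straight line to the decay between the two chosen levels.
    mask = (edc_db <= decay_start_db) & (edc_db >= decay_end_db)
    times = np.arange(len(energy)) / sample_rate
    slope_db_per_s, _ = np.polyfit(times[mask], edc_db[mask], 1)
    # Extrapolate the fitted slope to a 60 dB drop.
    return -60.0 / slope_db_per_s


# Synthetic RIR (an assumption for the demo): noise whose amplitude
# falls by 60 dB after true_rt60 seconds.
rng = np.random.default_rng(0)
sample_rate = 16000
true_rt60 = 0.5
t = np.arange(sample_rate) / sample_rate
synthetic_rir = rng.standard_normal(sample_rate) * 10.0 ** (-3.0 * t / true_rt60)

estimated_rt60 = estimate_rt60(synthetic_rir, sample_rate)
print(f"estimated RT60: {estimated_rt60:.2f} s")
```

Running this on the RIR files used for augmentation should show whether their decay times are longer than intended.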

By the way, do you know why Kaldi uses blockwise convolution? Any computational advantage?
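To illustrate the point that the two approaches should produce the same output, here is a minimal NumPy sketch comparing a single whole-utterance FFT convolution with a block-wise overlap-add convolution. It is not the Kaldi or pykaldi2 code; it assumes the block variant is an overlap-add scheme, and the function names and `block_size` parameter are hypothetical.

```python
import numpy as np


def fft_convolve(signal, rir):
    """Linear convolution using one FFT over the whole utterance."""
    out_len = len(signal) + len(rir) - 1
    fft_size = 1 << (out_len - 1).bit_length()  # next power of two >= out_len
    spectrum = np.fft.rfft(signal, fft_size) * np.fft.rfft(rir, fft_size)
    return np.fft.irfft(spectrum, fft_size)[:out_len]


def block_fft_convolve(signal, rir, block_size=4096):
    """Same linear convolution, computed block by block with overlap-add."""
    out_len = len(signal) + len(rir) - 1
    seg_len = block_size + len(rir) - 1
    fft_size = 1 << (seg_len - 1).bit_length()
    rir_spectrum = np.fft.rfft(rir, fft_size)  # computed once, reused per block
    out = np.zeros(out_len)
    for start in range(0, len(signal), block_size):
        block = signal[start:start + block_size]
        seg = np.fft.irfft(np.fft.rfft(block, fft_size) * rir_spectrum, fft_size)
        n = len(block) + len(rir) - 1
        # Each block's tail overlaps the next block's head; add them up.
        out[start:start + n] += seg[:n]
    return out


rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
rir = rng.standard_normal(4000) * np.exp(-np.arange(4000) / 800.0)

full = fft_convolve(signal, rir)
blocked = block_fft_convolve(signal, rir, block_size=1024)
max_diff = np.max(np.abs(full - blocked))
print("outputs match:", np.allclose(full, blocked))
```

In a scheme like this, the FFT size depends on the block and filter lengths rather than on the utterance length, so memory stays bounded for long recordings. Any difference in the amount of reverb would therefore have to come from the RIR or from scaling, not from the choice between the two.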