iver56 opened 4 years ago
There's also torchaudio's resample; should we choose between the two?
I'm not so fond of torchaudio's resample function, because it seems to be much slower than julius. Here's the result of a crude benchmark that resamples some audio from 44100 Hz to 48000 Hz on CPU:
librosa/resampy kaiser_fast: 4.23 s
librosa/resampy kaiser_best: 15.12 s
torchaudio kaldi-compliant LPF width=2: 22.97 s
torchaudio kaldi-compliant LPF width=6: 23.56 s
torchaudio kaldi-compliant LPF width=10: 23.99 s
julius cpu 64 zeros: 0.195 s
julius cpu 16 zeros: 0.176 s
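For context, here is a minimal sketch of how the julius numbers above could be measured; the audio buffer and its duration are made up for illustration:

import time
import numpy as np
import torch
import julius

# One minute of synthetic audio, standing in for the real test signal.
samples = np.random.randn(44100 * 60).astype(np.float32)
tensor = torch.from_numpy(samples)

for zeros in (64, 16):
    start = time.perf_counter()
    resampled = julius.resample_frac(tensor, 44100, 48000, zeros=zeros)
    print("julius cpu {} zeros: {:.3f} s".format(zeros, time.perf_counter() - start))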
Ok, it's pretty clear that Julius is better, let's stick with it!
> benchmark that resamples some audio from 44100 Hz to 48000 Hz on CPU
What other sample rate conversions did you try? Did you compile the Resample transform with torch.jit.script?
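For reference, a sketch of what scripting torchaudio's Resample transform could look like, assuming a torchaudio version whose transforms are TorchScript-compatible:

import torch
import torchaudio

# Build the transform once, then compile it with TorchScript.
resampler = torchaudio.transforms.Resample(orig_freq=44100, new_freq=48000)
scripted = torch.jit.script(resampler)

waveform = torch.randn(1, 44100)  # dummy mono audio, one second at 44100 Hz
out = scripted(waveform)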
In my crude benchmark, I ran it simply like this:
import torch
from torchaudio.compliance.kaldi import resample_waveform

# `samples` is a float numpy array at `sample_rate`; HIGH_SAMPLE_RATE is 48000.
# `timer` is a small timing context manager (sketched below).
for lowpass_filter_width in (2, 6, 10):
    with timer("pytorch-audio kaldi-compliant LPF width={}".format(lowpass_filter_width)):
        pytorch_kaldi_compliant = (
            resample_waveform(
                torch.from_numpy(samples).unsqueeze(0),
                orig_freq=sample_rate,
                new_freq=HIGH_SAMPLE_RATE,
                lowpass_filter_width=lowpass_filter_width,
            )
            .squeeze()
            .numpy()
        )
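The timer context manager isn't defined in the snippet; a minimal version, assuming it only needs to print elapsed wall-clock time, could be:

import time
from contextlib import contextmanager

@contextmanager
def timer(description):
    # Print elapsed wall-clock time for the enclosed block.
    start = time.perf_counter()
    yield
    print("{}: {:.3f} s".format(description, time.perf_counter() - start))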
I didn't try other sample rate conversions.
I've got a notebook to benchmark different methods of resampling. Some conversions take longer, I think because the greatest common divisor of the input and output sample rates is small, which leaves a larger fractional resampling ratio and more filter work (see the sketch after the link below). It would be good to add julius to that list and compare results when resampling is done in batches.
https://gist.github.com/mogwai/a5df03e89ab33bc0a5648965280d5445
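To illustrate the gcd point: the resampling ratio reduces by the gcd of the two rates, so a small gcd means a larger reduced fraction and more computation. A toy comparison:

import math

print(math.gcd(44100, 48000))  # 300 -> reduced ratio 147/160
print(math.gcd(44100, 44101))  # 1   -> ratio stays 44100/44101, far more work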
In your benchmark, for example, you convert in and out of numpy, which can take time.
Yes, that would be interesting.
Re numpy: yes, but I did the numpy conversion in the julius benchmark as well. PyTorch tensors share memory with numpy arrays when running on CPU, so the "conversion" should be quite fast.
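A quick demonstration of the shared-memory claim, using a made-up toy array:

import numpy as np
import torch

a = np.zeros(4, dtype=np.float32)
t = torch.from_numpy(a)  # shares memory with `a`; no data is copied
t[0] = 1.0
print(a[0])  # 1.0, since both views see the write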
I've added julius to the benchmark notebook. It seems to produce higher quality and does so faster most of the time. I did notice that it didn't output the expected number of samples, so I had to add a minor hack to solve that.
https://gist.github.com/mogwai/a5df03e89ab33bc0a5648965280d5445
Yes, I've been using fix_length from librosa to solve the length issue (from librosa.util import fix_length).
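A sketch of that fix, assuming the resampling is done with julius and the output is padded or trimmed to the length expected at the target rate:

import numpy as np
import torch
import julius
from librosa.util import fix_length

sample_rate, target_rate = 44100, 48000
samples = np.random.randn(sample_rate).astype(np.float32)  # 1 s of noise

resampled = julius.resample_frac(
    torch.from_numpy(samples), sample_rate, target_rate
).numpy()

# Pad or trim to the exact number of samples expected at the target rate.
expected_length = int(round(len(samples) * target_rate / sample_rate))
resampled = fix_length(resampled, size=expected_length)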
https://github.com/adefossez/julius