Closed by overhacked 4 years ago
I discovered that I was abusing `Converter` by trying to resample each chunk streamed from `cpal` separately, when I needed to resample the entire recording (or use a streaming resampler that keeps state).
State of what?
The state of the interpolation. The output of the algorithm depends on the previous samples, so if I feed it just a short array of samples on each invocation, I'm restarting the resampling algorithm every time. Streaming resampling algorithms are basically a reduce function: they carry state from chunk to chunk (not literally a moving average, but the internal state of the chosen interpolation, e.g. linear or cubic) and use it to calculate the resampled output for the next set of samples.
So basically, they work like a sliding window?
A streaming resampler does, I think, but I’m no authority.
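Roughly, yes. To illustrate what "keeping state" means, here is a minimal sketch of a streaming linear resampler that carries the last input sample and the fractional read position across chunks. This is not `Converter`'s actual API or implementation; the `StreamingResampler` type and its `process` method are made up for illustration.

```rust
/// Illustrative streaming linear resampler (not the Converter API).
/// The point is that `pos` and `last` survive between calls, so each
/// chunk continues where the previous one left off instead of
/// restarting the interpolation.
struct StreamingResampler {
    ratio: f64, // input rate / output rate, e.g. 48000.0 / 16000.0
    pos: f64,   // fractional read position into the input stream
    last: f32,  // final sample of the previous chunk (the carried state)
}

impl StreamingResampler {
    fn new(from_rate: u32, to_rate: u32) -> Self {
        Self {
            ratio: from_rate as f64 / to_rate as f64,
            pos: 0.0,
            last: 0.0,
        }
    }

    /// Resample one chunk, carrying interpolation state to the next call.
    fn process(&mut self, input: &[f32]) -> Vec<f32> {
        let mut output = Vec::new();
        while self.pos < input.len() as f64 {
            let idx = self.pos.floor() as usize;
            let frac = (self.pos - idx as f64) as f32;
            // Virtual index 0 is the carried-over last sample of the
            // previous chunk; virtual index k >= 1 is input[k - 1].
            let s0 = if idx == 0 { self.last } else { input[idx - 1] };
            let s1 = input[idx];
            output.push(s0 + (s1 - s0) * frac);
            self.pos += self.ratio;
        }
        // Rebase the read position onto the next chunk and remember this
        // chunk's final sample so interpolation continues across the seam.
        self.pos -= input.len() as f64;
        if let Some(&tail) = input.last() {
            self.last = tail;
        }
        output
    }
}

fn main() {
    let mut rs = StreamingResampler::new(48_000, 16_000);
    // Two consecutive "callback" chunks of a dummy signal: the same
    // resampler instance is reused, so state carries across the boundary.
    let chunk_a: Vec<f32> = (0..480).map(|i| (i as f32 * 0.01).sin()).collect();
    let chunk_b: Vec<f32> = (480..960).map(|i| (i as f32 * 0.01).sin()).collect();
    let out_a = rs.process(&chunk_a);
    let out_b = rs.process(&chunk_b);
    println!(
        "{} + {} input samples -> {} + {} output samples",
        chunk_a.len(), chunk_b.len(), out_a.len(), out_b.len()
    );
}
```

Constructing a fresh converter per callback throws that state away at every chunk boundary, which is exactly where the audible artifacts come from.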
I'm downsampling a 48 kHz sample buffer, captured from a hardware input using the `cpal` crate, to 16 kHz for speech-to-text via `deepspeech-rs`, and I'm getting a lot of noise in the output audio (written out to .WAV via `hound`). Noisy recording attached: recorded.wav.gz.

I've tested to make sure that the noise is introduced by `Converter`. If I save the source audio at 48 kHz/f32, or non-resampled at 48 kHz/i16, there are no noise artifacts, but 16 kHz/f32 or 16 kHz/i16 sounds like there are a lot of rounding errors (?). I don't have much experience with DSP, so please be patient with my imprecise explanations.

I have resampled the source audio from 48 kHz to 16 kHz using Audacity and sox, and neither introduces noise, just the expected decrease in fidelity from the reduced sample rate.
Here is a short section of the overall code. It's the callback function given to `cpal::device.build_input_stream()` that does all the work: