librosa / librosa

Python library for audio and music analysis
https://librosa.org/
ISC License

Stream resampling #1518

Open bmcfee opened 2 years ago

bmcfee commented 2 years ago

Is your feature request related to a problem? Please describe.

We currently have two interfaces for ingesting audio: load and stream. Aside from the obvious functional differences between the two, a key difference is that the stream interface does not support resample-on-load, while load has it enabled by default. This is primarily because the resampling libraries we support (scipy, resampy, libsamplerate, soxr) do not all support stream-based processing, though the latter two do.

It would be great if we could smooth this gap over, though it does introduce some technical hurdles.

Describe the solution you'd like

For a restricted class of res_type values (namely, samplerate and soxr resamplers), we could in principle support resample-on-load within stream. We could think of this as having a second layer of processing in between the soundfile chunk generator and our block generator which handles the conversion.
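To make the "second layer" idea concrete, here is a minimal sketch of a re-blocking generator that sits between a native-rate chunk source and the block consumer. Everything in it is hypothetical: `resampled_blocks` and `decimate` are illustrative names, `resample_chunk` stands for any stateful streaming resampler with a `(chunk, last)` call shape (soxr's `ResampleStream.resample_chunk` looks like a plausible fit, but that pairing is an assumption), and plain lists are used only to keep the sketch dependency-free.

```python
def resampled_blocks(chunks, resample_chunk, block_samples, overlap=0):
    """Yield fixed-size, overlapping blocks of resampled audio.

    chunks: iterable of native-rate sample sequences (e.g. a soundfile block generator)
    resample_chunk: hypothetical callable (chunk, last=...) -> resampled samples
    block_samples, overlap: block geometry, expressed at the TARGET rate
    """
    if not 0 <= overlap < block_samples:
        raise ValueError("overlap must be in [0, block_samples)")
    buf = []          # resampled samples waiting to be emitted
    emitted = False   # whether at least one full block has been yielded
    it = iter(chunks)
    current = next(it, None)
    while current is not None:
        following = next(it, None)
        # Tell the resampler when it is seeing the final chunk so it can flush.
        buf.extend(resample_chunk(current, last=following is None))
        while len(buf) >= block_samples:
            yield buf[:block_samples]
            del buf[:block_samples - overlap]  # keep the overlap for the next block
            emitted = True
        current = following
    # Emit a final short block only if it holds samples not already yielded.
    fresh = len(buf) - (overlap if emitted else 0)
    if fresh > 0:
        yield list(buf)


def decimate(chunk, last=False):
    """Stand-in 'resampler' for the demo: keeps every other sample (not a real filter)."""
    return chunk[::2]


native_chunks = [list(range(0, 8)), list(range(8, 16)), list(range(16, 20))]
blocks = list(resampled_blocks(native_chunks, decimate, block_samples=4, overlap=1))
```

The point of the design is that the consumer never sees native-rate chunk boundaries: the resampler may return a variable number of samples per call, and the buffer absorbs that.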

From an interface perspective, this ought to be relatively straightforward, with the caveat that block and frame parameters should be understood to operate at the target sample rate, not the native rate.
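As a rough illustration of what "operate at the target sample rate" implies, the sketch below converts target-rate block parameters into the number of native-rate samples that must be read per block. The helper name `stream_block_sizes` is hypothetical, the block-size formula is my understanding of how stream sizes its blocks today, and the result deliberately ignores any extra context the resampler's filter may need.

```python
import math
from fractions import Fraction


def stream_block_sizes(sr_native, sr_target, block_length, frame_length, hop_length):
    """Return (block_samples, overlap, native_advance).

    block_samples and overlap are counted at the target rate;
    native_advance is counted at the native rate.
    """
    # Samples spanned by block_length frames at the target rate (assumed formula).
    block_samples = frame_length + (block_length - 1) * hop_length
    # Samples shared between consecutive blocks.
    overlap = frame_length - hop_length
    # New target-rate samples needed per block after the first.
    advance = block_samples - overlap
    # Exact rational rate ratio avoids floating-point rounding surprises.
    ratio = Fraction(sr_native, sr_target)
    # Round up so the resampler is never starved of input.
    native_advance = math.ceil(advance * ratio)
    return block_samples, overlap, native_advance


sizes = stream_block_sizes(44100, 22050, block_length=256,
                           frame_length=2048, hop_length=512)
```

For non-integer ratios the rounding means the native read size will not map exactly onto the target block size, so some buffering between the two layers seems unavoidable.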

This does raise a question of whether we could guarantee numerical equivalence (assuming the same resamplers are used) between load and stream. Getting this right might require some modification of the chunking parameters in soundfile to ensure that the resampler has enough future context to perform the interpolation. That in turn will depend on the specifics of each resampler.
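A toy example of why future context matters: the sketch below uses linear interpolation (not any of the resamplers mentioned above) as a stand-in, since it needs exactly one sample of lookahead. `LerpStream` is a hypothetical class written only for this illustration. Resampling each chunk independently loses the interpolated sample at the chunk boundary, while carrying state across chunks reproduces the one-shot result exactly.

```python
from fractions import Fraction


class LerpStream:
    """Toy streaming resampler (linear interpolation), for illustration only."""

    def __init__(self, step):
        self.step = Fraction(step)  # input samples advanced per output sample
        self.t = Fraction(0)        # absolute input position of the next output
        self.buf = []               # input samples not yet discarded
        self.offset = 0             # absolute index of buf[0]

    def process(self, chunk, last=False):
        self.buf.extend(chunk)
        end = self.offset + len(self.buf)  # total input samples seen so far
        out = []
        while True:
            i = int(self.t)                # floor, since t >= 0
            frac = self.t - i
            if last:
                if self.t > end - 1:       # past the final input sample
                    break
            elif i + 1 >= end:             # lookahead sample not available yet
                break
            a = self.buf[i - self.offset]
            b = self.buf[i + 1 - self.offset] if i + 1 < end else a
            out.append(a + float(frac) * (b - a))
            self.t += self.step
        # Discard input that no future output can depend on.
        drop = min(int(self.t) - self.offset, len(self.buf))
        del self.buf[:drop]
        self.offset += drop
        return out


signal = [float(n * n) for n in range(10)]
chunks = [signal[:5], signal[5:]]
step = Fraction(1, 2)  # upsample by a factor of 2

# Reference: resample the whole signal in one call (analogous to load).
one_shot = LerpStream(step).process(signal, last=True)

# Stateful streaming: one resampler instance carried across chunks.
stream = LerpStream(step)
stateful = stream.process(chunks[0]) + stream.process(chunks[1], last=True)

# Naive streaming: a fresh resampler per chunk, so no context crosses the boundary.
naive = []
for c in chunks:
    naive += LerpStream(step).process(c, last=True)

print(stateful == one_shot, naive == one_shot)
```

Real resamplers use much longer filters, so the amount of context to carry (and any latency compensation) would differ per backend; this only shows the shape of the problem.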

Describe alternatives you've considered

:shrug:

Additional context

If at all possible, I'd like to strive for API consistency here. This will require some hard decisions:

bmcfee commented 1 year ago

Blocked by #1556