beetbox / audioread

cross-library (GStreamer + Core Audio + MAD + FFmpeg) audio decoding for Python
MIT License
481 stars 108 forks source link

Caching available backends list? #125

Closed bmcfee closed 1 year ago

bmcfee commented 1 year ago

Each time available_backends is called, it does a full sweep of all the import attempts and builds up a fresh list of backends. If all the backends are present, this should go reasonably quickly after the first call. However, if backends are missing, then each import attempt will continue to take time searching paths.

This ended up causing a performance regression over in librosa, where our most recent release supports working with a pre-constructed audioread object instead of a file path. Because the audioread backends do not inherit from a shared base class, there's no easy way to detect this case short of checking the provided object against available_backends(). This means that each call to our loader includes a call to available_backends, which is adding up.

I see two potential solutions here:

  1. Cache the backend list on the caller side after the first call.
  2. Cache the backend list on the audioread side.

Option (1) obviously would work, but I thought I'd float the idea of implementing the cache over here instead. I sketched it out on a local copy and it brings calls down from about 50ms to 3µs on my dev machine.

If you're open to the idea, I'll put up a PR.

bmcfee commented 1 year ago

Quick followup with benchmarks on my dev branch with the cache implemented:

In [10]: %timeit audioread.available_backends(flush_cache=False)
117 ns ± 2.09 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [11]: %timeit audioread.available_backends(flush_cache=True)
93.7 ms ± 259 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
sampsyo commented 1 year ago

Awesome; thanks for digging in and finding a good solution. FWIW, I'd also be happy to make all the audio-file classes inherit from a common base class, if that would simplify this check on librosa's end.

bmcfee commented 1 year ago

I'd also be happy to make all the audio-file classes inherit from a common base class, if that would simplify this check on librosa's end.

It would, and it's good style too. :grin: But I think the caching issue is still independently useful.

sampsyo commented 1 year ago

Just released version 3.0.0 with these changes.