Open mttbernardini opened 3 weeks ago
Adding to this:
import numpy as np
from pedalboard import AudioFile
with AudioFile("test.wav", "w", 48000, 1) as f:
f.write(np.random.rand(1, 1))
Also fails for the same ChannelLayout detection. Of course this is an unrealistic minimal example, one has to consider instead a process loop where a chunk gets written to file: in that scenario, this bug can happen if the processing/stream size combination causes an iteration with a chunk of only one sample.
While in the 2x2 scenario the workaround would be to write one sample at a time (i.e. two buffers of shape (2, 1)), in the 1x1 no workaround is possible (writing an additional sample may not be acceptable if client code requires invariance of number of samples between pre-process and post-process).
Thanks for the super-detailed and well written bug report @mttbernardini!
You're totally right: the auto-detection of channel layout is a bit of a pain. I see a couple fixes that could be made, like you suggest:
(1, n)
or (n, 1)
cases, auto-detection of layout could be more informed by the channel count of the AudioFile
object being used. This is a bug.For the square input case (i.e.: (x, x)
) things are a little more tricky. While we could add an explicit ChannelLayout
parameter as you suggest, I pretty strongly want to avoid adding additional parameters if possible.
I totally forgot that I'd done this, but there's an example of how we can handle this in the Pedalboard codebase already: StreamResampler
caches the last-detected channel layout and uses the last-provided layout if layout detection fails for a provided input. I think this approach would handle most real-world use cases, although it would still not solve the case in which the first buffer is square in shape.
Thank you for your reply @psobot.
I like the solution you proposed on point (2) and I believe having a scenario in which a Plugin
or WritableAudioFile
ever processes/writes just 1 or 2 samples is quite unlikely (it's more likely to process/writes 1 or 2 samples as a remainder after several big chunks). To be sure, we may want to specify this behaviour in the documentation and possibly make the raised RuntimeError
more detailed (or with a link to the docs if the description is too long).
I'm not sure I follow you for point (1) though. The behaviour of WritableAudioFile.write()
seems to match the behaviour of Plugin.process()
, i.e. raises a RuntimeError
when the input buffer is shaped (n, n)
. How would the channel count of the WritableAudioFile
help in detecting the channel layout? A stereo WritableAudioFile
that receives a (2, 2)
buffer would still fail. Rather, I believe your solution in point (2) can be adopted for WritableAudioFile
as well.
Let me know your thoughts and if you'd like me to implement this.
Problem
I'm developing a DSP graph using
pedalboard
. It basically consist of reading arbitrary audio files at configurable chunk size, processing them with aChain
and then writing back the output to file system. While writing unit tests I hit this puzzling behaviour that prevents me from moving forward.If any
Plugin.process(...)
implementation receives a buffer that has shape(2, 2)
, it will fail withRuntimeError: Unable to determine channel layout from shape!
This is not really an uncommon scenario, and can even be replicated with the first example provided in the Pedalboard website, by having a stereo
some-file.wav
at sample ratex
that is exactlyx+2
samples long. In general, if any chosen buffer reading size happens to cause a reminder of 2 samples over the audio file being read, the square matrix condition is met and no processing can be done in pedalboard°.Stack trace
Inspecting the
pedalboard
source code, I recreated the C++ call stack, which is not shown in Python (most recent call last):py::array_t<float> process(...)
defined inprocess.h
https://github.com/spotify/pedalboard/blob/273eb538193b6e3557c92745a6dbbf6f1d627481/pedalboard/process.h#L260-L261
py::array_t<float> processFloat32(...)
defined inprocess.h
https://github.com/spotify/pedalboard/blob/273eb538193b6e3557c92745a6dbbf6f1d627481/pedalboard/process.h#L166
template <typename T> ChannelLayout detectChannelLayout(...)
defined inBufferUtils.h
https://github.com/spotify/pedalboard/blob/273eb538193b6e3557c92745a6dbbf6f1d627481/pedalboard/BufferUtils.h#L44-L45
Specifically, the last line is hit whenever
Plugin.process(...)
is called with a buffer that has shape(n, n)
(i.e. a square matrix).Proposed solution
Is it possible to change the function signature of
Plugin.process(...)
so that it can accept an explicitChannelLayout
? The semantics would be:None
(default when not provided), then autodetect (existing behaviour, backwards compatible)Since
ReadableAudioFile.read()
reliably returns non-interleaved buffers, then to me it makes sense to enforce the same layout when processing, leaving auto-detection for other scenarios / backwards compatibility?I can help implementing this if needed 👍
° the workaround would be to check the buffer and if squared then append an extra padding sample to make it rectangular, but this seems less than ideal and unnecessarily complicates client code.