bastibe / python-soundfile

SoundFile is an audio library based on libsndfile, CFFI, and NumPy
BSD 3-Clause "New" or "Revised" License

Feature request: Support interleaved stereo data #407

Open mcclure opened 1 year ago

mcclure commented 1 year ago

Although python-soundfile is usually used with NumPy, it does support non-NumPy use. A great feature for non-NumPy use would be an option that lets functions such as SoundFile.write() accept stereo data in interleaved format instead of as a multidimensional array. Data is often already interleaved for various reasons, and (I could be wrong about this) I think that building a Python array by appending many values to a single list generates less garbage than producing a 2xMany list of lists.
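To make the two layouts concrete, here is a minimal standard-library sketch. The helper names interleave and deinterleave are hypothetical and not part of python-soundfile; the sample values are made up for illustration.

```python
from array import array


def interleave(left, right):
    """Merge two equally long channels into L0, R0, L1, R1, ..."""
    if len(left) != len(right):
        raise ValueError("channels must have the same number of frames")
    out = array('h')  # flat array of signed 16-bit samples
    for l, r in zip(left, right):
        out.append(l)
        out.append(r)
    return out


def deinterleave(samples, channels=2):
    """Split a flat interleaved sequence back into one sequence per channel."""
    return [samples[c::channels] for c in range(channels)]


stereo = interleave([1, 2, 3], [-1, -2, -3])
print(list(stereo))                             # [1, -1, 2, -2, 3, -3]
print([list(c) for c in deinterleave(stereo)])  # [[1, 2, 3], [-1, -2, -3]]
```

The interleaved form is a single flat object, whereas a list of lists allocates one extra list per channel (or per frame, depending on the orientation).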

bastibe commented 1 year ago

There are the buffer_read, buffer_read_into, and buffer_write functions, which do essentially that. And anyway, a C-order NumPy array is internally in interleaved order as well.
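The point about C-order arrays can be checked with NumPy alone. This is a small sketch with made-up sample values, assuming the usual (frames, channels) array shape:

```python
import numpy as np

# Three stereo frames, shape (frames, channels), stored in C (row-major) order.
frames = np.array([[1, -1],
                   [2, -2],
                   [3, -3]], dtype=np.int16)

# Flattening a C-contiguous array walks memory row by row,
# which is exactly the interleaved order L0, R0, L1, R1, ...
interleaved = frames.ravel()

print(frames.flags['C_CONTIGUOUS'])          # True
print(interleaved.tolist())                  # [1, -1, 2, -2, 3, -3]
print(np.shares_memory(frames, interleaved)) # True: a view, not a copy
```

Because the flattened view shares memory with the original array, no data is rearranged to obtain the interleaved layout.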

mcclure commented 1 year ago

That's very useful, thanks. Can you clarify something about https://python-soundfile.readthedocs.io/en/0.11.0/#soundfile.SoundFile.buffer_write: what endianness properties should I expect buffer_read/buffer_write to have? Will it depend on format?

bastibe commented 1 year ago

Yes, it entirely depends on the format. Have a look at https://libsndfile.github.io/libsndfile/api.html#raw for more information. I'd expect this to only work reasonably for uncompressed PCM-like formats.

mgeier commented 1 year ago

The situation is admittedly a bit complicated, so it is easy to be confused.

The parameters format and subtype (and endian as well) only matter for the storage of the audio data in the file.

The data that you are handling in your Python code is entirely independent of that. In the buffer_*() methods, the dtype argument specifies the data type that you are handling.

Here's a hopefully illustrative example:

>>> import soundfile as sf
>>> sf.write('myfile.aiff', [1.0], subtype='PCM_16', samplerate=48000)

This creates a 16-bit file containing the largest possible signal value. The given float value is automatically converted to a 16-bit integer.

>>> f = sf.SoundFile('myfile.aiff')
>>> bytes(f.buffer_read(dtype='int16'))
b'\xff\x7f'

As you can see, you still have to specify the dtype when reading the data. In this case, we are reading the same data type that's stored in the file, but that is not required.

And to come back to the question about endianness: if you look at the file contents (e.g. with xxd myfile.aiff), you see the sample value in the very last two bytes: 7fff.

AIFF files are stored in big-endian format, but as you can see above, we got the bytes in little-endian format. We are getting native endianness. If you run this on a big-endian system, you should get b'\x7f\xff' (but I didn't try this because I don't have a big-endian system).
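The byte orders involved can be reproduced with the standard library alone. This sketch only reinterprets the two bytes shown above; it does not call soundfile.

```python
import struct
import sys

raw = b'\xff\x7f'  # bytes returned by buffer_read() in the session above

# The same two bytes mean different numbers depending on the byte order.
as_little = int.from_bytes(raw, 'little', signed=True)
as_big = int.from_bytes(raw, 'big', signed=True)
print(as_little, as_big)  # 32767 -129

# Big-endian packing, as stored at the end of the AIFF file.
file_order = struct.pack('>h', 32767)
print(file_order)  # b'\x7f\xff'

# Native packing, which is what the buffer_*() methods hand to Python.
native_order = struct.pack('=h', 32767)
print(sys.byteorder, native_order)
```

On a little-endian machine, native_order equals raw; on a big-endian machine it would equal file_order instead.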

We can continue exploring:

>>> f.seek(0)
0
>>> bytes(f.buffer_read(dtype='int32'))
b'\x00\x00\xff\x7f'

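We can read the value as a 32-bit integer, even though it is stored as a 16-bit integer in the file. And the important thing is that libsndfile scales the value to the 32-bit range: the bytes above are 0x7fff0000, i.e. the 16-bit value shifted left by 16 bits, which is close to (but not exactly) the largest 32-bit integer.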
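The arithmetic behind those four bytes can be checked without soundfile. This sketch assumes the bytes come from a little-endian machine, as in the session above, and that the conversion amounts to a 16-bit left shift, which is consistent with the output shown.

```python
INT16_MAX = 0x7FFF
INT32_MAX = 0x7FFFFFFF

raw32 = b'\x00\x00\xff\x7f'  # result of buffer_read(dtype='int32') above
value32 = int.from_bytes(raw32, 'little', signed=True)

print(value32)                     # 2147418112
print(value32 == INT16_MAX << 16)  # True: the 16-bit value moved up 16 bits
print(INT32_MAX - value32)         # 65535: slightly below the int32 maximum
```

So the value is scaled to the full 32-bit range, but the low 16 bits stay zero rather than being filled in.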

In summary:

What endianness properties should I expect buffer_read/buffer_write to have?

Native endianness.

Will it depend on format?

Nope.