spatialaudio / python-sounddevice

Play and Record Sound with Python
https://python-sounddevice.readthedocs.io/
MIT License

Why does the channels argument of playrec() determine the number of INPUT channels? #388

Open bobtak opened 2 years ago

bobtak commented 2 years ago

It should determine the number of OUTPUT channels.

I believe playrec() should get the number of input channels from data.shape, and the number of output channels from channels. Current specification makes it difficult, for example, to play 2ch and record 4ch simultaneously. This is a very common use case in signal processing.

The API documentation of playrec says: channels (int, sometimes optional) – Number of input channels, see rec(). The number of output channels is obtained from data.shape.

But the rec() referred says: channels (int, optional) – Number of channels to record. Not needed if mapping or out is given. The default value can be changed with default.channels.

On the other hand, play() doesn't have channels in its arguments.

These also show that the argument channels of playrec() should determine the number of output channels.

HaHeho commented 2 years ago

I think assuming a 1:1 mapping from given data to output channels is sensible for the simple convenience these functions are supposed to provide.

If the data is mono, then playrec(output_mapping=...) can distribute it to multiple channels, similar to play(mapping=...).

Otherwise, if your playback data is not mono but its channel count still differs from the desired number of output channels, you could arrange the playback file beforehand with the correct channel mapping baked in.
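Both options can be sketched roughly as follows. The helper names bake_channel_map and playrec_mono_on are made up for illustration and are not part of sounddevice; the channel numbers are 1-based, as in the mapping arguments. The second function needs real audio hardware, so it is defined but not called here.

```python
import numpy as np


def bake_channel_map(signal, channel_map, num_outputs):
    """Place each column of `signal` into a chosen column of a wider array.

    `channel_map` holds 1-based output channel numbers, one per column of
    `signal`. Output channels that are not listed stay silent (zeros).
    This is a hypothetical helper, not a sounddevice function.
    """
    signal = np.asarray(signal, dtype='float32')
    if signal.ndim == 1:
        signal = signal.reshape(-1, 1)
    if signal.shape[1] != len(channel_map):
        raise ValueError('need exactly one target channel per signal column')
    if len(set(channel_map)) != len(channel_map):
        raise ValueError('target channels must be unique')
    if min(channel_map) < 1 or max(channel_map) > num_outputs:
        raise ValueError('target channels must be in 1..num_outputs')
    wide = np.zeros((signal.shape[0], num_outputs), dtype=signal.dtype)
    wide[:, [ch - 1 for ch in channel_map]] = signal
    return wide


def playrec_mono_on(mono, output_channels, record_channels):
    """Play mono data on several device outputs while recording.

    Requires audio hardware. sounddevice is imported lazily so that the
    NumPy-only helper above also works where PortAudio is unavailable.
    """
    import sounddevice as sd
    return sd.playrec(mono, channels=record_channels,
                      output_mapping=output_channels, blocking=True)


# Stereo data routed to outputs 1 and 4 of a 4-output device
stereo = np.column_stack([[0, 0, 0, 0, 0.3], [0.4, 0, 0, 0, 0]])
wide = bake_channel_map(stereo, channel_map=[1, 4], num_outputs=4)
print(wide.shape)  # (5, 4)
```

The resulting array can then be passed to playrec() as usual, since its column count already matches the number of device outputs to be used.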

mgeier commented 2 years ago

I think the problem is that the words "input" and "output" are not clearly defined.

I'm using them in the context of the audio hardware (this thing is called sounddevice after all!), where "input" is where a signal comes in from the outside (e.g. via a microphone) and the "output" is where the signal leaves, e.g. towards loudspeakers.

With this meaning, I think the quoted text of the documentation makes sense.

However, if you look at it from the PoV of the sd.playrec() function, it's reversed: you might think of "input" as the array that is supposed to be played back, and of the "output" of the function as the recorded signal (which came in from the microphone or whatever).

With this meaning the original question of @bobtak makes sense.

@bobtak Which meaning of the words are you using?

To add to the confusion, there is the out parameter, which is named like that because it is the "output" of the function, but it will contain data from the "input" of the audio device.
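A small sketch of the two directions, under the hardware-centric meaning of the words: the array passed as data goes to the device outputs, while the array passed as out is filled from the device inputs. The helper names make_recording_buffer and play_and_record_into are hypothetical, and the function that actually calls playrec() needs audio hardware, so it is only defined here.

```python
import numpy as np


def make_recording_buffer(playback, record_channels, dtype='float32'):
    """Allocate the array that will receive the signal from the device inputs.

    It must have as many frames (rows) as the playback data; its number of
    columns is the number of device input channels to record.
    """
    frames = np.asarray(playback).shape[0]
    return np.zeros((frames, record_channels), dtype=dtype)


def play_and_record_into(playback, recording):
    """Play `playback` on the device outputs and record into `recording`.

    Requires audio hardware. sounddevice is imported lazily so the buffer
    helper above can be tried without PortAudio. When out is given, the
    channels argument is not needed.
    """
    import sounddevice as sd
    sd.playrec(playback, out=recording, blocking=True)
    return recording


playback = np.zeros((1000, 2), dtype='float32')  # sent to 2 device outputs
recording = make_recording_buffer(playback, record_channels=4)  # from 4 device inputs
print(playback.shape, recording.shape)  # (1000, 2) (1000, 4)
```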

This should probably be clarified in the documentation, I'd be happy for any suggestions!

> Current specification makes it difficult, for example, to play 2ch and record 4ch simultaneously.

I think this isn't difficult:

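import numpy as np
import sounddevice as sd

# Two 5-sample test signals, one per playback channel
left = 0, 0, 0, 0, 0.3
right = 0.4, 0, 0, 0, 0
my_stereo_signal = np.column_stack([left, right])  # shape (5, 2): 2 device outputs

# channels=4 requests 4 device inputs; the output count comes from the data shape
my_4_channel_recording = sd.playrec(my_stereo_signal, channels=4)
sd.wait()  # block until playback and recording have finished

print('shape of recording:', my_4_channel_recording.shape)
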
bobtak commented 2 years ago

Hi HaHeho and mgeier,

Thank you for the comments! Now I see my misunderstanding clearly.

The words "input" and "output", like "upload" and "download", are often confused. Replacements such as these may make the documentation clearer:

e.g.
"number of input channels" -> "number of channels to record"
"input data type" -> "data type for recording"

mgeier commented 2 years ago

Yes, that's a good point!

@bobtak Do you want to create a pull request making the suggested changes in the documentation?

I wouldn't want to change the function names and parameter names (because this would be a breaking change), but I think it would be great if we could improve the documentation.