mozilla / cubeb

Cross platform audio library
ISC License
Multi-channel support #178

Open kinetiknz opened 7 years ago

kinetiknz commented 7 years ago

This issue tracks the big picture of multi-channel support. I'll update this later with a better summary.

The main items are:

Right now, there's some discussion in pull #171 covering this that needs to be moved into this issue.

ChunMinChang commented 7 years ago

As an independent cross-platform audio library, cubeb should handle downmixing and upmixing itself, rather than relying on Gecko's AudioConverter, to resolve mismatches in channel count. However, we first have to figure out what downmix/upmix capability cubeb needs. Does it need general-purpose conversion from m channels to n channels, where m ≠ n, or just some specific conversions, such as k channels to stereo, where 2 < k < 9, and stereo to 5.1?

A combination of specific conversions may be more feasible. It seems that the current downmix code in AudioConverter can already convert from k channels to stereo. For 5.1 audio, Table 2 in ITU-R BS.775-3[0] provides a matrix of downmix coefficients to convert from 3/2 to 1/0, 2/0, 3/0, 2/1, 3/1, and 2/2, where x/y stands for x front channels with y rear or surround channels[1]. However, converting stereo to 5.1 audio might need more work[2].
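To make the table concrete, the 3/2 to 2/0 case can be written as a matrix product. This is a sketch of how I read Table 2, so the coefficients are worth checking against the recommendation itself; the symbols L_S and R_S for the surround channels are my notation, and the LFE channel is not covered.

```latex
\begin{bmatrix} L' \\ R' \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0.7071 & 0.7071 & 0 \\
0 & 1 & 0.7071 & 0 & 0.7071
\end{bmatrix}
\begin{bmatrix} L \\ R \\ C \\ L_S \\ R_S \end{bmatrix}
```

The other target layouts would differ only in the number of rows and the coefficients in them.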

References:
[0] ITU-R BS.775-3
[1] Dolby_Digital
[2] Real-Time Conversion of Stereo Audio to 5.1 Channel Audio for Providing Realistic Sounds

ChunMinChang commented 7 years ago

Beyond the implementation for each backend, I think the general design should follow the comments below.

ChunMinChang commented 7 years ago

This feature could be divided into several phases:

ChunMinChang commented 7 years ago

Downmix/Upmix module

The downmix/upmix code should be separated from WASAPI into an independent module for the following reasons:

padenot commented 7 years ago

Agreed, we should separate it.

ChunMinChang commented 7 years ago

I think we could design three mechanisms for mixing:

1. Specific conversion
2. Mixing by mapping channel data
3. Mixing by bypassing channel data

Each time we upmix or downmix, we try the mechanisms in the above order. That is, we try the specific conversion first. If it works, the job is done. Otherwise, we next try mixing by mapping the channels. If that still doesn't work, we try mixing by bypassing the channel data. This final mechanism is our fallback and should always work.
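The ordering could be sketched roughly as below. Everything here is hypothetical and only illustrates the control flow: `mix_request`, its two flags, and the `try_*` stand-ins are made-up names, not cubeb types or functions, and the real mixers would of course process audio rather than just report success.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a conversion request; not a cubeb type. */
typedef struct {
  bool has_defined_conversion; /* e.g. a BS.775-3 table entry exists */
  bool layouts_can_be_mapped;  /* every output channel exists in the input */
} mix_request;

typedef enum { MIX_SPECIFIC = 0, MIX_MAPPING = 1, MIX_BYPASS = 2 } mix_mechanism;

typedef bool (*mix_attempt)(const mix_request *);

/* Stand-ins for the real mixers: each returns true when it handled the request. */
static bool try_specific(const mix_request * r) { return r->has_defined_conversion; }
static bool try_mapping(const mix_request * r) { return r->layouts_can_be_mapped; }
static bool try_bypass(const mix_request * r) { (void)r; return true; }

/* Try the mechanisms in priority order and report which one succeeded. */
mix_mechanism mix_with_fallback(const mix_request * r)
{
  static const mix_attempt attempts[] = { try_specific, try_mapping, try_bypass };
  const size_t count = sizeof(attempts) / sizeof(attempts[0]);
  assert(r != NULL);
  for (size_t i = 0; i + 1 < count; ++i) {
    if (attempts[i](r)) {
      return (mix_mechanism)i;
    }
  }
  /* The last mechanism is the fallback and is expected to always succeed. */
  bool ok = attempts[count - 1](r);
  assert(ok);
  (void)ok;
  return MIX_BYPASS;
}
```

Keeping the attempts in an array means the priority order lives in one place, so adding or reordering a mechanism does not touch the loop.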

Specific conversion

Some conversions have their own definitions, so we need to implement them. For example, Table 2 in ITU-R BS.775-3 defines the downmix equations from 3F2 to 1F, 2F, 3F, 2F1, 3F1, and 2F2.
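As an illustration, a 3F2 to 2F downmix could look something like the sketch below. The interleaved input order (L, R, C, LS, RS), the function name, and the float sample format are assumptions for the example, and the coefficients are the 2/0 row as I read Table 2, so they should be verified against the recommendation.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed interleaved input order for 3F2: L, R, C, LS, RS (no LFE). */
enum { IN_CHANNELS = 5, OUT_CHANNELS = 2 };

/* Rows are output channels (L', R'); columns follow the input order above.
   Coefficients for the 2/0 target; verify against ITU-R BS.775-3 Table 2. */
static const float DOWNMIX_3F2_TO_2F[OUT_CHANNELS][IN_CHANNELS] = {
  { 1.0f, 0.0f, 0.7071f, 0.7071f, 0.0f },
  { 0.0f, 1.0f, 0.7071f, 0.0f, 0.7071f },
};

/* Downmixes `frames` interleaved 5-channel frames into interleaved stereo. */
void downmix_3f2_to_2f(const float * in, float * out, size_t frames)
{
  assert(in != NULL && out != NULL);
  for (size_t f = 0; f < frames; ++f) {
    for (size_t o = 0; o < OUT_CHANNELS; ++o) {
      float acc = 0.0f;
      for (size_t i = 0; i < IN_CHANNELS; ++i) {
        acc += DOWNMIX_3F2_TO_2F[o][i] * in[f * IN_CHANNELS + i];
      }
      out[f * OUT_CHANNELS + o] = acc;
    }
  }
}
```

Note that each output row sums to about 2.41, so full-scale input can exceed the [-1, 1] range; whether to normalize or clip is left open here.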

Mixing by mapping channel data

In most cases, the input and output channels can be mapped by their layout settings. For example, to downmix from 3F (L, R, C) to stereo (L, R), we only need to pass the first two input channels to the output.
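A minimal sketch of mapping by layout follows. The `channel_id` labels and the function names are hypothetical and are not cubeb's layout types. The function returns false without writing anything when an output channel has no counterpart in the input, so a caller could move on to the next mechanism.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical channel labels; cubeb's real layout types may differ. */
typedef enum { CH_L, CH_R, CH_C, CH_LS, CH_RS } channel_id;

/* Returns the index of `wanted` in `layout`, or -1 if it is absent. */
static int find_channel(const channel_id * layout, size_t count, channel_id wanted)
{
  for (size_t i = 0; i < count; ++i) {
    if (layout[i] == wanted) {
      return (int)i;
    }
  }
  return -1;
}

/* Copies each output channel from the input channel with the same label.
   Samples are interleaved floats. Returns false if any label is missing. */
bool mix_by_mapping(const float * in, const channel_id * in_layout, size_t in_ch,
                    float * out, const channel_id * out_layout, size_t out_ch,
                    size_t frames)
{
  assert(in && in_layout && out && out_layout);
  /* First pass: bail out before writing anything if a label is missing. */
  for (size_t o = 0; o < out_ch; ++o) {
    if (find_channel(in_layout, in_ch, out_layout[o]) < 0) {
      return false;
    }
  }
  /* Second pass: copy the matched channels frame by frame. */
  for (size_t o = 0; o < out_ch; ++o) {
    size_t src = (size_t)find_channel(in_layout, in_ch, out_layout[o]);
    for (size_t f = 0; f < frames; ++f) {
      out[f * out_ch + o] = in[f * in_ch + src];
    }
  }
  return true;
}
```

With a 3F input and a stereo output this copies L and R and drops C, which matches the example above.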

Mixing by bypassing channel data

There are some cases the above mechanisms don't cover. The downmix from stereo (L, R) to mono (M) is an example: there is no spec and no matching channel for this conversion. In particular, WASAPI can support mismatched speaker settings, such as 6 channels with a stereo layout (stereo should only have 2 channels). In such cases, we don't know whether the mixing policy should follow the layout or the channel count.

The simplest plan is to follow the channel count and pass the channel data by channel index. If the input has 2 channels and the output has 1 channel, we just pass the first channel's data to the output.
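Passing by index could be as simple as the sketch below. The function name and float format are assumptions, and filling the extra output channels with silence when upmixing is my own choice for the example; the plan above only describes the downmix direction.

```c
#include <assert.h>
#include <stddef.h>

/* Copies channels by index, ignoring layouts. Samples are interleaved floats.
   Assumption: output channels beyond the input count are filled with silence. */
void mix_by_index(const float * in, size_t in_ch,
                  float * out, size_t out_ch, size_t frames)
{
  assert(in != NULL && out != NULL);
  const size_t shared = in_ch < out_ch ? in_ch : out_ch;
  for (size_t f = 0; f < frames; ++f) {
    /* Channels present on both sides are passed through unchanged. */
    for (size_t c = 0; c < shared; ++c) {
      out[f * out_ch + c] = in[f * in_ch + c];
    }
    /* Any remaining output channels get silence. */
    for (size_t c = shared; c < out_ch; ++c) {
      out[f * out_ch + c] = 0.0f;
    }
  }
}
```

Since it depends only on the channel counts, this always produces output, which is what makes it usable as the last resort.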

An alternative is to define matrices to compress/expand the audio data. However, the number of combinations is not small.
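To give a feel for the matrix approach, here is a sketch with a small table keyed by channel counts. The type and function names are hypothetical, and the two sets of coefficients (averaging for stereo to mono, duplication for mono to stereo) are illustrative choices that do not come from any spec. If channel counts from 1 through 8 were all supported, the table would need 8 × 7 = 56 entries for the m ≠ n pairs, which is the scaling concern.

```c
#include <assert.h>
#include <stddef.h>

/* One entry per supported (input, output) channel-count pair.
   coeffs is row-major: out_ch rows of in_ch columns. */
typedef struct {
  size_t in_ch;
  size_t out_ch;
  const float * coeffs;
} mix_matrix;

/* Illustrative coefficients only; they do not come from any specification. */
static const float STEREO_TO_MONO[] = { 0.5f, 0.5f }; /* M = (L + R) / 2 */
static const float MONO_TO_STEREO[] = { 1.0f, 1.0f }; /* L = R = M */

static const mix_matrix MATRICES[] = {
  { 2, 1, STEREO_TO_MONO },
  { 1, 2, MONO_TO_STEREO },
};

/* Returns the matrix for the given channel counts, or NULL if none is defined. */
const mix_matrix * find_matrix(size_t in_ch, size_t out_ch)
{
  for (size_t i = 0; i < sizeof(MATRICES) / sizeof(MATRICES[0]); ++i) {
    if (MATRICES[i].in_ch == in_ch && MATRICES[i].out_ch == out_ch) {
      return &MATRICES[i];
    }
  }
  return NULL;
}

/* Applies the matrix to interleaved float frames. */
void apply_matrix(const mix_matrix * m, const float * in, float * out, size_t frames)
{
  assert(m != NULL && in != NULL && out != NULL);
  for (size_t f = 0; f < frames; ++f) {
    for (size_t o = 0; o < m->out_ch; ++o) {
      float acc = 0.0f;
      for (size_t i = 0; i < m->in_ch; ++i) {
        acc += m->coeffs[o * m->in_ch + i] * in[f * m->in_ch + i];
      }
      out[f * m->out_ch + o] = acc;
    }
  }
}
```

A missing entry shows up as a NULL lookup, so the index-based pass-through could still serve as the fallback for pairs nobody has defined.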

ChunMinChang commented 7 years ago

For testing, I am wondering if it's feasible to fake an audio device and programmatically register it as the default audio output. Then we could intercept and verify the output through the faked device. Maybe this should be discussed in #193.