androidx / media

Jetpack Media3 support libraries for media use cases, including ExoPlayer, an extensible media player for Android
Apache License 2.0

Use MediaCodec audio decoder output buffer channel mask, rather than channel count, when creating AudioTrack #1471

Open dwhea opened 1 week ago

dwhea commented 1 week ago

[REQUIRED] Use case description

Multichannel MediaCodec audio decoders generate multichannel audio output buffers. Since Android 13, the channel mask of these output buffers should be provided in the KEY_CHANNEL_MASK field (see the Audio Decoding section of the CDD: https://source.android.com/docs/compatibility/13/android-13-cdd#51_media_codecs).

Currently, ExoPlayer does not query KEY_CHANNEL_MASK. Instead, it reads MediaFormat.KEY_CHANNEL_COUNT, maps this channel count to a channel position mask in androidx/media3/common/util/Util.java getAudioTrackChannelConfig(int channelCount), and then uses that channel position mask to create the AudioTrack. The count-to-mask mapping is a "canonical channel mask" lookup that matches the definition from https://developer.android.com/reference/android/media/AudioFormat, and ExoPlayer recently extended it to channel counts 10 and 12.

The issue is that some MediaCodec decoders can generate audio output buffers whose channel mask differs from the canonical channel mask. An example would be a decoder generating https://developer.android.com/reference/android/media/AudioFormat#CHANNEL_OUT_5POINT1POINT2 output instead of the canonical 8-channel output mask https://developer.android.com/reference/android/media/AudioFormat#CHANNEL_OUT_7POINT1_SURROUND. When ExoPlayer uses the canonical mask instead of the actual one, the result can be playback failure (on some systems), incorrect routing, or incorrect mixing.
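To make the mismatch concrete, here is a minimal sketch in plain Java (no Android dependencies). The hex values are copied from the AudioFormat reference as I understand it and should be checked against the platform constants; the class and method names are hypothetical and not part of ExoPlayer. It shows that both masks carry 8 channels, so a count-only lookup cannot tell them apart.

```java
// Hypothetical demo class: not part of ExoPlayer or the Android SDK.
final class ChannelMaskMismatchDemo {
  // Assumed values, mirroring android.media.AudioFormat (verify against the SDK).
  static final int CHANNEL_OUT_5POINT1 = 0xFC; // FL, FR, FC, LFE, BL, BR
  static final int CHANNEL_OUT_SIDE_LEFT = 0x800;
  static final int CHANNEL_OUT_SIDE_RIGHT = 0x1000;
  static final int CHANNEL_OUT_TOP_SIDE_LEFT = 0x100000;
  static final int CHANNEL_OUT_TOP_SIDE_RIGHT = 0x200000;

  static final int CHANNEL_OUT_7POINT1_SURROUND =
      CHANNEL_OUT_5POINT1 | CHANNEL_OUT_SIDE_LEFT | CHANNEL_OUT_SIDE_RIGHT;
  static final int CHANNEL_OUT_5POINT1POINT2 =
      CHANNEL_OUT_5POINT1 | CHANNEL_OUT_TOP_SIDE_LEFT | CHANNEL_OUT_TOP_SIDE_RIGHT;

  /** For a channel position mask, each set bit is one speaker position. */
  static int channelCount(int positionMask) {
    return Integer.bitCount(positionMask);
  }

  /** True when two masks have the same channel count but different speaker positions. */
  static boolean sameCountDifferentLayout(int maskA, int maskB) {
    return channelCount(maskA) == channelCount(maskB) && maskA != maskB;
  }

  public static void main(String[] args) {
    System.out.println("7.1 surround channels: " + channelCount(CHANNEL_OUT_7POINT1_SURROUND));
    System.out.println("5.1.2 channels: " + channelCount(CHANNEL_OUT_5POINT1POINT2));
    System.out.println(
        "ambiguous by count: "
            + sameCountDifferentLayout(CHANNEL_OUT_7POINT1_SURROUND, CHANNEL_OUT_5POINT1POINT2));
  }
}
```

The two masks differ only in whether the extra pair is the side speakers or the top side speakers, which is exactly the information a channel count discards.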

Proposed solution

ExoPlayer uses channel count, rather than mask, extensively throughout the project. An envisaged solution would be to deprecate (or remove) usage of channel count through much of the media and ExoPlayer project in favor of channel mask (from which the channel count can be inferred). On systems or codecs where KEY_CHANNEL_MASK is not available (before Android 13, or decoders that don't generate the key), the legacy count-to-mask mapping will still need to be retained. For most of ExoPlayer's internals, though, the mask could be used, with the count used only where the mask is not defined.
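A rough sketch of that "prefer the mask, fall back to the count" rule, in plain Java so it runs without Android. The decoder output format is modelled as a Map rather than a real MediaFormat. The key strings are what I believe MediaFormat.KEY_CHANNEL_MASK and MediaFormat.KEY_CHANNEL_COUNT resolve to. The class, the NO_MASK sentinel, and the abbreviated canonical table are all hypothetical, not ExoPlayer's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: illustrates the proposal, not ExoPlayer's real code.
final class ChannelMaskResolver {
  // Assumed string values of MediaFormat.KEY_CHANNEL_MASK / KEY_CHANNEL_COUNT.
  static final String KEY_CHANNEL_MASK = "channel-mask";
  static final String KEY_CHANNEL_COUNT = "channel-count";

  /** Hypothetical sentinel meaning "no usable mask". */
  static final int NO_MASK = 0;

  /** Abbreviated legacy lookup: canonical position mask for a channel count. */
  static int canonicalMaskForCount(int channelCount) {
    switch (channelCount) {
      case 1:
        return 0x4; // mono (front left)
      case 2:
        return 0xC; // stereo
      case 6:
        return 0xFC; // 5.1
      case 8:
        return 0x18FC; // 7.1 surround
      default:
        return NO_MASK; // other counts omitted in this sketch
    }
  }

  /** Uses the decoder-reported mask when present, else the canonical mask for the count. */
  static int resolveChannelMask(Map<String, Integer> outputFormat) {
    Integer reportedMask = outputFormat.get(KEY_CHANNEL_MASK);
    if (reportedMask != null && reportedMask != NO_MASK) {
      return reportedMask;
    }
    Integer channelCount = outputFormat.get(KEY_CHANNEL_COUNT);
    return channelCount == null ? NO_MASK : canonicalMaskForCount(channelCount);
  }

  public static void main(String[] args) {
    Map<String, Integer> format = new HashMap<>();
    format.put(KEY_CHANNEL_COUNT, 8);
    System.out.println("count only: 0x" + Integer.toHexString(resolveChannelMask(format)));
    format.put(KEY_CHANNEL_MASK, 0x3000FC); // 5.1.2 as reported by the decoder
    System.out.println("with mask: 0x" + Integer.toHexString(resolveChannelMask(format)));
  }
}
```

With only a count of 8 this falls back to the canonical 7.1 surround mask; once the decoder reports a 5.1.2 mask, that value wins, and the count can still be recovered from it with Integer.bitCount.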

Alternatives considered

Requiring MediaCodec decoders to conform to ExoPlayer's canonical channel mask definitions, even when they generate different output audio, is a tempting quick hack, but it leads to the audio playback issues mentioned earlier. Forbidding MediaCodec decoders from generating non-canonical channel mask output is another option, but this would be too restrictive for the different device codec implementations.

tonihei commented 1 week ago

Thanks for the well-explained request! I agree that propagating the channel mask (not just the channel count) is the right solution here. @tianyif: Can you remember if there is already an internal or GitHub bug about this and whether we considered adding the channel mask to Format directly?

tianyif commented 3 days ago

We have an internal bug tracking improvements to parsing channel masks from media containers, and adding the channel mask to Format is definitely a subtask of it. I'm keeping this as an enhancement for now.