xiph / flac

Free Lossless Audio Codec
https://xiph.org/flac/
GNU Free Documentation License v1.3

FLAC in MP4 and Ogg specs: how to signal custom channel layout #636

Closed: jprjr closed this issue 1 year ago

jprjr commented 1 year ago

Hi there, I had two questions around specifications.

The IETF FLAC draft defines how the WAVEFORMATEXTENSIBLE_CHANNEL_MASK Vorbis Comment is to be interpreted for a native FLAC file.
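For readers unfamiliar with the field, here is a minimal Python sketch of how a decoder might read it. It assumes the comments have already been extracted as a list of `NAME=value` strings, that the value is a hexadecimal mask with a `0x` prefix, and that the bits follow the usual WAVEFORMATEXTENSIBLE speaker order. The function names and the short speaker labels are hypothetical and belong to no library.

```python
# Speaker labels indexed by bit position, following the WAVEFORMATEXTENSIBLE
# dwChannelMask convention (bit 0 = front left, bit 1 = front right, ...).
# The short labels are illustrative abbreviations, not official names.
SPEAKER_BITS = [
    "FL", "FR", "FC", "LFE", "BL", "BR", "FLC", "FRC", "BC",
    "SL", "SR", "TC", "TFL", "TFC", "TFR", "TBL", "TBC", "TBR",
]

FIELD_NAME = "WAVEFORMATEXTENSIBLE_CHANNEL_MASK"


def find_channel_mask(comments):
    """Return the channel mask as an int, or None if absent or malformed.

    `comments` is a list of 'NAME=value' strings from a Vorbis Comment block.
    Field names are compared case-insensitively.
    """
    for entry in comments:
        name, sep, value = entry.partition("=")
        if sep and name.upper() == FIELD_NAME:
            try:
                mask = int(value.strip(), 16)  # base 16 accepts a "0x" prefix
            except ValueError:
                return None
            return mask if mask >= 0 else None
    return None


def mask_to_speakers(mask):
    """List the speaker labels whose bits are set in the mask, lowest bit first."""
    return [label for bit, label in enumerate(SPEAKER_BITS) if mask & (1 << bit)]


comments = ["TITLE=Example", "WAVEFORMATEXTENSIBLE_CHANNEL_MASK=0x0607"]
mask = find_channel_mask(comments)
print(mask_to_speakers(mask))  # ['FL', 'FR', 'FC', 'SL', 'SR']
```

Nothing in this sketch depends on the container: the same list of comments could come from a native FLAC file, an Ogg stream, or a metadata block carried in MP4.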

For Ogg:

I noticed the reference FLAC encoder will set WAVEFORMATEXTENSIBLE_CHANNEL_MASK when encoding to an Ogg file, and honor the field when decoding from an Ogg file. The Ogg Mapping doesn't directly mention handling WAVEFORMATEXTENSIBLE_CHANNEL_MASK, and when referring to the FLAC specification, it links to "FLAC - Format", which also doesn't mention WAVEFORMATEXTENSIBLE_CHANNEL_MASK (though it does provide a link to the IETF draft).

I just wanted to verify, should FLAC decoders honor the WAVEFORMATEXTENSIBLE_CHANNEL_MASK comment in Ogg files?

For MP4 / ISO BMFF:

The Encapsulation of FLAC in ISO Base Media File Format document provides "FLAC - Format" as a normative reference to the FLAC format, so like Ogg, there's an additional "hop" between the ISO BMFF spec and the definition of WAVEFORMATEXTENSIBLE_CHANNEL_MASK, as opposed to a direct link. Additionally, MP4 has a box (chnl) available for specifying a channel layout.

How should custom channel layouts for FLAC streams in MP4 files be signaled? The spec allows multiple FLAC metadata blocks in the FLAC Specific Box. Should decoders look for a Vorbis Comment block and parse the WAVEFORMATEXTENSIBLE_CHANNEL_MASK comment? Or should the channels be signaled via a chnl box? Or something else entirely?

ktmf01 commented 1 year ago

This is the wrong place for this issue. https://github.com/ietf-wg-cellar/flac-specification/issues would be more appropriate. I am unable to transfer it, however.

"The IETF FLAC draft defines how the WAVEFORMATEXTENSIBLE_CHANNEL_MASK Vorbis Comment is to be interpreted for a native FLAC file."

The document says nowhere that this part of the specification is specific to non-encapsulated FLAC. All other parts of the specification are valid in a container, so why wouldn't this bit be too?

jprjr commented 1 year ago

For background: the reason I'm asking is that I opened bugs with ffmpeg about it not honoring the WAVEFORMATEXTENSIBLE_CHANNEL_MASK Vorbis Comment (one bug for Ogg, one bug for MP4). There's been no response on the Ogg bug, and the MP4 bug got a response of "this isn't a bug."

I read through the documentation and I think where I got hung up with the IETF FLAC draft is in the opening abstract (emphasis mine):

"This document defines the Free Lossless Audio Codec (FLAC) format and its streamable subset."

I interpreted that to mean it's defining the codec and native FLAC file format. That might be from me coming from the background of usually interpreting "format" to mean "encapsulation" (like Ogg, MP4, MKV, etc), and "codec" to mean encoded packetized audio (like Vorbis, Opus, etc).

Just to be completely honest, the thing I'm more interested in is getting the clarification that yes - to be compliant with the specs means parsing and honoring the WAVEFORMATEXTENSIBLE_CHANNEL_MASK comment when encapsulated in Ogg and MP4 files (and presumably, any other current or future encapsulation that allows Vorbis Comment blocks), so I can take that back to the ffmpeg bugs.

I do apologize that I'm about to ask the question again. I'm not trying to be antagonistic or otherwise any kind of hassle or annoyance. I just want things to be crystal clear.

The two main questions I asked were:

  1. Should FLAC decoders honor the WAVEFORMATEXTENSIBLE_CHANNEL_MASK comment in Ogg files?
  2. How should custom channel layouts for FLAC streams in MP4 files be signaled?

The response above was:

"The document says nowhere that this part of the specification is specific to non-encapsulated FLAC. All other parts of the specification are valid in a container, so why wouldn't this bit be too?"

I'm around 99% sure that's implying that since all parts of the specification are considered valid for all encapsulations, all encapsulations should honor the comment. It's just that it's an implied answer (answering a question with another question).

On the internet we have language barriers, differing backgrounds, etc, so I try to avoid implications/assumptions and to be direct. So with that in mind I just want to clarify that the answers to my questions are:

  1. Yes, FLAC decoders should honor the WAVEFORMATEXTENSIBLE_CHANNEL_MASK comment in Ogg files.
  2. Custom channel layouts for FLAC streams in MP4 files should be signaled with a WAVEFORMATEXTENSIBLE_CHANNEL_MASK comment, in a Vorbis Comment block, within the FLAC Specific Box.

Thank you in advance.

ktmf01 commented 1 year ago

The document defines the FLAC format and some encapsulations. The encapsulations work by just taking parts of the FLAC stream and encapsulating them. Some parts are duplicated (like the coded number and granule position in Ogg), and the duplicates should then be the same. In Ogg there is no way to specifically signal channel ordering, so the channel mask in the Vorbis Comment should be honored.

I have next to no knowledge of MP4. I tried to understand how FLAC-in-MP4 works and summarized https://github.com/xiph/flac/blob/master/doc/isoflac.txt. The ffmpeg people have much more knowledge of MP4 than I do. I'd say that if MP4 indeed has something that conveys the same information as WAVEFORMATEXTENSIBLE_CHANNEL_MASK, then that MP4-specific way has priority, but WAVEFORMATEXTENSIBLE_CHANNEL_MASK should agree with it.
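The precedence described here could be sketched roughly as follows in Python. This is only an illustration of the rule as stated in this comment, not anything taken from the specs or from ffmpeg. It assumes the container-level layout (for example from a chnl box) has already been converted to the same bitmask form as the comment value; that conversion is not shown, and `resolve_channel_mask` is a hypothetical name.

```python
def resolve_channel_mask(container_mask, comment_mask):
    """Pick the channel mask to use and report whether the sources agree.

    container_mask: mask derived from container-level signaling, or None
                    (assumed to be pre-converted to a WAVEFORMATEXTENSIBLE mask).
    comment_mask:   mask parsed from WAVEFORMATEXTENSIBLE_CHANNEL_MASK, or None.

    Returns (mask, consistent). The container value wins when present;
    `consistent` is False only when both are present and they differ.
    """
    if container_mask is not None:
        consistent = comment_mask is None or comment_mask == container_mask
        return container_mask, consistent
    # No container-level signaling (as in Ogg): fall back to the comment.
    return comment_mask, True


print(resolve_channel_mask(0x3F, 0x607))  # (63, False): container wins, mismatch flagged
print(resolve_channel_mask(None, 0x607))  # (1543, True): only the comment is available
```

What a decoder should do when the two disagree (warn, reject, or silently prefer one) is not settled in this thread, so the sketch only reports the mismatch.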

jprjr commented 1 year ago

Great! Thank you for taking the time, I'll close this issue out.