MediaArea / MediaInfoLib

Convenient unified display of the most relevant technical and tag data for video and audio files.
https://mediaarea.net/MediaInfo
BSD 2-Clause "Simplified" License

Stereo FDK-AAC HEv2 shows only one 'C' in 'Channel layout' #1453

Open eugenesvk opened 2 years ago

eugenesvk commented 2 years ago

For some reason, MediaInfo shows only one channel, C, in Channel layout for a stereo file when it is encoded with the aac_he_v2 profile of the FDK-AAC encoder. Please see the script below, where I convert a generated WAV file using both the aac_he_v2 and aac_low profiles.

MediaInfo output differs for these two files: Channel layout is C for HEv2 and L R for LC.

Maybe I'm misunderstanding what this layout means, but shouldn't it always be two channels for a stereo file (in both cases the Channel(s) field has the correct value of 2 channels)?

# 1. Generate a test 5-second wav file
ffmpeg -f lavfi -i "sine=frequency=1000:duration=5" -ac 2 "sine2ch5s.wav"

# 2. Convert sine2ch5s.wav to FDK-AAC using the HEv2 profile
ffmpeg -i "sine2ch5s.wav"   \
  -c:a "libfdk_aac"         \
  -afterburner 1            \
  -vbr 3                    \
  -profile:a "aac_he_v2"    \
  "sine2ch5s_wav_AAC-FDK_HEv2_vbr3_ffmpeg.m4a"

# 3. Convert sine2ch5s.wav to FDK-AAC using the LC profile
ffmpeg -i "sine2ch5s.wav"   \
  -c:a "libfdk_aac"         \
  -afterburner 1            \
  -vbr 3                    \
  -profile:a "aac_low"      \
  "sine2ch5s_wav_AAC-FDK_LC_vbr3_ffmpeg.m4a"

JeromeMartinez commented 2 years ago

We don't have libfdk_aac, please attach a sample file.

Maybe I'm misunderstanding what this layout means, but shouldn't it always be two channels for a stereo file (in both cases the Channel(s) field has the correct value of 2 channels)?

We rely on file metadata; I'll check the metadata in the resulting file. A wrong "C" metadata value somewhere would not be a surprise, because HE-AAC is basically a mono stream (and is handled as mono by decoders that support AAC but not HE-AAC) with spectral band replication.

eugenesvk commented 2 years ago

Sure, here is a sine2ch5s_wav_AAC-FDK_HEv2_vbr3_ffmpeg.m4a file (though I thought a non-file-based solution would be simpler to replicate).

By the way, HE (that is, AAC LC with SBR but without PS, Parametric Stereo) shows the same L R as LC, so it might be PS rather than SBR that is to blame.

Now that you mention metadata, I've dug a bit into the Debug mode in MediaInfo, and there is indeed a difference between HEv2

channelConfiguration:       1 (0x1) - (4 bits) - Front: C

and HE (or LC)

channelConfiguration:       2 (0x2) - (4 bits) - Front: L R

though all 3 files have the same channel count

channelcount (2):                2 (0x0002)

and from the PS wiki article:

An AAC HE v2 bitstream is obtained by downmixing the stereo audio to mono at the encoder along with 2–3 kbit/s of side info (the Parametric Stereo information) in order to describe the spatial intensity stereo generation and ambience regeneration at the decoder.

So you're right, it's indeed one channel with extra info to get back to two channels, and indeed nothing's wrong with MediaInfo. Thanks for pointing me in the right direction!
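For readers who want to see where a channelConfiguration value like the ones in the debug output above comes from, here is a minimal Python sketch that reads the first 13 bits of an MPEG-4 AudioSpecificConfig (5 bits audioObjectType, 4 bits samplingFrequencyIndex, 4 bits channelConfiguration). The function name `parse_asc_prefix` and the example byte strings are illustrative assumptions, not values extracted from the files in this thread, and the sketch does not handle the escape values (audioObjectType 31, samplingFrequencyIndex 15).

```python
# Front layouts for the two channelConfiguration values seen in this thread.
FRONT_LAYOUT = {1: "C", 2: "L R"}


def parse_asc_prefix(asc: bytes) -> dict:
    """Read the first 13 bits of an AudioSpecificConfig.

    Assumes the short form: no audioObjectType escape (31) and no explicit
    sampling frequency (samplingFrequencyIndex 15).
    """
    if len(asc) < 2:
        raise ValueError("need at least 2 bytes")
    bits = int.from_bytes(asc[:2], "big")  # 16 bits, most significant first
    return {
        "audioObjectType": (bits >> 11) & 0x1F,         # top 5 bits
        "samplingFrequencyIndex": (bits >> 7) & 0x0F,   # next 4 bits
        "channelConfiguration": (bits >> 3) & 0x0F,     # next 4 bits
    }


# Hypothetical descriptors: AAC LC core, channelConfiguration 2 and 1.
stereo = parse_asc_prefix(bytes([0x12, 0x10]))
mono = parse_asc_prefix(bytes([0x13, 0x88]))
print(FRONT_LAYOUT[stereo["channelConfiguration"]])
print(FRONT_LAYOUT[mono["channelConfiguration"]])
```

With implicit signaling, nothing in these 13 bits says that Parametric Stereo is present, which is why a parser that looks only at the descriptor sees a single front channel.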

JeromeMartinez commented 2 years ago

(though I thought a non-file-based solution would be simpler to replicate)

It is good, but in this case I did not have the right configuration. Copy: sine2ch5s_wav_AAC-FDK_HEv2_vbr3_ffmpeg.zip

so it might not be the SBR to blame, but the PS

Oops, I mixed the two up. True, I was talking about Parametric Stereo.

Now that you mentioned metadata,

Yes and no: it is implicit signaling (no info in the descriptor), but it is still all about AAC, so I think we should correctly provide the "L R" info when we detect PS (the two are linked). I was thinking of extra metadata in e.g. the MP4 container, but that is not the case. So this is a coherency issue; I am reopening the ticket.
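The coherency rule described above can be sketched as follows. This is only an illustration of the idea, not MediaInfoLib's actual code: the function name `reported_layout` and the `ps_detected` flag are hypothetical, and the sketch assumes the parser already knows whether Parametric Stereo data was found in the stream.

```python
def reported_layout(channel_configuration: int, ps_detected: bool) -> str:
    """Return the channel layout to display for a front-only AAC stream.

    Hypothetical rule: a mono core (channelConfiguration 1) that carries
    Parametric Stereo decodes to two channels, so report "L R" for it.
    """
    base = {1: "C", 2: "L R"}.get(channel_configuration, "?")
    if channel_configuration == 1 and ps_detected:
        return "L R"
    return base


print(reported_layout(1, ps_detected=True))   # HEv2 case from this thread
print(reported_layout(1, ps_detected=False))  # plain mono stream
print(reported_layout(2, ps_detected=False))  # LC or HE stereo
```

Under this rule the layout stays consistent with the Channel(s) count of 2 that MediaInfo already reports for the HEv2 file.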