mstorsjo / fdk-aac

A standalone library of the Fraunhofer FDK AAC code from Android.
https://sourceforge.net/projects/opencore-amr/

Decoding a HE-AACv2 stream fails in FFmpeg #25

Open Vangelis66 opened 9 years ago

Vangelis66 commented 9 years ago

Hello Martin :-)

Many thanks for your hard work! I recently was able to compile on my Vista x86 laptop a non-free FFmpeg binary, with: --enable-gpl --enable-nonfree --enable-libfdk-aac using jb's build script @ https://github.com/jb-alvarado/media-autobuild_suite

I was conducting some experiments lately using libfdk-aac as an AAC decoder, overriding FFmpeg's native aac decoder, to investigate the difference, if any...

As input files I used some BBC Radio 1 live streams. While the following commands worked as expected:

ffmpeg -c:a libfdk_aac -re -i "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_intl_lc_radio1_q" -t 60 -c:a libmp3lame -b:a 128k -ar 44100 -ac 2 "BBC Radio 1 Live (56aaclc-128mp3).mp3"

(stream is encoded with AAC LC profile) and

ffmpeg -c:a libfdk_aac -re -i "http://as-hls-ww-live.bbcfmt.vo.llnwd.net/pool_7/live/bbc_radio_one/bbc_radio_one.isml/bbc_radio_one-audio=48000.m3u8" -t 60 -c:a libmp3lame -b:a 112k -ar 44100 -ac 2 "BBC Radio 1 Live (48heaac-112mp3).mp3"

(stream is encoded with HE-AACv1 profile), the next command, in which the live stream is encoded with the HE-AACv2 profile, causes FFmpeg to crash:

ffmpeg -c:a libfdk_aac -re -i "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_intl_he_radio1_q" -t 60 -c:a libmp3lame -b:a 112k -ar 44100 -ac 2 "BBC Radio 1 Live (48heaac2-112mp3).mp3"

Sometimes I get an "Error number -12 occurred" :

Stream mapping:
  Stream #0:0 -> #0:0 (aac (libfdk_aac) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
Error while decoding stream #0:0: Error number -12 occurred
    Last message repeated 1 times

but usually ffmpeg crashes right after I hit Enter; adding -v 48 as an input option does not reveal anything useful, either. Sadly, I am not savvy enough to debug this myself. I know libfdk_aac can encode with "-profile:a aac_he_v2", so I took it as a given that it could decode it, too (which is probably the case). However, this fails in my build and on my system...

It looks as though libfdk_aac picks up the stream as AAC (LC), mono, with a sample rate of 22050 Hz:

Stream #0:0, 50, 1/28224000: Audio: aac, 22050 Hz, mono, s16, 46 kb/s

whereas with the native decoder it is correctly identified as HE-AACv2:

Stream #0:0, 50, 1/28224000: Audio: aac (HE-AACv2), 44100 Hz, stereo, fltp, 44 kb/s

Any insight will be greatly appreciated! Keep up the pristine job!

Kind regards, Vangelis.

mstorsjo commented 9 years ago

First off, the libavcodec decoder probably is a better choice for most cases. The wrapper for using fdk-aac for decoding via libavcodec was added before libavcodec itself supported the newer profiles like AAC-LD and AAC-ELD. There shouldn't be much difference - except that the fdk decoder wrapper (that is, the libavcodec integration of it, not the fdk-aac decoder itself) is less tested.

As for the missing profile info, this is just because the fdk-aac decoder wrapper doesn't expose the profile info at all.

The issue seems to be that the fdk-aac decoder first returns a few mono frames before it realizes that this should decode into stereo, and the libavcodec fdk-aac decoder wrapper doesn't support streams that change format right now.

You could try something like this:

diff --git a/libavcodec/libfdk-aacdec.c b/libavcodec/libfdk-aacdec.c
index f789b75..ab7bdfa 100644
--- a/libavcodec/libfdk-aacdec.c
+++ b/libavcodec/libfdk-aacdec.c
@@ -328,7 +328,7 @@ static int fdk_aac_decode_frame(AVCodecContext *avctx, void *data,
         return AVERROR_INVALIDDATA;
     }

-    if (s->initialized) {
+    if (s->initialized && 0) {
         frame->nb_samples = avctx->frame_size;
         if ((ret = ff_get_buffer(avctx, frame, 0)) < 0) {
             av_log(avctx, AV_LOG_ERROR, "ff_get_buffer() failed\n");
@@ -366,7 +366,7 @@ static int fdk_aac_decode_frame(AVCodecContext *avctx, void *data,
         goto end;
     }

-    if (!s->initialized) {
+    if (!s->initialized || 1) {
         if ((ret = get_stream_info(avctx)) < 0)
             goto end;
         s->initialized = 1;
@@ -385,7 +385,7 @@ static int fdk_aac_decode_frame(AVCodecContext *avctx, void *data,
                avctx->channels * avctx->frame_size *
                av_get_bytes_per_sample(avctx->sample_fmt));

-        if (!s->anc_buffer)
+        if (!s->anc_buffer && 0)
             av_freep(&s->decoder_buffer);
     }

This is a crude hack to make it support changing stream format a bit better, at the expense of more unnecessary copying of data.
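
For illustration, the kind of per-frame check that the wrapper lacks could be sketched as follows. This is a minimal, self-contained sketch and not the actual libavcodec or fdk-aac code: `StreamFormat`, `update_format` and `frame_bytes` are hypothetical names, and 16-bit samples are assumed because the stream is reported as s16 in the log above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the output format reported after each frame. */
typedef struct {
    int sample_rate; /* Hz */
    int channels;
} StreamFormat;

/* Store the newly reported format; return true if it differs from the
 * format the caller had previously set up for. */
bool update_format(StreamFormat *current, StreamFormat reported)
{
    if (current->sample_rate == reported.sample_rate &&
        current->channels == reported.channels)
        return false;
    *current = reported;
    return true;
}

/* Bytes needed for one frame of interleaved 16-bit PCM. */
size_t frame_bytes(StreamFormat fmt, size_t samples_per_channel)
{
    return (size_t)fmt.channels * samples_per_channel * sizeof(int16_t);
}
```

A caller would run `update_format` after every decoded frame and resize its output buffer with `frame_bytes` whenever it returns true, rather than sizing the buffer once from the first frame.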

Vangelis66 commented 9 years ago

Hi again! Many thanks for taking the time to look into this!

> The issue seems to be that the fdk-aac decoder first returns a few mono frames before it realizes that this should decode into stereo, and the libavcodec fdk-aac decoder wrapper doesn't support streams that change format right now.
>
> You could try something like this:

I have recompiled with your suggested patch applied; excellent news: my ffmpeg binary does not crash anymore! The input stream is still recognised as mono with a sample rate of 22050 Hz; the libfdk_aac decoder decodes some initial frames (ca. 0.5 sec) as such, but then proceeds to decode in stereo at 44.1 kHz. Command Prompt window output follows:

D:\FFmpeg-2.5.3-x86-static.-headers_patch.Me.13-Jan-2015>ffmpeg -c:a libfdk_aac
-re -i "http://bbcmedia.ic.llnwd.net/stream/bbcmedia_intl_he_radio1_q" -t 60 -c:
a libmp3lame -b:a 112k -ar 44100 -ac 2 "D:\Vangelis\Music\My Recordings\BBC Radi
o 1 Live (48heaac2-112mp3).mp3"
ffmpeg version 2.5.3 Copyright (c) 2000-2015 the FFmpeg developers
  built on Jan 26 2015 02:55:18 with gcc 4.9.2 (Rev2, Built by MSYS2 project)
  configuration: --arch=x86 --disable-debug --disable-shared --enable-doc --enab
le-gpl --enable-version3 --enable-runtime-cpudetect --enable-avfilter --enable-b
zlib --enable-zlib --enable-librtmp --enable-gnutls --enable-avisynth --enable-f
rei0r --enable-filter=frei0r --enable-libbluray --enable-libcaca --enable-libope
njpeg --enable-fontconfig --enable-libfreetype --enable-libass --enable-libgsm -
-enable-libilbc --enable-libmodplug --enable-libmp3lame --enable-libopencore-amr
nb --enable-libopencore-amrwb --enable-libvo-amrwbenc --enable-libschroedinger -
-enable-libsoxr --enable-libtwolame --enable-libspeex --enable-libtheora --enabl
e-libutvideo --enable-libvorbis --enable-libvo-aacenc --enable-openal --enable-l
ibopus --enable-libvidstab --enable-libvpx --enable-libwavpack --enable-libxavs
--enable-libx264 --enable-libx265 --enable-libxvid --enable-libzvbi --enable-non
free --enable-libfaac --enable-libfdk-aac
  libavutil      54. 15.100 / 54. 15.100
  libavcodec     56. 13.100 / 56. 13.100
  libavformat    56. 15.102 / 56. 15.102
  libavdevice    56.  3.100 / 56.  3.100
  libavfilter     5.  2.103 /  5.  2.103
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  1.100 /  1.  1.100
  libpostproc    53.  3.100 / 53.  3.100
Input #0, aac, from 'http://bbcmedia.ic.llnwd.net/stream/bbcmedia_intl_he_radio1
_q':
  Metadata:
    icy-br          : 48
    icy-genre       : BBC Radio Live
    icy-name        : BBC Radio 1
    icy-pub         : 1
    icy-reset       : 1
    StreamTitle     :
  Duration: N/A, bitrate: 45 kb/s
    Stream #0:0: Audio: aac, 22050 Hz, mono, s16, 45 kb/s
Output #0, mp3, to 'D:\Vangelis\Music\My Recordings\BBC Radio 1 Live (48heaac2-1
12mp3).mp3':
  Metadata:
    icy-br          : 48
    icy-genre       : BBC Radio Live
    icy-name        : BBC Radio 1
    icy-pub         : 1
    icy-reset       : 1
    StreamTitle     :
    TSSE            : Lavf56.15.102
    Stream #0:0: Audio: mp3 (libmp3lame), 44100 Hz, stereo, s16p, 112 kb/s
    Metadata:
      encoder         : Lavc56.13.100 libmp3lame
Stream mapping:
  Stream #0:0 -> #0:0 (aac (libfdk_aac) -> mp3 (libmp3lame))
Press [q] to stop, [?] for help
Input stream #0:0 frame changed from rate:22050 fmt:s16 ch:1 chl:mono to rate:44
100 fmt:s16 ch:2 chl:stereo
size=     821kB time=00:01:00.01 bitrate= 112.1kbits/s
video:0kB audio:821kB subtitle:0kB other streams:0kB global headers:0kB muxing o
verhead: 0.067467%

As you've said, it is a crude hack; the caveat is that I have to explicitly specify "-ar 44100 -ac 2" as (output) options in my ffmpeg command; otherwise, I end up with a mono 22050 Hz mp3 transcode. (I believe this is expected ffmpeg behaviour: if -ar and -ac are not specified, their values are copied over from the input stream; with the default (native) libavcodec decoder, the input stream is correctly identified and its properties "44100 Hz, stereo" are simply carried over to the output file...)

> the libavcodec decoder probably is a better choice for most cases.
> There shouldn't be much difference

This is reassuring coming from the "lips" of a top expert! I just (naively) thought that since the libfdk_aac encoder is superior by far to the native aac encoder, the same would apply to the decoder part of the library...

> except that the fdk decoder wrapper (that is, the libavcodec integration of it,
> not the fdk-aac decoder itself) is less tested.

So, just another tester here! Sticking to the native libavcodec AAC decoder it is, then! If you feel that any of my findings need to be shared with the ffmpeg devs, feel free to do so - I won't pursue this further myself; your help has been invaluable... All the best. Vangelis.

mstorsjo commented 9 years ago

> As you've said, it is a crude hack; and the caveat is that I have to explicitly specify " -ar 44100 -ac 2" as (output) options in my ffmpeg command

Actually, what I meant by "crude hack" is the way the patch was implemented (by just force-overriding a few branches in the code). Whether it detects the stream as mono or stereo is a different issue.

The fact that fdk-aac seems to default to mono in these cases and later switches isn't so much a hack as a design decision within fdk-aac, I would guess. With HE-AACv2 (aka parametric stereo) it isn't always easy to determine whether a stream is mono or stereo. In this case, fdk-aac seems to notice only after a few packets (I don't know the internals well enough to judge whether this is as good as it gets or if it could be done sooner, but I assume it would be done sooner if it were possible).

libavcodec's AAC decoder, on the other hand, makes the opposite decision: if unsure, it decodes to stereo. (This behaviour was added when that decoder got support for HE-AACv2.) If you feed it a mono HE-AACv1 stream, it will actually be decoded as stereo - probably for the same reason: instead of switching after a few frames (after realizing it changed), it prefers decoding to stereo if there's any chance that the stream will be HE-AACv2. (Some users tend to report this as a bug in fdk-aac, but it isn't; it's a design decision within the libavcodec decoder, and other decoders would behave differently.)

(After taking another look at that, it does seem that fdk-aac also decodes mono HE-AACv1 as stereo directly from the start, so I'm not quite sure exactly what is going on with these BBC streams. Suffice to say that for some reason it doesn't know it's stereo until a few frames down, while libavcodec's internal decoder goes to stereo directly from the start.)
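
The two start-up behaviours described above can be modelled with a small sketch. It only reproduces the formats seen in this thread (22050 Hz mono versus 44100 Hz stereo) and is not taken from either decoder; `AmbiguityPolicy`, `OutputFormat` and `initial_output_format` are hypothetical names. It assumes the usual HE-AAC layout in which SBR doubles the core sample rate and parametric stereo turns a mono core into stereo.

```c
/* Hypothetical policy for a stream whose header does not say whether
 * SBR / parametric stereo extensions are present. */
typedef enum {
    ASSUME_CORE_ONLY,  /* start with the core format, switch later if needed */
    ASSUME_SBR_AND_PS  /* start with the largest format the stream could need */
} AmbiguityPolicy;

typedef struct {
    int sample_rate; /* Hz */
    int channels;
} OutputFormat;

/* Output format announced before any extension data has been seen. */
OutputFormat initial_output_format(int core_rate, int core_channels,
                                   AmbiguityPolicy policy)
{
    OutputFormat out = { core_rate, core_channels };
    if (policy == ASSUME_SBR_AND_PS) {
        out.sample_rate = core_rate * 2; /* SBR doubles the output rate */
        if (core_channels == 1)
            out.channels = 2;            /* PS upmixes a mono core to stereo */
    }
    return out;
}
```

With a 22050 Hz mono core, `ASSUME_CORE_ONLY` yields the "22050 Hz, mono" that the libfdk_aac wrapper reported at start-up, while `ASSUME_SBR_AND_PS` yields the "44100 Hz, stereo" reported by the native decoder.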

> I just (naively) thought that since the libfdk_aac encoder is superior by far to the native aac encoder, the same would apply to the decoder part of the library...

Well, it depends. For decoders, "quality" is less of an issue if you compare two different decoders that both support a certain spec (and decode accordingly) - the only things that matter then are performance and how well the decoder is integrated into a certain environment. fdk-aac did support a number of features that libavcodec's AAC decoder didn't support initially (like AAC-LD and AAC-ELD), but libavcodec's internal decoder has caught up with most of these features now. So unless you're trying to decode some uncommon standard version that fdk-aac supports better, you're probably better off using libavcodec's internal decoder.

Vangelis66 commented 9 years ago

> what I meant about crude hack is the way the patch was implemented (by just force-overriding a few branches in the code). Whether it detects it as mono or stereo is a different issue.

I'm extremely thankful, Martin, for your additional comments and explanation; I'll leave it up to you to close this issue, in case you come up with a "less crude hack" in the future (AFAIAC, your patch solved my problem...)! Kind regards, Vangelis

gmittal42 commented 9 years ago

Hi Martin, I faced the same issue and tried to solve it before finding this post. I found that the main issue seemed to be that an SBR parsing error was being ignored because fIsFillElement was set to 1 in a call to CAacDecoder_ExtPayloadParse. Since ffmpeg relies on the first few frames being decoded correctly in order to construct the stream info, would it be appropriate not to ignore the error even though, as the comment in the code suggests, decoding can proceed without error?