gpac / gpac

Unexpected behavior with audio-only HLS #2548

Closed lonebyte closed 1 year ago

lonebyte commented 1 year ago

Hi! I took an example from https://github.com/gpac/gpac/wiki/dash_transcoding and changed it slightly:

gpac -i a.mp4 \
     @ c=vp9:b=1m \
     @@ c=vp9:b=200k \
     @ @1 -o hls/master.m3u8:segdur=6

Everything works as expected, gpac creates two variants.

But if I try the same with an audio file:

gpac -i test.flac \
     @ c=aac:b=64k \
     @@ c=aac:b=96k \
     @ @1 -o hls/master.m3u8:segdur=6

Then gpac will create three variants instead of the two that I expect.
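
One way to check the variant count without reading the playlist by hand is to count the variant entries in the master playlist. This is only a sketch: the helper name `count_variants` is made up for illustration, and it assumes each variant is announced by one `#EXT-X-STREAM-INF` tag, as the HLS specification describes for variant streams.

```shell
# Count the variant streams listed in an HLS master playlist.
# Assumption: one #EXT-X-STREAM-INF tag per variant.
# count_variants is a hypothetical helper, not part of gpac.
count_variants() {
    # grep -c prints the number of matching lines (0 if none match)
    grep -c '^#EXT-X-STREAM-INF' "$1"
}
```

For the command above, `count_variants hls/master.m3u8` would be expected to print 2 rather than 3.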

With -r -graph it outputs:

[11:12:12.110Z] GPAC Session Status: mem      22028 kb CPU  0
test.flac (fin): test.flac:         25664844 /        25664844 (100.00)
ffenc:aac (ffenc): Audio (MPEG-4 AAC Audio)      14491 pck 1028.07 FPS - EOS
ffenc:aac (ffenc): Audio (MPEG-4 AAC Audio)      14491 pck 1093.01 FPS - EOS
manifest_m3u8 (fout): master_3.m3u8: done - wrote 2311 bytes
audio (rfflac): Audio (Flac Audio)       3623 pck 74302.71 FPS - EOS
ffdec:flac (ffdec): Audio (Raw media)       3623 pck 6922.22 FPS - EOS
dasher: P1 AS#1.1(A) done (57 segs) AS#1.2(A) done (57 segs) AS#1.3(A) done (57 segs)
audio (resample): Audio (Raw media)       3623 pck 6505.84 FPS - EOS
audio (resample): Audio (Raw media)       3623 pck 6218.37 FPS - EOS
ffenc:aac (ffenc): Audio (MPEG-4 AAC Audio)      14491 pck 1095.25 FPS - EOS
audio (fout): test_dash57_rep1.m4s: done - wrote 244 bytes
audio (fout): test_dash57_rep2.m4s: done - wrote 244 bytes
audio (fout): test_dash57_rep3.m4s: done - wrote 244 bytes
audio (mp4mx): mux segments 56 (frags 1) next 58.000 TK1(A): 20 99 %
audio (mp4mx): mux segments 56 (frags 1) next 58.000 TK1(A): 20 99 %
audio (mp4mx): mux segments 56 (frags 1) next 58.000 TK1(A): 20 99 %
Active filters: 16

Logs:
[11:11:25.775Z] [repeated 2] [Dasher] Changing HLS target duration from 6 to 7, either increase the segment duration or re-encode the content

Filters connected:
fin (src=test.flac) (idx=1)
-(PID test.flac) rfflac (dyn_idx=5)
--(PID audio) ffdec "ffdec:flac" (dyn_idx=6)
---(PID audio) resample (dyn_idx=8)
----(PID audio) ffenc "ffenc:aac" (c=aac:b=64k) (idx=2)
-----(PID audio) dasher (dyn_idx=7)
------(PID manifest_m3u8) fout (dst=hls/master.m3u8:segdur=6) (idx=4)
------(PID audio)         mp4mx (dyn_idx=14)
-------(PID audio) fout (dst=hls/test_dash_rep1.mp4:gfopt:segdur=6:frag:xps_inband=no:psshs=moov:mime=audio/mp4) (idx=11)
------(PID audio)         mp4mx (dyn_idx=15)
-------(PID audio) fout (dst=hls/test_dash_rep2.mp4:gfopt:segdur=6:noinit:frag:xps_inband=no:psshs=moov:mime=audio/mp4) (idx=12)
------(PID audio)         mp4mx (dyn_idx=16)
-------(PID audio) fout (dst=hls/test_dash_rep3.mp4:gfopt:segdur=6:noinit:frag:xps_inband=no:psshs=moov:mime=audio/mp4) (idx=13)
---(PID audio) resample (dyn_idx=9)
----(PID audio) ffenc "ffenc:aac" (c=aac:b=64k:c=aac:b=96k) (idx=10)
-----(PID audio) dasher (dyn_idx=7)
----(PID audio) ffenc "ffenc:aac" (c=aac:b=96k) (idx=3)
-----(PID audio) dasher (dyn_idx=7)

Note the weird line: (PID audio) ffenc "ffenc:aac" (c=aac:b=64k:c=aac:b=96k) (idx=10)
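
To spot such lines in a longer graph, the "Filters connected" dump can be saved to a file and searched for encoder entries whose argument list carries the codec option more than once. This is a rough sketch: `find_merged_encoders` and the file name `graph.txt` are hypothetical, and the pattern only assumes the line layout shown in the dump above.

```shell
# Print ffenc lines from a saved "Filters connected" dump whose
# argument list contains the codec option (c=) more than once.
# find_merged_encoders is a hypothetical helper, not part of gpac.
find_merged_encoders() {
    # first c= must follow "(" or ":", second c= must follow a later ":"
    grep -E 'ffenc .*\((.*:)?c=[^:)]*:(.*:)?c=' "$1"
}
```

Run against the dump above, `find_merged_encoders graph.txt` should print only the idx=10 line.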

When I try:

gpac -i test.flac \
     @ c=aac:b=64k:#Representation=64k \
     @@ c=aac:b=96k:#Representation=96k \
     @ @1 -o hls/master.m3u8:segdur=6

Only two variants are generated, but the log still looks odd, with a third encoder (idx=10) still present in the graph:

[11:13:39.415Z] GPAC Session Status: mem      26705 kb CPU  0
test.flac (fin): test.flac:         25664844 /        25664844 (100.00)
ffenc:aac (ffenc): Audio (MPEG-4 AAC Audio)      14491 pck 1019.76 FPS - EOS
ffenc:aac (ffenc): Audio (MPEG-4 AAC Audio)      14491 pck 1090.19 FPS - EOS
manifest_m3u8 (fout): master_2.m3u8: done - wrote 2311 bytes
audio (rfflac): Audio (Flac Audio)       3623 pck 70017.78 FPS - EOS
ffdec:flac (ffdec): Audio (Raw media)       3623 pck 7057.63 FPS - EOS
dasher: P1 AS#1.1(A) done (57 segs) AS#1.2(A) done (57 segs)
audio (resample): Audio (Raw media)       3623 pck 5754.57 FPS - EOS
audio (resample): Audio (Raw media)       3623 pck 5622.73 FPS - EOS
ffenc:aac (ffenc): Audio (MPEG-4 AAC Audio)      14491 pck 1086.80 FPS - EOS
audio (fout): test_dash57_rep1.m4s: done - wrote 428 bytes
audio (fout): test_dash57_rep2.m4s: done - wrote 244 bytes
audio (mp4mx): mux segments 56 (frags 1) next 58.000 TK1(A): 20 99 % TK2(A): 20 99 %
audio (mp4mx): mux segments 56 (frags 1) next 58.000 TK1(A): 20 99 %
Active filters: 14

Logs:
[11:13:54.247Z] [repeated 1] [Dasher] Changing HLS target duration from 6 to 7, either increase the segment duration or re-encode the content

Filters connected:
fin (src=test.flac) (idx=1)
-(PID test.flac) rfflac (dyn_idx=5)
--(PID audio) ffdec "ffdec:flac" (dyn_idx=6)
---(PID audio) resample (dyn_idx=8)
----(PID audio) ffenc "ffenc:aac" (c=aac:b=64k:#Representation=64k) (idx=2)
-----(PID audio) dasher (dyn_idx=7)
------(PID manifest_m3u8) fout (dst=hls/master.m3u8:segdur=6) (idx=4)
------(PID audio)         \
------(PID audio)        -> mp4mx (dyn_idx=13)
-------(PID audio) fout (dst=hls/test_dash_rep1.mp4:gfopt:segdur=6:frag:xps_inband=no:psshs=moov:mime=audio/mp4) (idx=11)
------(PID audio)         mp4mx (dyn_idx=14)
-------(PID audio) fout (dst=hls/test_dash_rep2.mp4:gfopt:segdur=6:noinit:frag:xps_inband=no:psshs=moov:mime=audio/mp4) (idx=12)
---(PID audio) resample (dyn_idx=9)
----(PID audio) ffenc "ffenc:aac" (c=aac:b=64k:#Representation=64k:c=aac:b=96k) (idx=10)
-----(PID audio) dasher (dyn_idx=7)
----(PID audio) ffenc "ffenc:aac" (c=aac:b=96k:#Representation=96k) (idx=3)
-----(PID audio) dasher (dyn_idx=7)

What am I doing wrong?

I compiled gpac and FFmpeg myself inside a Docker container running Alpine Linux. I tried both master with FFmpeg 6.0 and v2.2.1 with FFmpeg 5.1.3.

I uploaded the cut-down test file test.flac

jeanlf commented 1 year ago

Indeed (not related to HLS though), two resamplers are loaded by this chain, resulting in one extra cloning of the encoder. This is now fixed on master.
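
To check a newer build, one option is to count how many instances of a given filter appear in a saved "Filters connected" dump. This is a sketch only: `count_filter` and `graph.txt` are hypothetical names, and the pattern assumes the `(PID ...) name ` layout seen in the dumps above. In the graphs posted here, `ffenc` appears three times although only two encoders were requested.

```shell
# Count lines in a saved "Filters connected" dump where the given
# filter name directly follows a "(PID ...)" prefix.
# count_filter is a hypothetical helper, not part of gpac.
count_filter() {
    # $1 = dump file, $2 = filter name (e.g. ffenc or resample)
    grep -c "(PID [^)]*) $2 " "$1"
}
```

For example, `count_filter graph.txt ffenc` on the dumps above gives 3. Filters that are listed once per input, such as the dasher, are counted once per listing, so this is mainly useful for encoders and resamplers.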