Closed JocelynDelalande closed 1 year ago
This is mostly my first try at live transcoding with ffmpeg, so I went with the most compatible solution first. I did think about proxying the AAC from the m3u8, but I still think some transcoding is required, mostly to enable seeking. The AAC stream from Twitch has a variable bitrate, and seeking only works there because an m3u8 playlist allows seeking by time, while HTTP at the moment only offers seeking by byte count. If I streamed the variable bitrate AAC from Twitch directly, seeking would lose a lot of accuracy. It would also make the stream glitch horribly every time the connection drops: right now that already happens, but it only skips 1 second, whereas with a variable bitrate the skip could be much larger.
So you would still need to transcode to a fixed bitrate AAC stream if you care about seeking. I will look into it though and do some tests, and maybe add an option to choose which format to transcode to. Maybe I'll even run some benchmarks to see whether transcoding from variable to fixed bitrate AAC is quicker than MP3 transcoding.
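To make the seeking argument concrete, here is a small Python sketch. It models a client that converts a target time into a byte offset using the average bitrate, which is all it can do with a plain HTTP byte-range request. The bitrate profile (a quiet first half, a busy second half) and the function names are invented for illustration and are not taken from the project.

```python
from bisect import bisect_right
from itertools import accumulate


def build_byte_index(bitrates):
    """Cumulative byte count at the end of each second of audio."""
    return list(accumulate(b // 8 for b in bitrates))


def landed_time(bitrates, target_seconds):
    """Second of audio a client actually reaches when it seeks by byte offset.

    The client only knows the total size and total duration, so it assumes
    a constant average byte rate to pick the offset.
    """
    index = build_byte_index(bitrates)
    avg_bytes_per_second = index[-1] / len(bitrates)
    requested_byte = target_seconds * avg_bytes_per_second
    # Number of whole seconds of audio that end at or before that byte.
    return bisect_right(index, requested_byte)


# Hypothetical one-hour streams, one bitrate value (bits/s) per second.
cbr = [64_000] * 3600
vbr = [32_000] * 1800 + [160_000] * 1800

print("CBR: asked for 900 s, landed at", landed_time(cbr, 900), "s")
print("VBR: asked for 900 s, landed at", landed_time(vbr, 900), "s")
```

With a constant bitrate the byte estimate is exact; with the made-up variable profile the same seek lands far from the requested time, and the error grows with how uneven the stream is.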
Thanks for your detailed answer @madiele :)
For seeking in a streamed file: HTTP byte ranges are used… but the time<->byte count mapping is available from the m3u8 data, isn't it?
WDYT ?
This is an m3u8 file supplied by Twitch. Unfortunately it only has time data for each chunk, and even if I were able to make an accurate conversion, most clients will probably display a wrong duration if I supply the VBR AAC stream. I will do some testing though, and see if I'm right to be skeptical.
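For reference, this is roughly what "time data for each chunk" looks like in practice. The sketch below reads the `#EXTINF` durations of an HLS playlist and builds a start time for each segment. The sample playlist and segment names are made up and are not the actual Twitch file. Note that nothing here gives a byte offset into a concatenated stream, which is the missing piece for HTTP range seeking.

```python
def segment_start_times(playlist_text):
    """Return (start_time_seconds, uri) for each media segment in a playlist."""
    segments = []
    elapsed = 0.0
    pending_duration = None
    for raw_line in playlist_text.splitlines():
        line = raw_line.strip()
        if line.startswith("#EXTINF:"):
            # Format is "#EXTINF:<duration>,<optional title>".
            pending_duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif line and not line.startswith("#") and pending_duration is not None:
            # A non-comment line after an EXTINF tag is the segment URI.
            segments.append((elapsed, line))
            elapsed += pending_duration
            pending_duration = None
    return segments


# Invented example playlist, not the one supplied by Twitch.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXTINF:10.000,
0.ts
#EXTINF:10.000,
1.ts
#EXTINF:4.500,
2.ts
#EXT-X-ENDLIST
"""

for start, uri in segment_start_times(SAMPLE_PLAYLIST):
    print(f"{start:7.1f} s  {uri}")
```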
So, I did some tests. The bitrate is indeed variable. I tried downloading the clean AAC stream with ffmpeg, and even when VLC plays the downloaded file, the duration is estimated by averaging the bitrate. If I try to play it via streaming in a browser, the total duration is off by 200 hours on a 10 hour stream 😁, since Twitch does not put the total length in the AAC header. I think it's possible to inject it into the stream, but I have no idea how. Even if I manage to do it, the seek problem is still there, and since Twitch streams can be 10-15 hours long, the error margin on seek will be huge.
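A sketch of why players have to guess: a raw ADTS stream is a sequence of frames, each with a small header that gives only that frame's own length. An exact duration therefore means walking every frame, while a streaming client can only divide the size by an assumed bitrate. The Python below does both on synthetic data. `fake_adts_frame` is a hypothetical helper that exists only to build demo input, and the code assumes AAC LC with 1024 samples per frame and one raw data block per frame.

```python
ADTS_SAMPLE_RATES = [96000, 88200, 64000, 48000, 44100, 32000, 24000,
                     22050, 16000, 12000, 11025, 8000, 7350]
SAMPLES_PER_FRAME = 1024  # assumption: AAC LC, one raw data block per frame


def adts_duration_seconds(data):
    """Exact duration of an ADTS byte string, found by counting its frames."""
    pos = 0
    frames = 0
    sample_rate = None
    while pos + 7 <= len(data):
        header = data[pos:pos + 7]
        if header[0] != 0xFF or (header[1] & 0xF0) != 0xF0:
            raise ValueError(f"lost ADTS sync at byte {pos}")
        if sample_rate is None:
            sample_rate = ADTS_SAMPLE_RATES[(header[2] >> 2) & 0x0F]
        # 13-bit frame length (header included), spread over bytes 3 to 5.
        frame_length = ((header[3] & 0x03) << 11) | (header[4] << 3) | (header[5] >> 5)
        if frame_length < 7:
            raise ValueError(f"invalid frame length at byte {pos}")
        pos += frame_length
        frames += 1
    if sample_rate is None:
        return 0.0
    return frames * SAMPLES_PER_FRAME / sample_rate


def guessed_duration_seconds(size_bytes, assumed_bits_per_second):
    """Duration a client would guess from the size and an assumed bitrate."""
    return size_bytes * 8 / assumed_bits_per_second


def fake_adts_frame(payload_size, rate_index=4):
    """Synthetic frame (valid header, zeroed payload), for the demo only."""
    length = 7 + payload_size
    header = bytes([
        0xFF, 0xF1,                          # syncword, MPEG-4, no CRC
        (1 << 6) | (rate_index << 2),        # AAC LC, sample rate index
        (2 << 6) | ((length >> 11) & 0x03),  # stereo, top 2 bits of length
        (length >> 3) & 0xFF,                # middle 8 bits of length
        ((length & 0x07) << 5) | 0x1F,       # low 3 bits of length
        0xFC,
    ])
    return header + bytes(payload_size)


if __name__ == "__main__":
    # 300 small frames followed by 141 large ones: a crude VBR stream.
    stream = fake_adts_frame(50) * 300 + fake_adts_frame(400) * 141
    print("exact  :", adts_duration_seconds(stream), "s")
    print("guessed:", guessed_duration_seconds(len(stream), 128_000), "s")
```

The exact count needs the whole file, which is why a client that starts playing a live proxy of the stream has nothing better than the guess.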
you can test it using this stream that I exported, get the direct link to the file and try playing in various clients, results vary quite a lot and are quite bad everywhere, with seeking being almost unusable: https://www.mediafire.com/file/evfsawliq0y5q7t/out2.aac/file
the original is 8h12m09s long for reference
edit:
podcast addict: duration is 48 h instead of 8, seek is very inaccurate
podcast republic: refuses to play
antennapod: duration is 45 h instead of 8, seek is very inaccurate
vlc: duration varies between 6-12 hours instead of 8, seek is borderline usable
When I get around to adding the option to choose the encoding, I will make it an undocumented option, only for those who like suffering.
during research I also discovered that ffmpeg has a skip silent audio filter you can enable if you want, I'm definitely adding that as an option in the future!
These are the benchmarks for transcoding on a Raspberry Pi 3B+.
It does seem that AAC is faster overall, but the times are so good anyway that it hardly matters (if it's this good on a Pi, it's probably negligible in 95% of use cases). I will keep MP3 as the default encoding as it's the safest bet for compatibility, but will add an option to choose AAC encoding down the line.
duration 00:17:04
choose a transcoding setting that takes less than the duration of the vod
benchmarking VBR aac to 64k CBR mp3
real 2m46.382s
user 2m42.487s
sys 0m0.519s
---------------------
benchmarking VBR aac to 64k CBR aac
real 1m46.131s
user 1m42.682s
sys 0m0.706s
---------------------
benchmarking VBR aac to 128k CBR mp3
real 2m46.725s
user 2m42.216s
sys 0m0.977s
---------------------
benchmarking VBR aac to 128k CBR aac
real 2m17.238s
user 2m13.330s
sys 0m0.648s
---------------------
benchmarking VBR aac to 256k CBR mp3
real 2m41.647s
user 2m36.211s
sys 0m1.133s
---------------------
benchmarking VBR aac to 256k CBR aac
real 2m25.688s
user 2m20.114s
sys 0m1.184s
---------------------
benchmarking VBR aac to 320k CBR mp3
real 2m34.989s
user 2m31.495s
sys 0m0.908s
---------------------
benchmarking VBR aac to 320k CBR aac
real 2m52.319s
user 2m46.826s
sys 0m1.276s
---------------------
*Note: most Twitch streams are under 200k average bitrate, so encoding above that is kinda useless.
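To read the numbers above as a margin over real time, the short Python sketch below divides the VOD duration (00:17:04) by each `real` time. The timings are copied from the benchmark output; the dictionary labels and function names are only for illustration.

```python
import re

VOD_SECONDS = 17 * 60 + 4  # "duration 00:17:04" from the benchmark output

# "real" wall-clock times copied from the runs above.
REAL_TIMES = {
    "64k mp3": "2m46.382s",
    "64k aac": "1m46.131s",
    "128k mp3": "2m46.725s",
    "128k aac": "2m17.238s",
    "256k mp3": "2m41.647s",
    "256k aac": "2m25.688s",
    "320k mp3": "2m34.989s",
    "320k aac": "2m52.319s",
}


def parse_time(text):
    """Convert a `time`-style value such as '2m46.382s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def speed_factor(real_time):
    """How many times faster than real time the transcode ran."""
    return VOD_SECONDS / parse_time(real_time)


for label, real in REAL_TIMES.items():
    print(f"{label:>9}: {speed_factor(real):.2f}x real time")
```

Every run finishes more than five times faster than playback on the Pi, which fits the remark that the choice of encoder hardly matters for speed.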
Interesting, congrats :)
Do you have any client in mind that lacks AAC support?
(Just asking because a lot of podcasts seem to use AAC as the default in the wild, and its quality efficiency is better.)
Never mind. I checked, and AAC is not used that much for podcast files after all...
Working on m4a audio streams from Twitch, I stumbled upon this repackaging command that does not re-encode (so it is very light on resources and near-instantaneous on a 2 h audio sample):
ffmpeg -i in.aac -c:a copy -bsf:a aac_adtstoasc out.m4a
For instance, it produces an audio file that Firefox/Chrome can play (impossible with the bare .aac).
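If the repackaging step were called from a script, a thin Python wrapper could look like the sketch below. It only assembles the exact ffmpeg arguments quoted in this comment. The function names are hypothetical, and it assumes `ffmpeg` is available on the PATH.

```python
import subprocess


def remux_command(src, dst):
    """Argument list for the ADTS-to-MP4 repackaging quoted in this thread."""
    return ["ffmpeg", "-i", src, "-c:a", "copy", "-bsf:a", "aac_adtstoasc", dst]


def remux(src, dst):
    """Run the repackaging; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(remux_command(src, dst), check=True)


print(" ".join(remux_command("in.aac", "out.m4a")))
```

Passing the arguments as a list avoids shell quoting problems with unusual file names.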
Here is what mediainfo reports for the input vs. the output:
$ mediainfo in.aac
General
Complete name : aHR0cHM6Ly93d3cudHdpdGNoLnR2L3ZpZGVvcy8xMjAxMjI1Nzg5.aac
Format : ADTS
Format/Info : Audio Data Transport Stream
File size : 43.3 MiB
Overall bit rate mode : Variable
Audio
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Format version : Version 4
Codec ID : 2
Bit rate mode : Variable
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 44.1 kHz
Frame rate : 43.066 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 43.3 MiB (100%)
$ mediainfo out.m4a
General
Complete name : /tmp/audio.m4a
Format : MPEG-4
Format profile : Apple audio with iTunes info
Codec ID : M4A (M4A /isom/iso2)
File size : 42.4 MiB
Duration : 2 h 0 min
Overall bit rate mode : Constant
Overall bit rate : 49.4 kb/s
Writing application : Lavf58.45.100
Audio
ID : 1
Format : AAC LC
Format/Info : Advanced Audio Codec Low Complexity
Codec ID : mp4a-40-2
Duration : 2 h 0 min
Bit rate mode : Constant
Bit rate : 48.0 kb/s
Channel(s) : 2 channels
Channel layout : L R
Sampling rate : 44.1 kHz
Frame rate : 43.066 FPS (1024 SPF)
Compression mode : Lossy
Stream size : 41.2 MiB (97%)
Default : Yes
Alternate group : 1
To be honest, I do not fully understand this and am a bit suspicious about how we can go from VBR to CBR without re-encoding the bitstream, but I am sharing it in case it helps.
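A quick arithmetic check on the mediainfo figures, as a Python sketch. The reported bit rates match file size divided by duration, which is what a plain average would give. The size difference between the two audio streams is also close to what removing one 7-byte ADTS header per frame would save. The 7-byte header size (no CRC) is an assumption, the duration is taken as exactly 2 h although mediainfo rounds it, and none of this shows how mediainfo itself decides on the "Constant" label.

```python
MIB = 1024 * 1024
DURATION_SECONDS = 2 * 3600   # "2 h 0 min", rounded by mediainfo
FRAMES_PER_SECOND = 43.066    # "Frame rate" line in both reports
ADTS_HEADER_BYTES = 7         # assumption: headers without CRC


def average_kbps(size_mib, duration_seconds):
    """Average bit rate in kb/s implied by a size and a duration."""
    return size_mib * MIB * 8 / duration_seconds / 1000


def adts_header_overhead_mib(duration_seconds, frames_per_second):
    """Total size of the per-frame ADTS headers over the whole file."""
    frames = duration_seconds * frames_per_second
    return frames * ADTS_HEADER_BYTES / MIB


print("overall :", round(average_kbps(42.4, DURATION_SECONDS), 1), "kb/s")
print("audio   :", round(average_kbps(41.2, DURATION_SECONDS), 1), "kb/s")
print("headers :", round(adts_header_overhead_mib(DURATION_SECONDS, FRAMES_PER_SECOND), 2), "MiB")
```

The first two values line up with the 49.4 kb/s and 48.0 kb/s in the report. The header estimate is close to the 43.3 MiB minus 41.2 MiB gap between the two stream sizes, which is consistent with the audio payload being copied unchanged.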
Weird, yes. Thanks for the info, I will look into it as soon as I have some time to test it out.
Nope. I tried to stream it and seeking was wrong by 1 minute after 30 minutes, and VLC still labels the stream as VBR. This could be a case where mediainfo is just wrong in its analysis.
Yeah, that would seem more logical: mediainfo being wrong. Thanks for testing :)
Podcast clients seem to support the AAC¹ format even if they do not support M3U8 (except podcast addict).
Wouldn't proxying a "bare" AAC stream be enough (and probably lighter on resources than transcoding to MP3)?
I'm not really into audio stuff, so maybe this is a stupid question.