dank074 / Discord-video-stream

Experiment for making video streaming work for discord selfbots.

~1-2 second audio desync (video is ahead of audio) #31

Closed devoxin closed 1 year ago

devoxin commented 1 year ago

After a lot of tweaking, I have been able to produce usable audio and video from a single ffmpeg process by changing my ffmpeg arguments to the following:

if (includeAudio) {
  const audioStream = new AudioStream(mediaUdp);
  const opus = new prism.opus.Encoder({ channels: 2, rate: 48000, frameSize: 960 }); // assumes prism-media is imported as "prism"

  command.output(StreamOutput(videoOutput).url, { end: false })
    .size(`${streamOpts.width}x${streamOpts.height}`)
    .fpsOutput(streamOpts.fps)
    .videoBitrate(`${streamOpts.bitrateKbps}k`)
    .format('h264')
    .outputOptions([
      '-tune zerolatency',
      '-pix_fmt yuv420p',
      '-preset ultrafast',
      '-profile:v baseline',
      `-g ${streamOpts.fps}`,
      `-x264-params keyint=${streamOpts.fps}:min-keyint=${streamOpts.fps}`,
      '-bsf:v h264_metadata=aud=insert',
      '-map 0:0'
    ])
    .output(StreamOutput(opus).url, { end: false })
    .noVideo()
    .outputOptions('-map 0:1'.split(' '))
    .audioChannels(2)
    .audioFrequency(48000)
    .format('s16le')

  opus.pipe(audioStream, { end: false });
}
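
For reference, the encoder settings in the snippet above imply a fixed frame duration and a fixed amount of raw PCM per Opus frame. A minimal sketch of that arithmetic follows; the helper name `opusFrameInfo` is hypothetical and is not part of any library used in this thread.

```typescript
// Hypothetical helper for illustration only.
interface OpusFrameInfo {
  frameDurationMs: number; // how much audio one Opus frame covers
  pcmBytesPerFrame: number; // s16le bytes consumed per Opus frame
}

function opusFrameInfo(rate: number, channels: number, frameSize: number): OpusFrameInfo {
  const bytesPerSample = 2; // s16le = signed 16-bit little-endian samples
  return {
    frameDurationMs: (frameSize * 1000) / rate,
    pcmBytesPerFrame: frameSize * channels * bytesPerSample,
  };
}

// Settings from the snippet: 48 kHz, stereo, 960 samples per frame.
console.log(opusFrameInfo(48000, 2, 960));
```

With these settings each Opus frame covers 20 ms of audio, so the frame size alone should not account for a delay on the order of a second.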

This works well, except that when streaming via play-live followed by an m3u8 URL, the audio lags by about 1-1.5 seconds (I have only tested with m3u8 URLs so far). I wondered whether this was related to the keyframe options. Right now the fps is set to 24, the typical frame rate of a movie. The video plays before the audio, so if ~24 frames elapse before the audio starts, that is a 1 second delay, which could indicate that the keyframe settings need adjusting. I don't have significant experience with h264 and keyframes, hence this issue. Do you have any suggestions on how to reduce this desync?
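
The frame arithmetic behind that guess can be sketched as follows. The function name `delayToFrames` is hypothetical; it only converts an observed offset into a frame count at a given frame rate and says nothing about the cause.

```typescript
// Hypothetical helper: converts an observed A/V offset into video frames.
function delayToFrames(delayMs: number, fps: number): number {
  return Math.round((delayMs / 1000) * fps);
}

// At 24 fps, an offset of 1 to 1.5 seconds spans 24 to 36 frames.
console.log(delayToFrames(1000, 24), delayToFrames(1500, 24));
```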

Edit: adding '-vf tpad=start_duration=1100ms' to outputOptions gets it very close, possibly within 50-100ms, but I don't think it's perfect. It also seems somewhat hacky: running ffplay <m3u8 url> yields a delay-free stream with audio and video perfectly in sync. Discord's own screen share can sound pretty spot on, so I'm optimistic that the same can be achieved here.
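
One way to keep that workaround adjustable is to build the option list in a function and append the tpad filter only when a delay is requested. This is a sketch under assumptions: `buildVideoOutputOptions` and its `videoDelayMs` parameter are hypothetical names, the option strings are copied from the snippet at the top of the thread, and how the extra `-vf` interacts with any other video filters in the command is untested here.

```typescript
// Hypothetical helper: returns the video output options from the snippet above,
// optionally delaying the video with ffmpeg's tpad filter.
function buildVideoOutputOptions(fps: number, videoDelayMs = 0): string[] {
  const options = [
    '-tune zerolatency',
    '-pix_fmt yuv420p',
    '-preset ultrafast',
    '-profile:v baseline',
    `-g ${fps}`,
    `-x264-params keyint=${fps}:min-keyint=${fps}`,
    '-bsf:v h264_metadata=aud=insert',
    '-map 0:0',
  ];
  if (videoDelayMs > 0) {
    // Pads the start of the video so it lines up with the late audio.
    options.push(`-vf tpad=start_duration=${videoDelayMs}ms`);
  }
  return options;
}

// The 1100 ms value is the one reported in the edit above.
console.log(buildVideoOutputOptions(24, 1100));
```

The resulting array could be passed to `.outputOptions(...)` in place of the inline list, which makes it easy to tune or disable the delay per source.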

dank074 commented 1 year ago

Keyframes are used by the video decoder to assemble a full picture. If you space them out, Discord would, for example, show a black screen for a while until the first keyframe arrives, but audio would still play fine.
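
To put a number on "a while": the worst-case wait for a picture is roughly one keyframe interval. A minimal sketch, assuming a fixed keyframe interval as set by `-g` in the snippet above (the helper name `maxKeyframeWaitMs` is hypothetical):

```typescript
// Hypothetical helper: worst-case wait for the next keyframe, in milliseconds,
// assuming a keyframe is emitted exactly every `gopSize` frames.
function maxKeyframeWaitMs(gopSize: number, fps: number): number {
  return (gopSize * 1000) / fps;
}

console.log(maxKeyframeWaitMs(24, 24)); // 1000
console.log(maxKeyframeWaitMs(240, 24)); // 10000
```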

Not sure what's causing the audio/video desync; I'll have to test. I've often played m3u8 live streams, and in my experience if you get desynced audio you can simply stop the stream and start it again, which usually fixes it. That's not a real fix though, because ideally it would work every time.

Oh and btw, the original code uses a single ffmpeg process already without any tweaks

devoxin commented 1 year ago

> the original code uses a single ffmpeg process already without any tweaks

My bad! I suppose I misread the code as spawning two instances of ffmpeg (my excuse is that it was late and I was trying to make TypeScript work and figure out how fluent-ffmpeg works haha).

If it helps with debugging, some additional information:

dank074 commented 1 year ago

I just tested with two different m3u8 streams and only one of them has desync issues. Not sure what's different between them; they use the exact same audio and video codecs (AAC audio / H264 video). I also tested VP8 and it has the same desync issue as h264. I also played with the fps and it didn't change anything. This is weird as hell.

dank074 commented 1 year ago

Ok I think I might have found the problem

Broken stream: image

Functional stream: image

Notice that the EXT-X-DISCONTINUITY-SEQUENCE tag is missing from the functional stream.

Discussion about ffmpeg's issues handling that tag is still ongoing (the last message about it was 5 months ago), but I don't know if we can expect a fix anytime soon since the issue has been open for 7+ years now: https://trac.ffmpeg.org/ticket/5419

However, I could be mistaken and this could not be the issue at all.
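
A quick way to check a playlist for these tags without eyeballing it is to scan its text. The sketch below is an assumption-laden illustration: `findDiscontinuityTags` is a hypothetical helper, and the sample playlist is made up rather than taken from either stream in this thread.

```typescript
// Hypothetical helper: lists discontinuity-related tags found in an HLS playlist.
// Matches both #EXT-X-DISCONTINUITY and #EXT-X-DISCONTINUITY-SEQUENCE.
function findDiscontinuityTags(playlist: string): string[] {
  return playlist
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.startsWith('#EXT-X-DISCONTINUITY'));
}

// Made-up media playlist for illustration only.
const samplePlaylist = [
  '#EXTM3U',
  '#EXT-X-VERSION:3',
  '#EXT-X-TARGETDURATION:6',
  '#EXT-X-MEDIA-SEQUENCE:120',
  '#EXT-X-DISCONTINUITY-SEQUENCE:4',
  '#EXTINF:6.0,',
  'segment120.ts',
  '#EXT-X-DISCONTINUITY',
  '#EXTINF:6.0,',
  'segment121.ts',
].join('\n');

console.log(findDiscontinuityTags(samplePlaylist));
```

Note that a master playlist usually only points at variant playlists, so the media playlist it references is the one that would need to be fetched and scanned.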

You said that you had "a handful of mkv and mp4 local files" that were also broken. Can you share any details about these files?

devoxin commented 1 year ago

> However, I could be mistaken and this could not be the issue at all.

Just checked my source, and there doesn't appear to be an EXT-X-DISCONTINUITY tag in there. The link, for reference, is https://moviesphere-plex.amagi.tv/playlist.m3u8

> that were also broken. Can you share any details about these files?

I can share ffprobe readings but cannot share the files themselves, as they are full-length movies, unfortunately.

Movie 1, mkv

```
Input #0, matroska,webm, from 'movie 1.mkv':
  Metadata:
    encoder         : libebml v1.3.7 + libmatroska v1.5.0
    creation_time   : 2019-06-04T09:22:56.000000Z
  Duration: 01:28:17.27, start: 0.000000, bitrate: 1109 kb/s
  Stream #0:0: Video: h264 (High), yuv420p(progressive), 1280x534, SAR 1:1 DAR 640:267, 23.98 fps, 23.98 tbr, 1k tbn (default)
    Metadata:
      BPS-eng         : 1027772
      DURATION-eng    : 01:28:17.209000000
      NUMBER_OF_FRAMES-eng: 127006
      NUMBER_OF_BYTES-eng: 680541008
      _STATISTICS_WRITING_APP-eng: mkvmerge v33.1.0 ('Primrose') 64-bit
      _STATISTICS_WRITING_DATE_UTC-eng: 2019-06-04 09:22:56
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
  Stream #0:1(eng): Audio: aac (LC), 48000 Hz, stereo, fltp
    Metadata:
      BPS-eng         : 79808
      DURATION-eng    : 01:28:17.259000000
      NUMBER_OF_FRAMES-eng: 248309
      NUMBER_OF_BYTES-eng: 52845805
      _STATISTICS_WRITING_APP-eng: mkvmerge v33.1.0 ('Primrose') 64-bit
      _STATISTICS_WRITING_DATE_UTC-eng: 2019-06-04 09:22:56
      _STATISTICS_TAGS-eng: BPS DURATION NUMBER_OF_FRAMES NUMBER_OF_BYTES
  Stream #0:2: Video: png, rgba(pc), 500x354 [SAR 3780:3780 DAR 250:177], 90k tbr, 90k tbn (attached pic)
    Metadata:
      mimetype        : image/png
```

Movie 2, mkv

```
Input #0, matroska,webm, from 'movie 2.mkv':
  Metadata:
    encoder         : libebml v1.4.0 + libmatroska v1.6.1
  Duration: 01:50:53.47, start: 0.000000, bitrate: 7259 kb/s
  Stream #0:0: Video: h264 (High), yuv420p(tv, bt709/unknown/unknown, progressive), 1920x1080 [SAR 1:1 DAR 16:9], Closed Captions, 23.98 fps, 23.98 tbr, 1k tbn (default)
  Stream #0:1(eng): Audio: eac3, 48000 Hz, 5.1(side), fltp, 192 kb/s (default)
```

I had tested some short mp4 files as well but I can't remember which ones they were

dank074 commented 1 year ago

Ok I think I might have figured out the fix. If you remove the -re flag, it fixes the m3u8 stream you sent as well as the broken one I was testing yesterday. Not sure if this will fix your mkv files. For non-livestreams, this could make the program consume a lot more RAM, since ffmpeg will read the file as fast as it can and push it into our output streams.

Update: Never mind about the RAM. I just tested it with a local file and memory usage looks the same as before, hmm.
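
For anyone who wants to compare both behaviours, the flag can be made opt-in rather than hard-coded. A minimal sketch, assuming the input arguments are assembled as a plain array; `buildInputArgs` and `readAtNativeRate` are hypothetical names, not part of this project.

```typescript
// Hypothetical helper: assembles ffmpeg input arguments, with -re as an opt-in.
function buildInputArgs(input: string, readAtNativeRate: boolean): string[] {
  const args: string[] = [];
  if (readAtNativeRate) {
    args.push('-re'); // read the input at its native frame rate
  }
  args.push('-i', input); // input options must come before -i
  return args;
}

// Without -re, as suggested above for the affected m3u8 streams.
console.log(buildInputArgs('https://example.com/playlist.m3u8', false));
// With -re, the previous behaviour.
console.log(buildInputArgs('movie.mkv', true));
```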