SamuelScheit / puppeteer-stream

A Library for puppeteer to retrieve audio and video streams of webpages
MIT License

Issue with delay, trying to stream to a client in realtime #177

Open stoefln opened 2 days ago

stoefln commented 2 days ago

I am trying to build a close-to-real-time streaming implementation. The user should be able to control the browser remotely, so the streaming delay needs to be minimal. I tried several approaches; the most straightforward one is shown below, but for some reason the video does not start playing, even though a lot of data has already been streamed.

Screenshot 2024-11-17 at 09 30 04

Here is my (next.js) client code:

import React, { useRef } from 'react'

const VideoTestPage = () => {
  const videoRef = useRef(null)

  return (
    <div>
      <video
        ref={videoRef}
        src="/api/browsers/direct-video-test"
        controls
        autoPlay={true}
        style={{ width: '100%', height: 'auto' }}
        preload="none"
        muted // Ensure autoplay works
      ></video>
    </div>
  )
}

export default VideoTestPage

And here the API endpoint code:

import { getStream, launch } from 'puppeteer-stream'

let browsers = {}

const startStreamingBrowser = async (userId, size, url) => {
  const { width = 800, height = 600 } = size

  let stream = null
  try {
    let browser = browsers[userId]
    let page
    if (!browser) {
      console.log('start new browser')
      browser = await launch({
        //executablePath: utils.getExecutablePath(),
        executablePath: '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome',
        defaultViewport: {
          width: parseInt(width, 10),
          height: parseInt(height, 10)
        }
      })

      console.log('create new page')
      page = await browser.newPage()
      browser.page = page
      browsers[userId] = browser
    } else {
      page = browser.page
    }

    console.log('go to url: ', url)
    await page.goto(url)

    stream = await getStream(page, { audio: true, video: true, frameSize: 200 })

    stream.on('close', async () => {
      await browser.close()
      browsers[userId] = null
    })

    stream.on('error', async error => {
      console.error('Error occurred:', error)
      await browser.close()
      browsers[userId] = null
    })

    stream.on('end', async () => {
      console.log('Stream end -> closing browser instance')
      try {
        await browser.close()
      } catch (error) {
        console.log('Closing browser failed. Maybe browser was closed already manually...')
      }
      browsers[userId] = null
    })
  } catch (error) {
    console.error('Error occurred:', error)
  }

  return stream
}

const stopStreamingBrowser = async userId => {
  const browser = browsers[userId]
  if (browser) {
    await browser.close()
    browsers[userId] = null
  }
}

export default async function handler(req, res) {
  console.log('browser endpoint', req.method)
  const userId = 'test'
  const stream = await startStreamingBrowser(
    userId,
    { width: 800, height: 600 },
    'https://giphy.com/clips/studiosoriginals-happy-birthday-grandson-Ny5rE4B0R6i1vqOOaj'
  )

  // Stream headers
  res.setHeader('Content-Type', 'video/mp4'); 
  res.setHeader('Transfer-Encoding', 'chunked');
  res.setHeader('Connection', 'keep-alive');

  stream.pipe(res)
}

export const config = {
  api: {
    responseLimit: false,
  },
}

Any ideas how I could get this working?

stoefln commented 23 hours ago

Oh, I managed to get the streaming working by setting mimeType: 'video/webm;codecs=vp8' in the getStream options. But there is still a 4-second delay between the browser interaction (stream recording) and its appearance in the video. Any ideas?
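For readers following along, here is a minimal sketch of what that change might look like in the endpoint. The helper names `contentTypeFor` and `streamPageTo` are hypothetical and not part of puppeteer-stream; `getStream` is passed in as a parameter so the snippet stays self-contained. The sketch also assumes that the `Content-Type` header should match the recorded container (WebM rather than `video/mp4`).

```javascript
// Options as described in the comment above: record WebM with VP8 video.
const streamOptions = {
  audio: true,
  video: true,
  mimeType: 'video/webm;codecs=vp8',
};

// Hypothetical helper: derive the HTTP Content-Type from the recorder
// mimeType by dropping the ";codecs=..." parameter.
function contentTypeFor(mimeType) {
  return mimeType.split(';')[0].trim();
}

// Hypothetical helper: start the capture and pipe it to the HTTP response.
// `getStream` is expected to be the function exported by puppeteer-stream.
async function streamPageTo(res, page, getStream) {
  const stream = await getStream(page, streamOptions);
  res.setHeader('Content-Type', contentTypeFor(streamOptions.mimeType));
  stream.pipe(res);
  return stream;
}
```

In the handler from the original post, this would stand in for the `getStream` call and the hard-coded `video/mp4` header.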

stoefln commented 22 hours ago

This is what ffprobe tells me about the video format:

ffprobe version 7.1 Copyright (c) 2007-2024 the FFmpeg developers
  built with Apple clang version 16.0.0 (clang-1600.0.26.3)
  configuration: --prefix=/opt/homebrew/Cellar/ffmpeg/7.1 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags='-Wl,-ld_classic' --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libaribb24 --enable-libbluray --enable-libdav1d --enable-libharfbuzz --enable-libjxl --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox --enable-audiotoolbox --enable-neon
  libavutil      59. 39.100 / 59. 39.100
  libavcodec     61. 19.100 / 61. 19.100
  libavformat    61.  7.100 / 61.  7.100
  libavdevice    61.  3.100 / 61.  3.100
  libavfilter    10.  4.100 / 10.  4.100
  libswscale      8.  3.100 /  8.  3.100
  libswresample   5.  3.100 /  5.  3.100
  libpostproc    58.  3.100 / 58.  3.100
Input #0, matroska,webm, from '/Users/.../.next/server/pages/api/browsers/test-stream-output.webm':
  Metadata:
    encoder         : Chrome
  Duration: N/A, start: 0.000000, bitrate: N/A
  Stream #0:0(eng): Video: vp8, yuv420p(tv, bt709, progressive), 1728x1116, SAR 1:1 DAR 48:31, 1k tbr, 1k tbn (default)
      Metadata:
        alpha_mode      : 1
  Stream #0:1(eng): Audio: opus, 48000 Hz, stereo, fltp (default)

SamuelScheit commented 21 hours ago

I'm having a look at this issue and can replicate it. To improve the latency a bit, you can set the frameSize argument to 1, which should make the browser send video data as soon as it is available. However, there is an inherent delay with the MediaRecorder API and the WebM format. I'll have to rewrite the package to use WebRTC to minimize the latency.
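As a rough illustration of that suggestion, the sketch below merges `frameSize: 1` into an existing options object. `withLowLatency` is a hypothetical helper name, and the other option values are simply the ones used earlier in this thread.

```javascript
// Hypothetical helper: copy the given getStream options and set frameSize
// to 1, the value suggested above. The input object is not modified.
function withLowLatency(options) {
  return { ...options, frameSize: 1 };
}

const options = withLowLatency({
  audio: true,
  video: true,
  mimeType: 'video/webm;codecs=vp8',
});

// Intended use (not executed here):
//   const stream = await getStream(page, options)
```

This only reduces how much data is batched before it is sent; the delay inherent to MediaRecorder and WebM mentioned above remains.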

SamuelScheit commented 16 hours ago

@stoefln I've now implemented it using WebRTC, and you can now see it in near real time (80 ms delay).

https://github.com/user-attachments/assets/7d9a6817-4416-41e9-b291-72ade1e97968

Can you contact me to discuss further details?

SamuelScheit commented 10 hours ago

I've tried these approaches:

  1. using puppeteer.startScreencast()
    • no audio
    • bad latency with good (100% quality) png stream
    • good latency with bad (50% quality) jpeg stream
    • lagged behind when many moving objects (video, scrolling)
  2. Browser capture and transfer via WebRTC to Server with ffplay as display
    • Had 90ms delay because of ffplay + piping + converting images
  3. Browser capture and transfer via WebRTC to other Browser using PeerConnection
    • Almost no delay (~10ms)
    • Audio support
    • good quality video

I advise anyone who needs to display a browser capture directly in another browser to use approach 3.
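The thread does not show the code for approach 3, so the following is only a generic sketch of the two ends of a browser-to-browser WebRTC connection, not the implementation described above. `publishStream` and `attachRemoteStream` are hypothetical helper names. How the `MediaStream` is captured and how offers, answers, and ICE candidates are exchanged (signaling) are left out and would have to be supplied by the application.

```javascript
// Sender side (runs in the captured browser): publish every track of a
// MediaStream on an RTCPeerConnection.
function publishStream(pc, stream) {
  for (const track of stream.getTracks()) {
    pc.addTrack(track, stream);
  }
}

// Viewer side (runs in the controlling browser): show the first remote
// stream in a <video> element once tracks arrive.
function attachRemoteStream(pc, videoEl) {
  pc.addEventListener('track', (event) => {
    const [remoteStream] = event.streams;
    if (remoteStream && videoEl.srcObject !== remoteStream) {
      videoEl.srcObject = remoteStream;
    }
  });
}
```

In a browser, `pc` would be a `new RTCPeerConnection()` on each side and `videoEl` a muted, autoplaying `<video>` element. Both helpers only rely on `addTrack`, `getTracks`, and the `track` event, so they can be exercised with simple stand-in objects.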

stoefln commented 3 hours ago

Amazing work Samuel! I contacted you via Telegram...