go-rod / rod

A Chrome DevTools Protocol driver for web automation and scraping.
https://go-rod.github.io
MIT License

add ability to get videos from test runs #22

Open tmc opened 4 years ago

tmc commented 4 years ago

It'd be nice to also stream these to the monitor page. Perhaps as webp streams?

ysmood commented 4 years ago

Yes, I agree. I tried it locally, and it's not easy. Most solutions have to use FFmpeg (Chrome doesn't support casting well yet; it can only stream PNG frames), so I need to make sure it's worth adding such a heavy extra dependency.

ysmood commented 4 years ago

It will be great if someone who is familiar with streaming can help me.

tmc commented 4 years ago

I did some research and it looks like this would add a dependency on libvpx, which would make it not pure Go.

ysmood commented 4 years ago

I usually use Rod with a local Chrome first, then use the monitor to debug when switching the same code to Docker. So I don't have a strong motivation for this myself.

But I still think it will be great to have casting supported.

FYI, this is the API: https://chromedevtools.github.io/devtools-protocol/tot/Page/#event-screencastFrame

As you can see, the cast API is still experimental. The best outcome would be for the Chrome team themselves to support streaming WebP directly so that we don't have to do the heavy lifting, since Chrome already has codec libraries built in for web videos.

Well, there's already a ticket for it: https://bugs.chromium.org/p/chromium/issues/detail?id=781117

FFmpeg could be used to stitch the captured PNG frames into a video.
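A typical FFmpeg invocation for that might look like the following; the file names, numbering pattern, and frame rate are illustrative, not anything produced by Rod itself:

```shell
# Combine numbered PNG frames into an MP4; adjust -framerate to match capture speed.
# frame-%04d.png is an assumed naming scheme (frame-0001.png, frame-0002.png, ...).
ffmpeg -framerate 25 -i frame-%04d.png -c:v libx264 -pix_fmt yuv420p out.mp4
```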

muhibbudins commented 2 years ago

Hi @ysmood

Thanks for this great project. I've come up with a solution to this issue, using the approach you mentioned earlier: Chromium's own built-in (and still experimental) Page.screencastFrame API.

And for everyone else figuring out how to record a video of a running session, you can follow what I have created below.

Create a listener for the screencast frame event

// Save each screencast frame to disk as it arrives.
frameCount := 0
go page.EachEvent(func(e *proto.PageScreencastFrame) {
    temporaryFilePath := videoDirectory + pageId + "-" + strconv.Itoa(frameCount) + "-frame.jpeg"

    _ = utils.OutputFile(temporaryFilePath, e.Data)

    proto.PageScreencastFrameAck{
        SessionID: e.SessionID,
    }.Call(page)

    frameCount++
})()

Trigger page screencast frame

quality := 100
everyNthFrame := 1

proto.PageStartScreencast{
    Format:        "jpeg",
    Quality:       &quality,
    EveryNthFrame: &everyNthFrame,
}.Call(page)

Then stop the screencast when the page is closed

proto.PageStopScreencast{}.Call(page)
page.MustClose()

Finally, combine each frame into a viewable video

func HandleRenderVideo(name string, pageId string) (string, string) {
  red := color.New(color.FgRed).SprintFunc()

  slugName := slug.Make(name)
  videoName := slugName + "-" + pageId + ".avi"
  videoPath := videoDirectory + videoName

  go func() {
    renderer, err := mjpeg.New(videoPath, int32(1440), int32(900), 1)
    if err != nil {
      log.Printf(red("[ Engine ] %v\n"), err)
      return
    }

    matches, err := filepath.Glob(videoDirectory + pageId + "-*-frame.jpeg")
    if err != nil {
      log.Printf(red("[ Engine ] %v\n"), err)
      return
    }

    // Sort frames by numeric index: the frame counter is not zero-padded,
    // so a plain sort.Strings would order "-10-" before "-2-".
    frameIndex := func(path string) int {
      parts := strings.Split(filepath.Base(path), "-")
      n, _ := strconv.Atoi(parts[len(parts)-2])
      return n
    }
    sort.Slice(matches, func(i, j int) bool {
      return frameIndex(matches[i]) < frameIndex(matches[j])
    })

    for _, framePath := range matches {
      data, err := os.ReadFile(framePath) // os.ReadFile replaces the deprecated ioutil.ReadFile
      if err != nil {
        log.Printf(red("[ Engine ] %v\n"), err)
        continue
      }

      if err := renderer.AddFrame(data); err != nil {
        log.Printf(red("[ Engine ] %v\n"), err)
      }
    }

    if err := renderer.Close(); err != nil {
      log.Printf(red("[ Engine ] %v\n"), err)
    }

    // Clean up the temporary frame files once the video is written.
    for _, framePath := range matches {
      if err := os.Remove(framePath); err != nil {
        log.Printf(red("[ Engine ] %v\n"), err)
      }
    }
  }()

  return videoName, videoPath
}

You can see the full source code of my project for reference

Thank you

ysmood commented 2 years ago

@muhibbudins It surprised me that it doesn't require libs like FFmpeg, well done!

I wonder if you could spend some time adding this feature to Rod. I checked the MJPEG standard, and it seems like Chrome supports streaming playback of it, so maybe we don't have to create temp files at all; we can stream directly to a webpage. An example project is here: https://github.com/nsmith5/mjpeg Then we can replace

https://github.com/go-rod/rod/blob/510858c3d9a128798d663fd4f24ba6e806f282c8/dev_helpers.go#L47-L49

with MJPEG

muhibbudins commented 2 years ago

@ysmood Ah, so it can be simpler. I'll try to learn the core of Rod first. Hopefully I can make a PR for this feature.

muhibbudins commented 2 years ago

@ysmood I already made PR #614 for this, but sorry in advance, I didn't understand how ServeMonitor works. So I made this feature work like the Screenshot function.

ysmood commented 2 years ago

@muhibbudins check this example https://github.com/go-rod/rod/blob/master/lib/examples/launch-managed/main.go

muhibbudins commented 2 years ago

I've seen the code. What I mean is, I haven't caught on to why we have to move the video creation process into that ServeMonitor, whereas we could directly save it to a file.

And is there any reason why we need to live-stream to a web page? And how do we see the results of the stream?

ysmood commented 2 years ago

When we launch a browser on a remote machine (such as a remote Docker cluster), we usually need a way to debug it; being able to see the page live is what ServeMonitor aims to achieve.

For example, if you find that a remote scraper gets stuck on a page, you can use ServeMonitor to watch the page and discover that it isn't rendering as expected, which is the root cause of the hang.

ysmood commented 2 years ago

We need to use the video tag in this page to play the MJPEG stream of the page.

muhibbudins commented 2 years ago

I see, so we just need to add one more function, besides the one I've created, to monitor the streaming process, right?

And we can replace the existing page monitor function, which previously used setTimeout, with a video stream to monitor page rendering.

ysmood commented 2 years ago

Yes, it should be very easy, and you don't even need dependencies like github.com/icza/mjpeg, since the MJPEG protocol is dead simple.

muhibbudins commented 2 years ago

Hmmm, but what if the end user wants to save the stream to an output file? That's the main reason I made this function, so I can share the headless browser session with the user.

Do they need to watch the stream URL from the ServeMonitor function?

ysmood commented 2 years ago

Create a file with the extension .mjpeg; the binary format is simple, just a concatenation of JPEG files. You don't have to use github.com/icza/mjpeg; you can just use a browser to play the .mjpeg file.

FYI: https://stackoverflow.com/a/1931119/1089063

muhibbudins commented 2 years ago

@ysmood Sorry, I'm having an issue where not all frames are sent when I run it in the ServeMonitor function, which causes the motion JPEG not to play, even though I added a delay when switching pages.

func TestMonitor(t *testing.T) {
  g := setup(t)

  b := rod.New().MustConnect()

  b.Context(g.Context()).ServeMonitor("127.0.0.1:3333")

  page := b.MustPage(g.blank()).MustWaitLoad()

  time.Sleep(5 * time.Second)

  page.Navigate("https://github.com")

  time.Sleep(5 * time.Second)

  page.Navigate("https://google.com")
}

And I have tried several ways to decode the Data value of the Page.screencastFrame event, but no image is sent to the client.

mux.HandleFunc("/screencast/", func(w http.ResponseWriter, r *http.Request) {
  id := r.URL.Path[strings.LastIndex(r.URL.Path, "/")+1:]
  target := proto.TargetTargetID(id)
  p := b.MustPageFromTargetID(target)

  JPEGQuality := 90
  everyNthFrame := 1 // capture every frame; note this is not a frames-per-second value

  proto.PageStartScreencast{
    Format:        "jpeg",
    Quality:       &JPEGQuality,
    EveryNthFrame: &everyNthFrame,
  }.Call(p)

  flusher, ok := w.(http.Flusher)
  if !ok {
    http.NotFound(w, r)
    return
  }

  w.Header().Add("Content-Type", "multipart/x-mixed-replace; boundary=frame")

  for msg := range p.Event() {
    frame := proto.PageScreencastFrame{}
    if !msg.Load(&frame) {
      continue
    }

    // Write one multipart part per frame; frame.Data is the decoded JPEG bytes.
    w.Write([]byte("\r\n--frame\r\n"))
    w.Write([]byte("Content-Type: image/jpeg\r\nContent-Length: " + strconv.Itoa(len(frame.Data)) + "\r\n\r\n"))
    w.Write(frame.Data)
    flusher.Flush()

    // Acknowledge each frame, otherwise Chromium stops sending new ones.
    proto.PageScreencastFrameAck{
      SessionID: frame.SessionID,
    }.Call(p)
  }
})

I've also tried to follow this tutorial, but it turned out to be quite time-consuming to implement toward the original goal; it looks easy, but it isn't, at least for me at the moment.

Thanks

ysmood commented 2 years ago

> not all frames are sent

Do you mean the screencastFrame doesn't trigger?

ysmood commented 2 years ago

A related project: https://github.com/navicstein/rod-stream

cc @navicstein

supaplextor commented 1 year ago

I forked an Android adb Go package, and I think you could take adjacent tools into account, like VM videos from VirtualBox, or Android via adb. I'll scrounge up some notes sooner or later.

Also, Apple has an open source streaming media server. I haven't dabbled with it in eons, so I'm not sure if it would apply here.

Also, my fork solved a few glitches, but it's still alpha. I'm still wrapping my head around using events like "click on the last screenshot, then do blah blah blah", but that's out of scope here. The relevant part is that adb can record minute-long videos to internal storage, and then you can fetch the video.