go-rod / rod

A Chrome DevTools Protocol driver for web automation and scraping.
https://go-rod.github.io
MIT License

Resources content is nil #1040

Closed: and0x00 closed this issue 2 months ago

and0x00 commented 2 months ago

Rod Version: v0.114.8

Hello again. I would like some insight, or example code, on how to get the full list of resources (JS and CSS) that the browser loaded for a page, along with their content. The idea is to visit the page only once and extract every resource the browser has already loaded.

I have something similar to this:

    var e proto.NetworkResponseReceived
    wait := page.WaitEvent(&e)

    xhrURLs := make([]string, 0)
    go page.EachEvent(func(e *proto.NetworkRequestWillBeSent) {
        if e.DocumentURL != "" {
            if e.Type == proto.NetworkResourceTypeScript {
                bin, _ := page.GetResource(e.Request.URL)
                fmt.Printf("%v\n", bin)
            }
        }
    })()

    page.MustNavigate(URL).MustWaitLoad().MustWaitDOMStable().MustWaitStable()
    wait()

Complete code: https://gist.github.com/and0x00/36f2b6ec9e730fa8a3118cf61db1a3c1

Output

[]
[]
[]
[]
[]
[]
[]

I know this is also possible with proto.PageGetResourceTree, but I would prefer a working example that stays as close as possible to my current code.

github-actions[bot] commented 2 months ago

Please fix the format of your markdown:

23 MD031/blanks-around-fences Fenced code blocks should be surrounded by blank lines [Context: "```"]
28 MD040/fenced-code-language Fenced code blocks should have a language specified [Context: "```"]

generated by check-issue

and0x00 commented 2 months ago

I solved this
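
The thread does not say what the fix was. One plausible cause of the empty output is that page.GetResource runs inside the NetworkRequestWillBeSent handler, before the response has arrived, and the returned error is discarded. The sketch below is not the author's solution; it only illustrates an ordering that avoids the problem: note the request when Network.responseReceived fires, and read the body only when Network.loadingFinished fires. To stay runnable without a browser, it uses a hypothetical fetchBody function in place of the real call (in rod, that would be something like proto.NetworkGetResponseBody{RequestID: id}.Call(page)). The names collector, onResponse, onFinished, and fakeBrowser are also illustrative assumptions.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// fetchBody is a hypothetical stand-in for the CDP method
// Network.getResponseBody. It returns the body text and a flag telling
// whether that text is base64-encoded, as the protocol does.
type fetchBody func(requestID string) (body string, isBase64 bool, err error)

// collector tracks script and stylesheet requests and reads each body
// only after the browser has reported that loading finished.
type collector struct {
	fetch   fetchBody
	pending map[string]string // request ID -> URL, awaiting loadingFinished
	Bodies  map[string][]byte // URL -> decoded content
}

func newCollector(fetch fetchBody) *collector {
	return &collector{
		fetch:   fetch,
		pending: map[string]string{},
		Bodies:  map[string][]byte{},
	}
}

// onResponse is meant to be called from a Network.responseReceived handler.
// resourceType uses the CDP names, e.g. "Script" or "Stylesheet".
func (c *collector) onResponse(requestID, resourceType, url string) {
	if resourceType == "Script" || resourceType == "Stylesheet" {
		c.pending[requestID] = url
	}
}

// onFinished is meant to be called from a Network.loadingFinished handler.
// Only at this point is the body guaranteed to be complete.
func (c *collector) onFinished(requestID string) error {
	url, ok := c.pending[requestID]
	if !ok {
		return nil // not a resource we are tracking
	}
	delete(c.pending, requestID)

	body, isBase64, err := c.fetch(requestID)
	if err != nil {
		return fmt.Errorf("fetch body for %s: %w", url, err)
	}
	if !isBase64 {
		c.Bodies[url] = []byte(body)
		return nil
	}
	raw, err := base64.StdEncoding.DecodeString(body)
	if err != nil {
		return fmt.Errorf("decode body for %s: %w", url, err)
	}
	c.Bodies[url] = raw
	return nil
}

// fakeBrowser simulates the browser side so the sketch runs on its own.
func fakeBrowser(requestID string) (string, bool, error) {
	switch requestID {
	case "1":
		return "console.log(1)", false, nil
	case "2":
		return base64.StdEncoding.EncodeToString([]byte("body{}")), true, nil
	}
	return "", false, fmt.Errorf("no body for request %s", requestID)
}

func main() {
	c := newCollector(fakeBrowser)

	// Events arrive in this order for each request.
	c.onResponse("1", "Script", "https://example.com/app.js")
	c.onResponse("2", "Stylesheet", "https://example.com/site.css")
	c.onResponse("3", "Image", "https://example.com/logo.png") // ignored

	for _, id := range []string{"1", "2", "3"} {
		if err := c.onFinished(id); err != nil {
			fmt.Println("error:", err)
		}
	}
	for url, body := range c.Bodies {
		fmt.Printf("%s: %d bytes\n", url, len(body))
	}
}
```

In a real program, onResponse and onFinished would presumably be called from page.EachEvent callbacks for *proto.NetworkResponseReceived and *proto.NetworkLoadingFinished, registered before the call to MustNavigate. Checking the error from the body fetch, rather than discarding it, also makes failures like the empty output above easier to diagnose.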