go-rod / rod

A Chrome DevTools Protocol driver for web automation and scraping.
https://go-rod.github.io
MIT License
5.19k stars 341 forks source link

How to close page or browser properly? #1095

Open hafiihzafarhana opened 1 month ago

hafiihzafarhana commented 1 month ago

Rod Version: v0.116.1

I have a problem about how to close page or browser, i think.

So this is the scenario: 1) I run my program code for the first time (For the first time it was a success) 2) I tried a second time with other url (My program doesn't error, but it stuck)

Am I wrong about how to close the page and browser after the first running program is run?

This is my code: service.go

func (server *Server) surveyScrapper(ctx *gin.Context) {
    var req surveyScrapperRequest
    if err := ctx.ShouldBindJSON(&req); err != nil {
        ctx.JSON(http.StatusBadRequest, errorResponse(err))
        return
    }

    title, desc, totalPages, totalQuestion, _ := util.CountPagesWithQuestion(req.Url)
    rsp := surveyScrapperResponse{
        Title: title,
        Description: desc,
        TotalPages: int16(totalPages),
        TotalQuestion: int16(totalQuestion),
    }

    ctx.JSON(200, rsp)
}

util.go

func CountPagesWithQuestion(urls string) (string, string, int, int, error) {
    fmt.Println("Start") // for the second running, my code stuck in there

    // this is how i setup
    url := launcher.New().Headless(true).Bin("/usr/bin/chromium-browser").Devtools(false).NoSandbox(true)
    defer url.Cleanup()
    url2 := url.MustLaunch()
    browser := rod.New().ControlURL(url2).NoDefaultDevice().Trace(true).SlowMotion(1 * time.Second).MustConnect()

    // close the browser
    defer browser.MustClose()

     // load my browser    
    page := browser.MustPage(urls).MustWaitLoad()

     // check a current url, if the current url not proper it will return error
    currentURL := page.MustInfo().URL
    if(currentURL != urls) {
        return "", "", 0, 0, fmt.Errorf("failed to get text content of element")
    }

     // load my page
    timeout := 5 * time.Second
    page.MustWaitLoad()
    currentURL = page.MustInfo().URL

     // find element to get the value and set with proper value, if none it will set to " "
    elem1, _ := page.Timeout(timeout).Element(".F9yp7e.ikZYwf.LgNcQe")
    var title string
    if elem1 == nil {
        title = ""
    } else {
        var err error
        title, err = elem1.Text()
        if err != nil {
            title = ""
        }
    }

     // this is just my code, but still not use
    countPages := 1
    totalQuestions := 0

    // return 
   return title, desc, countPages, totalQuestions, nil
}
hafiihzafarhana commented 1 month ago

I think there was an error in closing the browser. So if I change it to another URL, it causes a stuck

I use thunder client, WSL

I'm processing data on Google Form to get title, description, number of pages, and number of questions.

But for now I am focusing on this stuck problem

github-actions[bot] commented 1 month ago

Please fix the format of your markdown:

6 MD032/blanks-around-lists Lists should be surrounded by blank lines [Context: "1) I run my program code for t..."]
13 MD031/blanks-around-fences Fenced code blocks should be surrounded by blank lines [Context: "```go"]
34 MD031/blanks-around-fences Fenced code blocks should be surrounded by blank lines [Context: "```go"]

generated by check-issue

hafiihzafarhana commented 1 month ago

Update: I tried using Windows, there was no stuck problem. However, when using Linux a stuck occurs

hafiihzafarhana commented 1 month ago

This is my Dockerfile

# Build stage
FROM golang:1.21.11-alpine3.19 AS builder
WORKDIR /app
COPY . .
RUN go build -o main main.go
RUN apk add curl

# Run stage
FROM alpine:3.14
RUN apk add --no-cache chromium
WORKDIR /app
COPY --from=builder /app/main .
COPY --from=builder /app/migrate.linux-amd64 ./migrate
COPY app.env .
COPY start.sh .
COPY wait-for.sh .
COPY db/migration ./migration

EXPOSE 8080
CMD [ "/app/main" ]
ENTRYPOINT [ "/app/start.sh" ]
ysmood commented 1 month ago

Rod is not tested on Alpine. The CI of rod uses ubuntu. Better to use rod=trace env var to log the details to see where it gets stuck.

hafiihzafarhana commented 1 month ago

But I have no error using Alpine. So, should we using ubuntu or debian?

This is the condition if using alpine:

errorris

Idk, what is the problem.

And sometimes I use the same scenarios and steps. it works properly, sometimes it also gets stuck. I have no idea for my issue. Does this happen because my code is not perfect in closing or starting the process?

ysmood commented 1 month ago

Doesn't look like a rod issue.