go-rod / rod

A Chrome DevTools Protocol driver for web automation and scraping.
https://go-rod.github.io
MIT License

Trying to get elements from a date input in a google form, but the element cannot be retrieved #1089

Closed hafiihzafarhana closed 4 days ago

hafiihzafarhana commented 1 week ago

Hi, I have a problem retrieving an HTML element with the ElementByJS method.

These are my goals:

1. Get the title of the Google Form (done)
2. Get the description of the Google Form (done)
3. Get the number of pages (✖️)
4. Get the number of questions on each page and sum them (✖️)

Goals 3 and 4 can already be solved as long as the questions on each page are not mandatory. Example:

Mandatory question image

Not Mandatory question image

So, what is my solution? (Algorithm)

1. Check which questions on a given page are mandatory.
2. Check the question type (paragraph, short answer, etc.) and fill in a value automatically for each mandatory field; skip questions that are not mandatory.
3. Click the "Next" button.
4. Repeat until the last page.

Before getting to my problem: I have already finished autofilling for a few other question types.

My problem:

My code:

package util

import (
    "fmt"
    "time"

    "github.com/go-rod/rod"
    "github.com/go-rod/rod/lib/proto"
)

func CountPagesWithQuestion(url string) (int, int, error) {
    // launch browser
    browser := rod.New().MustConnect()

    defer browser.MustClose()

    // open the first page
    page := browser.MustPage(url).MustWaitLoad()

    // this is the number of pages and questions each page (init)
    countPages := 1
    totalQuestions := 0

    for {
        fmt.Printf("page %d\n", countPages)

        // wait page loaded
        page.MustWaitLoad()
        time.Sleep(2 * time.Second) 

        // sum the number of questions each page
        questions, err := page.Elements(".Qr7Oae[role=listitem] .geS5n")
        if err != nil {
            return 0, 0, fmt.Errorf("failed to find question: %v", err)
        }
        numQuestions := len(questions)
        fmt.Printf("Sum of this page: %d\n", numQuestions)
        totalQuestions += numQuestions

        // ====================================================================
        // Loop questions each page
        for _, question := range questions {
            // check the question mandatory to fill or not
            isRequired := question.MustHas(".vnumgf")
            if !isRequired {
                fmt.Println("This question is not required; skipping it.")
                continue // if the question not mandatory, just continue
            }

            // ===========================MY PROBLEM===============================
            js := `document.querySelector('input[type="text"][aria-label="Hari"][maxlength="2"][role="combobox"]')`
            inputElement, err := question.ElementByJS(rod.Eval(js))
            if err != nil {
                fmt.Println("Error element:", err)
                continue
            }

            fmt.Println(inputElement)
            // ===========================MY PROBLEM===============================

            // ====================================================================
            // THIS IS TYPE OF QUESTION (CLEAR)
            if question.MustHas("input[type='text']") {
                input := question.MustElement("input[type='text']")
                fmt.Println("Fill text")
                input.MustInput("1000") 

            } else if question.MustHas("textarea[required]") {
                textarea := question.MustElement("textarea[required]")
                fmt.Println("Fill textarea")
                textarea.MustInput("Automatic answer in the textarea")
            } else if question.MustHas("[role='radio']") {
                radioOption := question.MustElement("[role='radio']")
                fmt.Println("Choose the first radio option")
                radioOption.MustClick()
            } else if question.MustHas("[role='checkbox']") {
                checkboxOption := question.MustElement("[role='checkbox']")
                fmt.Println("Choose the first checkbox")
                checkboxOption.MustClick()
            } else if question.MustHas("[role='listbox']") {
                dropdown := question.MustElement("[role='listbox']")
                fmt.Println("Open dropdown")
                dropdown.MustScrollIntoView()
                dropdown.MustClick()

                page.MustWaitLoad()
                time.Sleep(1 * time.Second)

                option := page.MustElement("[role='option'][aria-selected='false']")
                fmt.Println("Choose the first dropdown")
                option.MustScrollIntoView()
                option.MustClick()
            } else if question.MustHas("[jscontroller='OZjhxc']") {
                fmt.Println("Fill time")
                hourInput := question.MustElement("input[aria-label='Jam']")
                minuteInput := question.MustElement("input[aria-label='Menit']")

                hourInput.MustInput("09")
                minuteInput.MustInput("30")
            } else {
                fmt.Println("Element not found")
            }
            // THIS IS TYPE OF QUESTION (CLEAR)
            // ====================================================================

            time.Sleep(1000 * time.Millisecond)
        }

        // ====================================================================

        // Find next button
        nextButton, err := page.Timeout(5 * time.Second).ElementR("div[role=button]", "Berikutnya")
        if err != nil {
            fmt.Println("Next button not found, end scraping.")
            break
        }

        // If the next button is not found just break
        if nextButton == nil {
            fmt.Println("Next button not found")
            break
        }

        // Click the next button
        err = nextButton.Click(proto.InputMouseButtonLeft, 1)
        if err != nil {
            return 0, 0, fmt.Errorf("failed to click next button: %w", err)
        }

        // Add count number of pages
        countPages++

        // Wait until next page loaded
        page.MustWaitLoad()
        time.Sleep(2 * time.Second) // Giving time to ensure the page is loaded well
    }

    return countPages, totalQuestions, nil
}
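
A side note on the type-detection chain in the code above: the date input shown in this issue is itself an `input` with `type="text"`, so the first branch (`input[type='text']`) would probably claim a date question before any date-specific handling could run. Below is a minimal sketch of an ordered rule table that checks specific selectors first. The `rule` type, the `classify` function, the kind names, and the idea of detecting a date question through `aria-label="Hari"` are all assumptions made for illustration; the remaining selectors are copied from the code above. The `has` callback stands in for something like rod's `question.Has`, so the sketch runs without a browser.

```go
package main

import "fmt"

// rule pairs a CSS selector with the question kind it identifies.
// Both the type and the kind names are hypothetical.
type rule struct {
	selector string
	kind     string
}

// rules is checked top to bottom, so specific selectors must come
// before generic ones. The "date" rule is an assumption based on the
// aria-label="Hari" input from this issue; placing "time" before
// "short-text" is likewise an assumption.
var rules = []rule{
	{`input[aria-label="Hari"]`, "date"},
	{`[jscontroller='OZjhxc']`, "time"},
	{`input[type='text']`, "short-text"},
	{`textarea[required]`, "paragraph"},
	{`[role='radio']`, "radio"},
	{`[role='checkbox']`, "checkbox"},
	{`[role='listbox']`, "dropdown"},
}

// classify returns the kind of the first rule whose selector matches.
// has reports whether the question contains an element for selector.
func classify(has func(selector string) bool) string {
	for _, r := range rules {
		if has(r.selector) {
			return r.kind
		}
	}
	return "unknown"
}

func main() {
	// A fake date question: it matches the date selector and also the
	// generic text selector, like the HTML in this issue.
	dateQuestion := map[string]bool{
		`input[aria-label="Hari"]`: true,
		`input[type='text']`:       true,
	}
	kind := classify(func(sel string) bool { return dateQuestion[sel] })
	fmt.Println(kind) // prints "date"
}
```

With the generic text rule listed first, the same fake question would be classified as "short-text", which matches the behaviour the if/else chain is likely to show for date fields.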

Structure of HTML:

<div class="a7KROc">
          <div class="vEXS5c">
            <div class="UaWVmb">HH</div>
            <div
              class="rFrNMe genAeb yqQS1 toT2u ESbQy zKHdkd"
              jscontroller="pxq3x"
              jsaction="clickonly:KjsqPd; focus:Jt1EX; blur:fpfTEe; input:Lg5SV"
              jsshadow=""
              jsname="cASped"
            >
              <div class="aCsJod oJeWuf">
                <div class="aXBtI Wic03c">
                  <div class="Xb9hP">
                    <input
                      type="text"
                      class="whsOnd zHQkBf"
                      jsname="YPqjbf"
                      autocomplete="off"
                      tabindex="0"
                      aria-label="Hari"
                      maxlength="2"
                      aria-disabled="false"
                      min="1"
                      max="31"
                      role="combobox"
                      data-initial-value=""
                    />
                  </div>
                  <div class="i9lrp mIZh1c"></div>
                  <div jsname="XmnwAc" class="OabDMe cXrdqd"></div>
                </div>
              </div>
              <div class="LXRPh">
                <div jsname="ty6ygf" class="ovnfwe Is7Fhb"></div>
              </div>
            </div>
          </div>
        </div>

I want to get this element:

<input
                      type="text"
                      class="whsOnd zHQkBf"
                      jsname="YPqjbf"
                      autocomplete="off"
                      tabindex="0"
                      aria-label="Hari"
                      maxlength="2"
                      aria-disabled="false"
                      min="1"
                      max="31"
                      role="combobox"
                      data-initial-value=""
                    />

But I can't find that element.
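
Two things in the ElementByJS call are worth checking; neither is confirmed as the cause. First, rod's examples pass `rod.Eval` a JavaScript function definition (for example `() => ...`), while the snippet passes a bare expression. Second, `document.querySelector` searches the whole document instead of the current question. A plain CSS lookup scoped to the question avoids both. The sketch below assumes Go 1.18+ generics; the `node` interface, `fillByAriaLabel`, and `fakeNode` are hypothetical names. `node` mirrors the shape of rod's `Element.Has` and `Element.Input` so that a `*rod.Element` should satisfy it, and `fakeNode` lets the sketch run without a browser.

```go
package main

import (
	"errors"
	"fmt"
)

// node lists the two methods this sketch needs. *rod.Element should
// satisfy node[*rod.Element], assuming its Has and Input methods have
// these signatures.
type node[E any] interface {
	Has(selector string) (bool, E, error)
	Input(text string) error
}

// fillByAriaLabel looks for an <input> with the given aria-label inside
// question (not the whole document) and types value into it.
func fillByAriaLabel[E node[E]](question E, label, value string) error {
	// %q wraps the label in double quotes; fine for simple labels like "Hari".
	selector := fmt.Sprintf(`input[aria-label=%q]`, label)
	found, input, err := question.Has(selector)
	if err != nil {
		return fmt.Errorf("query %s: %w", selector, err)
	}
	if !found {
		return fmt.Errorf("no element matches %s", selector)
	}
	return input.Input(value)
}

// fakeNode is a stand-in for a DOM element, used only for this demo.
type fakeNode struct {
	children map[string]*fakeNode
	value    string
}

func (n *fakeNode) Has(selector string) (bool, *fakeNode, error) {
	child, ok := n.children[selector]
	return ok, child, nil
}

func (n *fakeNode) Input(text string) error {
	if n == nil {
		return errors.New("nil node")
	}
	n.value += text
	return nil
}

func main() {
	day := &fakeNode{}
	question := &fakeNode{children: map[string]*fakeNode{
		`input[aria-label="Hari"]`: day,
	}}
	if err := fillByAriaLabel(question, "Hari", "17"); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(day.value) // prints "17"
}
```

With rod, the call would presumably be `fillByAriaLabel(question, "Hari", "17")` inside the question loop, placed before the generic `input[type='text']` branch. If ElementByJS is still preferred, passing the function form `() => this.querySelector('input[aria-label="Hari"]')` to `rod.Eval` is worth trying, on the assumption that `this` is bound to the question element. Note also that the loop skips questions without `.vnumgf`, so a date question that is not mandatory never reaches this lookup.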

My google form link: https://docs.google.com/forms/d/e/1FAIpQLSdsd0M1s2XddKguecmugBFawZOEaRBxZCGsgTrkY9NFtJ2wuA/viewform

github-actions[bot] commented 1 week ago

Please add a valid Rod Version: v0.0.0 to your issue. Current version is v0.116.1

Please fix the format of your markdown:

19 MD032/blanks-around-lists Lists should be surrounded by blank lines [Context: "1) I will check which question..."]
24 MD036/no-emphasis-as-heading/no-emphasis-as-header Emphasis used instead of a heading [Context: "Before getting to my problem, ..."]
27 MD032/blanks-around-lists Lists should be surrounded by blank lines [Context: "- I can't retrieve element of ..."]
31 MD031/blanks-around-fences Fenced code blocks should be surrounded by blank lines [Context: "```"]
31 MD040/fenced-code-language Fenced code blocks should have a language specified [Context: "```"]
174 MD031/blanks-around-fences Fenced code blocks should be surrounded by blank lines [Context: "```"]
174 MD040/fenced-code-language Fenced code blocks should have a language specified [Context: "```"]
216 MD031/blanks-around-fences Fenced code blocks should be surrounded by blank lines [Context: "```"]
216 MD040/fenced-code-language Fenced code blocks should have a language specified [Context: "```"]
234 MD032/blanks-around-lists Lists should be surrounded by blank lines [Context: "- I was debug the element on w..."]
235 MD031/blanks-around-fences Fenced code blocks should be surrounded by blank lines [Context: "```"]
235 MD040/fenced-code-language Fenced code blocks should have a language specified [Context: "```"]
240:21 MD009/no-trailing-spaces Trailing spaces [Expected: 0 or 2; Actual: 1]

generated by check-issue