omkarcloud / botasaurus

The All in One Framework to build Awesome Scrapers.
https://www.omkar.cloud/botasaurus/
MIT License
1.47k stars 136 forks

examples: captcha on every request to g2 #189

Open tobwen opened 2 months ago

tobwen commented 2 months ago

summary

The examples seem to be outdated. Every example that targets g2.com leads to a captcha:

image

code used

from botasaurus.browser import browser, Driver

@browser
def scrape_heading_task(driver: Driver, data):
    driver.google_get("https://www.g2.com/products/github/reviews.html?page=5&product_id=github", bypass_cloudflare=True)
    driver.prompt()
    heading = driver.get_text('.product-head__title [itemprop="name"]')
    return heading

scrape_heading_task()

versions

At the time of writing, I'm on the latest revision of the library, with Chrome on Debian Linux (Python 3.11).

additional information

IP is not blocked. "Normal" browsers (Firefox, LibreWolf, Chrome started directly) work.
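
For anyone reproducing this, here is a minimal sketch of a check that flags a captcha page programmatically instead of relying on eyeballing the browser window. It is not part of botasaurus: `looks_like_challenge` and the `CHALLENGE_MARKERS` list are hypothetical names, and the marker strings are assumptions that should be adjusted to whatever the blocked page actually contains. It only inspects an HTML string, however that string was obtained.

```python
# Hypothetical marker strings (assumptions, not taken from g2.com or botasaurus).
# Adjust them to match the text of the challenge page you actually receive.
CHALLENGE_MARKERS = ("captcha", "verify you are human", "just a moment")


def looks_like_challenge(page_html: str) -> bool:
    """Return True if the HTML contains any challenge marker (case-insensitive)."""
    lowered = page_html.lower()
    return any(marker in lowered for marker in CHALLENGE_MARKERS)


if __name__ == "__main__":
    # Two hand-written sample pages, used only to exercise the helper.
    blocked = "<html><title>Just a moment...</title></html>"
    normal = '<div class="product-head__title"><span itemprop="name">GitHub</span></div>'
    print(looks_like_challenge(blocked))
    print(looks_like_challenge(normal))
```

A plain substring match like this can give false positives on pages that merely mention one of the markers, so it is only meant as a quick signal when comparing runs.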

aandrewmolt commented 2 months ago

Same issue for me