Ge0rg3 / requests-ip-rotator

A Python library to utilize AWS API Gateway's large IP pool as a proxy to generate pseudo-infinite IPs for web scraping and brute forcing.
https://pypi.org/project/requests-ip-rotator/
GNU General Public License v3.0
1.36k stars 140 forks

Can no longer scrape google #60

Closed sevmardi closed 1 year ago

sevmardi commented 1 year ago

Can anyone confirm this issue, please?

        import requests
        from bs4 import BeautifulSoup
        from requests_ip_rotator import ApiGateway

        # Placeholder credentials, as in the original report
        gateway = ApiGateway("https://www.google.com", access_key_id='123', access_key_secret='12')
        gateway.start()
        session = requests.Session()
        session.mount("https://www.google.com", gateway)

        response = session.get("http://www.google.com/search?q=barry+bonds&tbm=nws&hl=en&num=10")
        soup = BeautifulSoup(response.text, "html.parser")
        # select_one returns None when no element matches the selector
        result_stats = soup.select_one("div#result-stats")
        result_stats = result_stats.get_text().strip().lower()
        print(result_stats)

This basically returns None.

I did a bunch of other tests and the results are the same. Do you think Google blocked AWS IPs?

Ge0rg3 commented 1 year ago

Hey @sevmardi, thanks for your issue. I can see that the response.text returns the correct response with the google results. It seems like your HTML parsing is incorrect instead 😅
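
To tell a blocked request apart from a selector that simply matches nothing, a check along the following lines might help. This is only a rough sketch: `IdCollector`, `diagnose`, and the sample markup are hypothetical names and data made up for illustration, not part of requests-ip-rotator or of Google's actual output. It uses only the standard library so it runs without extra installs; with BeautifulSoup the equivalent check would be testing whether `select_one` returned None before calling `get_text()`.

```python
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Collect every id attribute seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the current tag
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def diagnose(status_code, html, element_id="result-stats"):
    """Return a short hint about why a lookup for element_id came back empty."""
    if status_code != 200:
        return f"request problem: HTTP {status_code}"
    collector = IdCollector()
    collector.feed(html)
    if element_id in collector.ids:
        return "element present: the lookup should succeed"
    return "page fetched but element missing: check the selector"


# Hypothetical sample markup, not real Google output.
sample = '<html><body><div id="search">results here</div></body></html>'
print(diagnose(200, sample))
```

In the snippet from the report, this would be called as `diagnose(response.status_code, response.text)`. If the page was fetched but the element is missing, the markup has probably changed and the selector needs updating, rather than the IPs being blocked.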