Ge0rg3 / requests-ip-rotator

A Python library to utilize AWS API Gateway's large IP pool as a proxy to generate pseudo-infinite IPs for web scraping and brute forcing.
https://pypi.org/project/requests-ip-rotator/
GNU General Public License v3.0
1.33k stars, 138 forks

Scraping amazon not working #50

Closed by arpitgoyall 1 year ago

arpitgoyall commented 1 year ago
import os
from requests_ip_rotator import ApiGateway
from dotenv import load_dotenv
import requests

load_dotenv()
with ApiGateway(
    "https://www.amazon.in",
    regions=["eu-west-1", "eu-west-2"],
    access_key_id=os.getenv("aws_key_id"),
    access_key_secret=os.getenv("aws_key_secret"),
) as g:
    session = requests.Session()
    session.mount("https://www.amazon.in", g)

    response = session.get("https://www.amazon.in/dp/B09PNHN5ZZ")
    print(response.status_code)
    print(response.request.headers)
    print(response.request.url)

This code returns the following:

503
{'User-Agent': 'python-requests/2.28.1', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive', 'Host': '0h3hjglii9.execute-api.eu-west-1.amazonaws.com', 'X-My-X-Forwarded-For': '43.220.173.104'}
https://0h3hjglii9.execute-api.eu-west-1.amazonaws.com/ProxyStage/dp/B09PNHN5ZZ

I think it is replacing the request URL, which is causing the 503 error. What should I do?

MsLolita commented 1 year ago

Did you solve it?

Ge0rg3 commented 1 year ago

Hey @arpitgoyall, thanks for raising the issue. This looks like an internal block on Amazon's side, or an issue with how Amazon handles requests coming from its own (AWS) servers, rather than an issue with the library.
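
For anyone who lands here with the same question: the rewritten URL in the output above is consistent with how a gateway proxy works, so on its own it does not point to a bug. Below is a minimal sketch of that mapping, inferred only from the URLs printed in this thread. The helper name `to_gateway_url` is hypothetical and is not part of the requests-ip-rotator API. The endpoint host and the `ProxyStage` prefix are copied from the printed request URL.

```python
from urllib.parse import urlsplit


def to_gateway_url(original_url: str, endpoint: str, stage: str = "ProxyStage") -> str:
    """Hypothetical helper (not the library's code): map a target URL onto
    an API Gateway endpoint, keeping the original path and query string."""
    parts = urlsplit(original_url)
    path = parts.path.lstrip("/")  # drop the leading slash before joining
    url = f"https://{endpoint}/{stage}/{path}"
    if parts.query:
        url += "?" + parts.query
    return url


# Endpoint host taken from the output posted in this issue.
endpoint = "0h3hjglii9.execute-api.eu-west-1.amazonaws.com"
rewritten = to_gateway_url("https://www.amazon.in/dp/B09PNHN5ZZ", endpoint)
print(rewritten)
```

The printed URL matches the one in the original report. That suggests the request was routed through the gateway as intended, and that the 503 came from the target site's response to it.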