adoptium / api.adoptium.net

Adoptium API 🚀
https://api.adoptium.net
Apache License 2.0

Python requests banned? #580

Open andre4ik3 opened 1 year ago

andre4ik3 commented 1 year ago

Describe the bug

I am building a Python app, part of which interfaces with the Adoptium API. Using the Requests library, which sends the user-agent python-requests/2.31.0 by default, gives a 403 Forbidden error. Not sending a user-agent at all also causes a 403 Forbidden. Setting the user-agent to something else fixes the issue, but that shouldn't be a necessary workaround: any user-agent should be allowed.

To Reproduce

  1. pip install requests
  2. Write a file like this:

    import requests
    resp = requests.get("https://api.adoptium.net/v3/info/available_releases")
    print(resp.request.headers)
    print(resp.headers)
    print(resp.status_code)
    print(resp.text)

  3. Run it. Observe 403 error (e.g. first screenshot)
  4. Change second line of file to be like this:

    resp = requests.get("https://api.adoptium.net/v3/info/available_releases", headers={"User-Agent": "Dummy"})

  5. Run it again. Observe it works. Wow!
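
The comparison in the steps above can be sketched as a small probe that tries several User-Agent values against the same URL and records the status code for each. This is only an illustration: `probe_user_agents` and `requests_status` are hypothetical helper names, not part of Requests or the Adoptium API, and the 10-second timeout is an arbitrary choice.

```python
from typing import Callable, Dict, Iterable

# Endpoint taken from the reproduction steps above.
URL = "https://api.adoptium.net/v3/info/available_releases"


def probe_user_agents(
    url: str,
    user_agents: Iterable[str],
    get_status: Callable[[str, Dict[str, str]], int],
) -> Dict[str, int]:
    """Return a mapping of User-Agent string -> HTTP status code.

    `get_status` is injected so the probe can be exercised without
    network access (hypothetical helper, for illustration only).
    """
    results: Dict[str, int] = {}
    for user_agent in user_agents:
        results[user_agent] = get_status(url, {"User-Agent": user_agent})
    return results


def requests_status(url: str, headers: Dict[str, str]) -> int:
    """Fetch `url` with Requests and return only the status code."""
    import requests  # imported here so the probe itself has no hard dependency

    return requests.get(url, headers=headers, timeout=10).status_code
```

Calling `probe_user_agents(URL, ["python-requests/2.31.0", "Dummy"], requests_status)` would, if the behaviour reported in this issue still holds, be expected to show a 403 for the first entry and a 200 for the second.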

Expected behavior

It works with the default requests config.

Screenshots

Doesn't work with the regular header.

[Screenshot: CleanShot 2023-05-30 at 17 43 34@2x]

Works when you set some arbitrary one. Crazy stuff: one HTTP header is the entire difference between failure and success.

[Screenshot: CleanShot 2023-05-30 at 17 44 54@2x]

Additional context

Looks to be an Azure misconfiguration. Also, this happened before but it wasn't fixed properly.

gdams commented 1 year ago

@johnoliver any ideas here? I can confirm that I'm also seeing the same error when using the above test file.

netomi commented 12 months ago

I wonder if it's related to the hosting provider; see a similar Stack Overflow question about Cloudflare:

https://stackoverflow.com/questions/74446830/how-to-fix-403-forbidden-errors-with-python-requests-even-with-user-agent-head

This sounds like some protection mechanism against data scraping.

netomi commented 10 months ago

I tested the above snippet and can confirm that I also receive a 403 error.

However, simply adding a User-Agent header like this:

    import requests

    headers = {
        "User-Agent": "My User Agent 1.0",
    }

    resp = requests.get(
        "https://api.adoptium.net/v3/info/available_releases", headers=headers
    )
    print(resp.request.headers)
    print(resp.headers)
    print(resp.status_code)
    print(resp.text)

the request passes:

    {'User-Agent': 'My User Agent 1.0', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
    {'Date': 'Sat, 21 Oct 2023 20:02:31 GMT', 'Content-Type': 'application/json;charset=UTF-8', 'Content-Length': '344', 'Connection': 'keep-alive', 'Strict-Transport-Security': 'max-age=63072000; includeSubDomains; preload', 'X-Content-Type-Options': 'nosniff', 'X-Frame-Options': 'DENY', 'X-Pod-Hostname': 'frontend-service-89db8c5b7-tvvxp'}
    200
    {
        "available_lts_releases": [
            8,
            11,
            17,
            21
        ],
    ...
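
If the workaround has to stay in an application for a while, one option is to set the header once on a `requests.Session` rather than on every call. The sketch below assumes that approach; `build_user_agent` and `make_session` are hypothetical helper names, and the `name/version (+contact)` format is just a common convention, not something the Adoptium API is known to require.

```python
from typing import Optional


def build_user_agent(app_name: str, version: str, contact: Optional[str] = None) -> str:
    """Compose a descriptive User-Agent such as 'my-app/1.0 (+https://example.com)'.

    Hypothetical helper: the API reportedly accepts arbitrary values
    (e.g. "Dummy"), so this format is a courtesy, not a requirement.
    """
    user_agent = f"{app_name}/{version}"
    if contact:
        user_agent += f" (+{contact})"
    return user_agent


def make_session(user_agent: str):
    """Return a requests.Session that sends `user_agent` on every request."""
    import requests  # imported here so build_user_agent works without Requests

    session = requests.Session()
    # Session-level headers are merged into each request made via the session.
    session.headers.update({"User-Agent": user_agent})
    return session
```

With this in place, `make_session(build_user_agent("my-app", "1.0")).get("https://api.adoptium.net/v3/info/available_releases")` sends the custom header without repeating `headers=` at each call site.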