sefinek24 / Sefinek-Blocklist-Collection

🍒 A comprehensive repository of block lists for Pi-hole and AdGuard, featuring over 100 links and more than 5 million domains on the lists. Feel free to star this repository if you find it useful! o(>ω<)o
https://blocklist.sefinek.net

Server responds with Forbidden 403 Error for Python users #30

hl2guide closed this issue 1 month ago

hl2guide commented 1 month ago

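For example, the following fails with a 403 Forbidden error:

from urllib.request import urlretrieve

# CURRENTWORKINGDIRECTORY and LIST_INDEX are defined elsewhere in the script.
FILENAME = CURRENTWORKINGDIRECTORY + "downloaded_lists\\blocklist" + str(LIST_INDEX) + ".txt"
URL = "https://blocklist.sefinek.net/generated/v1/adguard/abuse/blocklistproject/hosts.fork.txt"
urlretrieve(URL, FILENAME)

sefinek24 commented 1 month ago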

I don't see any logs indicating blocked requests. Could you please provide the Ray ID?

sefinek24 commented 1 month ago

I just found it:

Ray ID: 88b3b16c0882aadd
User agent: Python-urllib/3.12
Browser integrity check

Try changing the user agent.
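
One way to apply this advice while staying with urllib, sketched under a few assumptions: the client name and homepage in USER_AGENT are placeholders, and build_request and download are hypothetical helper names, not part of the original script. Whether a particular user agent passes the integrity check is up to the server, so treat this as a starting point rather than a guaranteed fix.

```python
import shutil
import urllib.request

# Hypothetical client name and homepage; substitute your own project's details.
USER_AGENT = "my-blocklist-fetcher/1.0 (+https://example.com)"


def build_request(url: str, user_agent: str = USER_AGENT) -> urllib.request.Request:
    """Return a Request that sends user_agent instead of urllib's default."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def download(url: str, filename: str, user_agent: str = USER_AGENT) -> None:
    """Stream the response body to filename; urlopen raises HTTPError on a 4xx/5xx reply."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=30) as response, open(filename, "wb") as out:
        shutil.copyfileobj(response, out)
```

Calling download(URL, FILENAME) would then take the place of urlretrieve(URL, FILENAME) in the original snippet.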

hl2guide commented 1 month ago

Thanks, got it working with:

import requests

url = "https://blocklist.sefinek.net/generated/v1/adguard/abuse/blocklistproject/hosts.fork.txt"
# Send a custom User-Agent instead of the library default.
headers = {
    'User-Agent': 'Mozilla 5.0',
}
response = requests.get(url, headers=headers, timeout=30)

# Write the list to disk only if the request succeeded.
if response.status_code == 200:
    with open("output.txt", "w", encoding="utf-8") as file:
        file.write(response.text)

sefinek24 commented 1 month ago

Great, no problem <:

Anyway, I recommend using user agents like these:

NAME/VERSION (+HOMEPAGE)
Mozilla/5.0 (compatible; NAME/VERSION; +HOMEPAGE)
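
As a rough illustration of the two recommended formats, here is a small helper that fills in the placeholders. The function name format_user_agent and the example name, version, and homepage are made up for this sketch; use your own project's values.

```python
def format_user_agent(name: str, version: str, homepage: str, compatible: bool = False) -> str:
    """Build a User-Agent string in one of the two recommended formats."""
    product = f"{name}/{version}"
    if compatible:
        # Mozilla/5.0 (compatible; NAME/VERSION; +HOMEPAGE)
        return f"Mozilla/5.0 (compatible; {product}; +{homepage})"
    # NAME/VERSION (+HOMEPAGE)
    return f"{product} (+{homepage})"


# Hypothetical project details, for illustration only.
simple_ua = format_user_agent("my-blocklist-fetcher", "1.0", "https://example.com")
compatible_ua = format_user_agent("my-blocklist-fetcher", "1.0", "https://example.com", compatible=True)
```

Either string can be passed as the 'User-Agent' value in the headers dictionary used with requests.get.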