Why does this site not work with the library?

la55u commented 6 years ago

I'm using a free API that has a daily download limit. An example endpoint is https://www.srrdb.com/download/file/Far.Cry.DVDRip-MYTH/myth.nfo. It should download a simple text file, but the daily limit is around 600-700 requests. I confirmed the limit by running this code until it was reached:

import requests
from time import sleep

url = 'https://www.srrdb.com/download/file/Far.Cry.DVDRip-MYTH/myth.nfo'

i = 1
while True:
    r = requests.get(url)
    print('#{} {}'.format(i, r.status_code))
    if r.status_code != 200:
        print(r.text)
        break
    i += 1
    sleep(1)

After that I tried this library, but although the logs say it is using a different proxy for each request, it always returns a 429 status code.

import time
from http_request_randomizer.requests.proxy.requestProxy import RequestProxy

if __name__ == '__main__':
    req_proxy = RequestProxy()

    test_url = 'https://www.srrdb.com/download/file/Far.Cry.DVDRip-MYTH/myth.nfo'
    status = 0
    while status != 200:
        request = req_proxy.generate_proxied_request(test_url)
        status = request.status_code
        if status == 200:
            print('OK')
        time.sleep(2)

What am I missing here? Thank you

pgaref commented 6 years ago

The main reason is that you are trying to access an HTTPS site, while the available proxies only support HTTP. Please check the example below, which uses plain requests, to understand the difference:

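import time

import requests

# Note the plain-HTTP URL: the available proxies do not handle HTTPS.
test_url = 'http://www.srrdb.com/download/file/Far.Cry.DVDRip-MYTH/myth.nfo'
status = 0

# Route both schemes through the same HTTP proxy.
proxies = {
    "http": "195.239.63.150:8080",
    "https": "195.239.63.150:8080",
}

# Retry every 2 seconds until the proxied request returns 200.
while status != 200:
    request = requests.get(test_url, proxies=proxies)
    status = request.status_code
    if status == 200:
        print('OK')
        # Save the downloaded file to disk.
        with open('tmp.dat', 'wb') as f:
            f.write(request.content)
    else:
        print("not ok")
    time.sleep(2)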
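
The key change in the example above is the URL scheme: http:// instead of https://. If you want to keep your existing URLs and only switch the scheme before handing them to the proxy, a minimal sketch could look like the one below. The helper name `force_http` is hypothetical and not part of this library or of requests. It also assumes the site serves the same file over plain HTTP, and the traffic will be unencrypted.

```python
from urllib.parse import urlsplit, urlunsplit


def force_http(url):
    """Return the URL with an https scheme rewritten to http.

    Hypothetical helper for illustration only; URLs with any other
    scheme are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme.lower() == "https":
        # SplitResult is a namedtuple, so _replace returns a modified copy.
        parts = parts._replace(scheme="http")
    return urlunsplit(parts)


https_url = "https://www.srrdb.com/download/file/Far.Cry.DVDRip-MYTH/myth.nfo"
print(force_http(https_url))
# http://www.srrdb.com/download/file/Far.Cry.DVDRip-MYTH/myth.nfo
```

The result can then be passed to requests.get together with the proxies mapping, as in the loop above.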

pgaref commented 6 years ago

Hopefully sorted by #47

pgaref / HTTP_Request_Randomizer

Why does this site not work with the library? #45