Open csm10495 opened 1 year ago
More examples of user agent testing:
In [13]: requests.get('https://thepiratebay.org/search/AEW%20All%20Access%20S01E02/0/3/200', headers={'User-Agent': 'Medusa/1.0.12'})
Out[13]: <Response [403]>
In [14]: requests.get('https://thepiratebay.org/search/AEW%20All%20Access%20S01E02/0/3/200', headers={'User-Agent': 'Medusa2/1.0.12'})
Out[14]: <Response [200]>
In [15]: requests.get('https://thepiratebay.org/search/AEW%20All%20Access%20S01E02/0/3/200', headers={'User-Agent': 'Medusa/1.0.12'})
Out[15]: <Response [403]>
Gee whiz, so even after that, it looks like the parsing for thepiratebay is off. The page appears to load its results using JavaScript, while the parser seems to assume static HTML that it can parse directly. Anyone else seeing this?
Using https://thepiratebay7.com as an alternative URL seems to be working.
It looks like under the hood there is an API that can be called instead of parsing the HTML, like:
https://apibay.org/q.php?q=The+price+is+right
Jackett seems to be using it already: https://github.com/Jackett/Jackett/pull/9593/files
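For anyone who wants to poke at the API route, here is a minimal sketch of what a search call could look like. The endpoint and the `q` parameter come from the URL above; everything else is an assumption: the response is assumed to be a JSON array of objects, and the field names `name`, `info_hash` and `seeders` are hypothetical until checked against a real response. The function names are made up for illustration and are not Medusa's.

```python
import json
from urllib.parse import urlencode

# Endpoint quoted earlier in this thread.
API_BASE = "https://apibay.org/q.php"


def build_search_url(query):
    """Build the search URL; urlencode turns spaces into '+'."""
    return f"{API_BASE}?{urlencode({'q': query})}"


def parse_results(payload):
    """Parse a response body into a list of simplified result dicts.

    ASSUMPTION: the body is a JSON array of objects, and the keys
    'name', 'info_hash' and 'seeders' are hypothetical field names.
    """
    results = []
    for item in json.loads(payload):
        results.append({
            "name": item.get("name"),
            "info_hash": item.get("info_hash"),
            # Numeric fields may arrive as strings, so coerce defensively.
            "seeders": int(item.get("seeders", 0)),
        })
    return results
```

The body could be fetched with `requests.get(build_search_url("The price is right")).text` and handed to `parse_results`. Keeping URL building and parsing separate from the network call makes both easy to test offline.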
I don't have any problem with the Medusa user agent on thepiratebay. Maybe your provider is blocking it?
Anyway, it would be a good idea if the Medusa developers changed from screen scraping to using this new API.
I don't think it's my provider, since the result changes with the user agent alone, and I think the User-Agent header isn't sniffable over HTTPS because request headers are sent encrypted.
Describe the bug: When searching via thepiratebay, I get back a 403. It's related to the User-Agent string.
To Reproduce: Try to search for something via thepiratebay.
Expected behavior: I'd be able to use thepiratebay as a provider.
Screenshots: N/A
Medusa (please complete the following information):
Debug logs (at least 50 lines):
Additional context: I used requests to check the theory:
Maybe a user-settable user agent would be helpful here?
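A rough sketch of what a user-settable user agent could look like, assuming the configured value arrives as an optional string. The names `DEFAULT_USER_AGENT` and `build_headers` are hypothetical and not part of Medusa; the default value is just the string used in the requests session in this thread.

```python
# Hypothetical default, taken from the user agent tested in this thread.
DEFAULT_USER_AGENT = "Medusa/1.0.12"


def build_headers(user_agent=None):
    """Return request headers, preferring a user-configured user agent.

    Falls back to the default when the setting is missing or blank.
    """
    ua = (user_agent or "").strip() or DEFAULT_USER_AGENT
    return {"User-Agent": ua}
```

The result can be passed straight to requests, for example `requests.get(url, headers=build_headers("Medusa2/1.0.12"))`, which is the same override used in the session that returned a 200.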