Closed ekrekeler closed 2 years ago
Wow, I never thought that someone would actually use selescrape.
I tested and it does indeed return a streamtape link.
It's just that streamtape doesn't work as an extractor.
So the streamtape extractor sometimes fails because the server doesn't like the ancient user agent presented by the get helper function. I overrode the user agent in streamtape.py with a more recent one, and haven't seen the 503 error since.
Do you think it makes sense to define a user agent in streamtape.py, or should this be a global setting? Why is the user agent so old anyway? Using Python 3.9.5 on WSL2, this is the user agent shown in debug:
Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2225.0 Safari/537.36
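To put a number on how old that user agent is, here is a minimal sketch that pulls the Chrome major version out of a user-agent string. The helper names `chrome_major` and `is_stale`, and the `min_major=90` threshold, are illustrative assumptions and not part of anime-downloader.

```python
import re

# Matches the "Chrome/<major>." token that Chrome-style user agents carry.
_CHROME_RE = re.compile(r"Chrome/(\d+)\.")


def chrome_major(user_agent):
    """Return the Chrome major version found in a user-agent string, or None."""
    match = _CHROME_RE.search(user_agent)
    return int(match.group(1)) if match else None


def is_stale(user_agent, min_major=90):
    # min_major is an arbitrary illustrative threshold, not a project setting.
    major = chrome_major(user_agent)
    return major is not None and major < min_major


ua = ("Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/41.0.2225.0 Safari/537.36")
print(chrome_major(ua))  # 41
print(is_stale(ua))      # True
```

A check like this could be run over the whole user-agent list to spot entries worth replacing, though which versions a given server actually rejects would still need testing.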
Edit: I debugged this further and there is something wonky going on when the helper makes the request. For the default_headers variable, I can see two defined headers for user-agent.
pp.pprint(default_headers)
{   'User-Agent': "{'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X "
                  '10_7_3) AppleWebKit/535.11 (KHTML, like Gecko) '
                  "Chrome/17.0.963.66 Safari/535.11'}",
    'user-agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/536.3 (KHTML, like '
                  'Gecko) Chrome/19.0.1061.1 Safari/536.3'}
Only one header should be defined for user-agent, but here we have two. And this is using the default configuration.
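Two entries can coexist here because a plain Python dict treats 'User-Agent' and 'user-agent' as different keys, even though HTTP header names are case-insensitive. A minimal sketch of a case-insensitive merge follows; `merge_headers` is a hypothetical helper written for illustration and is not the project's code.

```python
def merge_headers(*header_dicts):
    """Merge header dicts case-insensitively; later dicts win on conflicts."""
    merged = {}  # lower-cased name -> (name as first seen, latest value)
    for headers in header_dicts:
        for name, value in headers.items():
            key = name.lower()
            # Keep the spelling of the first occurrence, but take the newest value.
            original_name = merged[key][0] if key in merged else name
            merged[key] = (original_name, value)
    return {name: value for name, value in merged.values()}


defaults = {"User-Agent": "old-agent/1.0", "Accept": "*/*"}
overrides = {"user-agent": "new-agent/2.0"}
headers = merge_headers(defaults, overrides)
print(headers)  # {'User-Agent': 'new-agent/2.0', 'Accept': '*/*'}
```

With a merge like this, an override passed in lower case replaces the default instead of being sent alongside it.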
woah Edit: Yeah, it was because of selescrape (somehow I made this simple mistake) https://github.com/anime-dl/anime-downloader/blob/5d63f2b513bf39229a527ca6749e5854aced610d/anime_downloader/sites/helpers/selescrape.py#L109
In my defense, this is an issue I had already fixed in an older PR, but it was never merged: https://github.com/anime-dl/anime-downloader/pull/503
Wait no, I'm dumb, you weren't talking about that.
Yeah, it was because of selescrape
I'm not sure it is, because there is no sel=True in extractors/streamtape.py. So it shouldn't be using selescrape for this request.
Wait no, I'm dumb, you weren't talking about that.
Yeah, I just want to know if I should be defining a different header in streamtape.py. I still got HTTP 503 in some instances even after I removed the duplicate user-agent from default_headers, so this is a separate issue. There are certain user agents in that random list that the streamtape server doesn't like. So I can either:
I'm asking because I don't know whether that list of user agents has a reason for being so out of date compared to current user agents, or whether changing that list will break (or fix) other sites.
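One way to work around a server that rejects only some user agents is to retry with a different one when a 503 comes back. The sketch below assumes a caller-supplied `fetch(url, headers)` callable that returns `(status, body)`; that callable, `fetch_with_fallback`, and the fake transport are all hypothetical stand-ins, not the project's get helper.

```python
def fetch_with_fallback(fetch, url, user_agents, retry_status=503):
    """Try each user agent in order until the status is not retry_status.

    `fetch` is a caller-supplied callable (url, headers) -> (status, body),
    standing in for whatever HTTP helper is actually used.
    """
    last = None
    for agent in user_agents:
        last = fetch(url, {"User-Agent": agent})
        if last[0] != retry_status:
            return agent, last
    return None, last  # every agent was rejected (or the list was empty)


# Fake transport for demonstration: rejects anything that looks like Chrome 41.
def fake_fetch(url, headers):
    if "Chrome/41" in headers["User-Agent"]:
        return 503, ""
    return 200, "ok"


agents = ["Mozilla/5.0 Chrome/41.0.2225.0", "Mozilla/5.0 Chrome/91.0.4472.124"]
agent, (status, body) = fetch_with_fallback(
    fake_fetch, "https://example.invalid/v", agents
)
print(agent, status)
```

Retrying hides the symptom rather than fixing the list, so it is probably best seen as a fallback next to updating the user agents themselves.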
You can go ahead and update that list if you want to. As far as I know that list has not been updated for a long time.
Okay, I've updated all the user agents I could find in the code. I tested using anime test and saw no differences before and after modifying the user agents.
I also tweaked the decodeString method in nineanime.py to remove some extra characters from the URL that shouldn't be there. I have no idea how the encoding for the 9anime API works so I'm just using character matching after the string is decoded instead.
Couple of things to note for next steps:
There is someone else working on an alternate pull request for 9anime, #682.
It seems that selenium may not be needed to get 9anime working if Cloudflare is bypassed. I did not realize this. I will keep this open for the time being, since there are still some things that need to be worked out for that pull request. But avoiding the requirement of selenium is preferable.
So uhh... According to @justfoolingaround 9anime is changing its protection on a daily basis. So if this PR is ready I'll merge.
Oh, it doesn't work. They have a "verified" url parameter now...
Closes #599
For now, same as before, only tested using Streamtape server.
Please suggest changes as needed.