Closed 00-kat closed 6 months ago
@cd-CreepArghhh Can you share the raw html used for that page? I'll likely be able to add it to #2068
It won't bypass the captcha until circumvention is added, but it would avoid false-positive hits caused by the captcha when it's presented.
Huh, interestingly there's no captcha now (so it's not a JS issue) but there's a 404 page and a profile. Maybe I'll run Sherlock a couple times then try again.
If you do end up hitting it again drop a ping
Testing Yandex is a PITA on my end, having to use VPNs and such, and even when I do, it apparently trusts me implicitly and refuses to rate-limit or captcha me
(if the captcha page returns a status code other than 200, we can also use that as a simpler resolution)
Okay, found out that spamming them with requests gets you a captcha fast. Running Sherlock 4 times resulted in one captcha, and my browser got 2 in 6 requests.
You're going to have to run the HTML through some prettifier though (I don't know any) since it's all on one line.
Note: Github won't let me upload .html files, so rename the .txt to a .html, thanks.
Oops, Captcha!.txt Oops, Captcha!_files.zip
I'll spam a few requests with python now to check the status code.
Edit: the captcha page (some long URL with a hash or Base64 string in it) returns 200, I'll see what I get when redirected from the profile page (probably 200, so don't wait for me to finish).
Finished. Out of 100 requests, the first request was a 404 (i.e. no captcha), then the rest were all 200s (thus captcha). I didn't see any 302s either, though requests follows redirects by default, so any would only have shown up in the response history. Either way, status code isn't going to be of any use.
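For anyone who wants to repeat the status-code check, here is a minimal sketch of that kind of test. It assumes the third-party requests package is installed; the helper names fetch_status and tally_status_codes are made up for illustration and are not part of Sherlock. Redirects are deliberately not followed so that a 302 would show up in the tally instead of being resolved silently.

```python
from collections import Counter
from typing import Callable


def fetch_status(url: str) -> int:
    """Return the raw HTTP status code of a single GET request.

    Assumes the third-party `requests` package is available. Redirects are
    not followed, so a 302 is reported as 302 rather than as the status of
    the page it points to.
    """
    import requests  # imported lazily so the tally helper works without it

    response = requests.get(url, allow_redirects=False, timeout=10)
    return response.status_code


def tally_status_codes(
    url: str,
    attempts: int,
    fetch: Callable[[str], int] = fetch_status,
) -> Counter:
    """Request `url` `attempts` times and count how often each status occurs."""
    counts: Counter = Counter()
    for _ in range(attempts):
        counts[fetch(url)] += 1
    return counts
```

Calling tally_status_codes("https://music.yandex/users/ecfhlmiuewfimcuhem/playlists", 100) would run the same kind of 100-request test; the fetch parameter is only there so the counting logic can be exercised without hitting the network.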
Gonna push a hopeful fix. If you want to be added as a co-author you can drop your github no-reply email/other github email here and a name. Or link to somewhere that has it.
Otherwise I'll push as a single committer.
Just push as single committer
Done. Seems to have not broken anything on my end. Can you pull and validate all 3 cases as well (captcha, valid, not valid)?
Just realized I forgot a case: 'not valid in country'. Will add that now. Shouldn't make a difference for the captcha tests.
Edit: that's actually accounted for by the 404 msg I added, so we're good
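A rough sketch of the idea, for anyone following along: because the captcha page comes back with a 200, the check has to look at the page body rather than the status code. The marker strings and the classify_page helper below are made-up placeholders for illustration only; they are not the actual strings in the patched data.json, nor Sherlock's internal code.

```python
from typing import Mapping

# Hypothetical placeholder markers. The real strings belong to the patch
# and are not reproduced here.
PLACEHOLDER_MARKERS: Mapping[str, str] = {
    "PLACEHOLDER 404 TEXT": "not valid or not valid in country",
    "PLACEHOLDER CAPTCHA TEXT": "captcha",
}


def classify_page(body: str, markers: Mapping[str, str] = PLACEHOLDER_MARKERS) -> str:
    """Label a response body by the first known marker it contains.

    A body that contains none of the markers is treated as a valid profile.
    """
    for marker, label in markers.items():
        if marker in body:
            return label
    return "valid"


def is_claimed(body: str) -> bool:
    """Report a hit only when no error or captcha marker is present."""
    return classify_page(body) == "valid"
```

With this shape, a captcha page no longer counts as a hit, and the geoblock case falls out of the same 404 marker as the plain "not valid" case.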
I don't think it worked, since there's still a false-positive. By the way, I'm pretty sure I'm still in the blacklist or whatever Yandex Music has going on, so it will be a while before I can test the other two cases.
$ git clone https://github.com/ppfeister/sherlock.git # hope I cloned the right repo...
$ cd sherlock
$ python sherlock ecfhlmiuewfimcuhem --site YandexMusic
[*] Checking username ecfhlmiuewfimcuhem on:
[+] YandexMusic: https://music.yandex/users/ecfhlmiuewfimcuhem/playlists
[*] Search completed with 1 results
hm......... lemme re eval and get back
@cd-CreepArghhh Just got back
Noticed that you didn't run with the --local flag. When you don't use this flag, it pulls from the repo by default instead of our local patched data.json. Can you test one more time but while using that flag? (this won't be necessary if the patch gets merged upstream)
When using that flag on my end, it seems to give the expected result for each of the four cases (not valid, valid, captcha, geoblock).
(that flag messes with me quite a bit.....)
Edit: you do not need to re-pull unless it's been deleted
Yay, it works! ecfhlmiuewfimcuhem doesn't show up, ya.playlist does, and I didn't get any false positives even after spamming the command 30+ times. I didn't realise that it grabbed a data.json from GitHub instead of the local one by default (probably so you don't need to git pull as often).
Also, I'm not sure what the geoblock case is so I can't really test that. (I assume I could try running it through a bunch of tor nodes until I hit it, but I don't have time for that right now).
I get geoblocked here in the USA, so it was an easy test for me to run, lol
I'll go ahead and link your Issue to that PR so it gets closed when and if it (hopefully) gets merged
Description
Here's a random username that can't possibly exist: ecfhlmiuewfimcuhem.
Here's the username from data.json: ya.playlist
When I visit either, I get a captcha (note: JS is disabled in my browser):
Unless Sherlock uses Selenium/Pyppeteer, which I highly doubt (it's not in requirements.txt), this captcha isn't really avoidable (I think). Maybe it even shows up with JS enabled, which I didn't check.
I'm not opening a PR removing YandexMusic because it could be an issue that only happens for me, or maybe it's possible to bypass this captcha.