Closed dd-pardal closed 4 years ago
I've just written a web crawler using BeautifulSoup that can (right now) only check Twitter. That seems to fix the problem (only for Twitter, of course: 'realdonaldtrump' comes back positive and 'amffdsfjvidsvck' comes back negative). I think this is more of an (inefficient) quick and dirty fix though, especially since I'm only a beginner Python programmer. It could be extended to other websites, but I don't know if it's really worth it.
@dd-pardal I have dealt with these sites, which are some of the ones you mentioned above: https://trashbox.ru https://ask.fm https://www.house-mixes.com http://insanejournal.com https://www.ifttt.com https://flightradar24.com
I'm getting different false positives:
[+] GPSies: https://www.gpsies.com/mapUser.do?username=GArbadgershkj3484153sdf35s4f1s5a3df4a6s81f5d3sa486se4f53
[+] MeetMe: https://www.meetme.com/GArbadgershkj3484153sdf35s4f1s5a3df4a6s81f5d3sa486se4f53
[+] OpenCollective: https://opencollective.com/GArbadgershkj3484153sdf35s4f1s5a3df4a6s81f5d3sa486se4f53
[+] SportsTracker: https://www.sports-tracker.com/view_profile/GArbadgershkj3484153sdf35s4f1s5a3df4a6s81f5d3sa486se4f53
[+] YandexCollection: https://yandex.ru/collections/user/GArbadgershkj3484153sdf35s4f1s5a3df4a6s81f5d3sa486se4f53/
[+] boingboing.net: https://bbs.boingboing.net/u/GArbadgershkj3484153sdf35s4f1s5a3df4a6s81f5d3sa486se4f53
TamTam and RubyGem seem to return false positives for strings starting with a number. For RubyGem, using 58any8random7string4here returns the user with profile "58", so the site silently drops everything after the first non-numeric character. If the first character is NOT numeric, it searches by profile name. So basically it gets the profile by ID if the first character is numeric, and searches by profile name otherwise.
I think it is best to remove these sites, as I can't find any username rules for them: https://easyen.ru/index/8-0-fghfgn.tiojydf https://elwo.ru/index/8-0-fghfgn.tiojydf http://ingvarr.net.ru/index/8-0-fghfgn.tiojydf http://pedsovet.su/index/8-0-fghfgn.tiojydf https://radioskot.ru/index/8-0-fghfgn.tiojydf
Just to clarify, some of these "false positives" may occur because your IP is being flagged as suspicious. If this is happening, capturing the error/captcha page would be helpful to note a fail/error.
There is also a problem with https://forum.redsun.tf/.
I'm just gonna put this list here so that we can keep track of the sites that have been listed in this thread:
Let me know if I have made any mistakes
Some of these are likely occurring due to username format. If you added regex checks to disallow periods in usernames, the large majority of these would disappear.
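A minimal sketch of such a pre-check, assuming each site can carry a pattern describing its legal usernames. The site names and patterns below are hypothetical placeholders, not the real rules of any site. A username that fails the pattern can be reported as illegal without sending any request.

```python
import re

# Hypothetical per-site username rules; real sites will differ.
SITE_PATTERNS = {
    # Periods are not allowed.
    "ExampleNoDots": r"[A-Za-z0-9_-]+",
    # Must start with a letter (avoids the numeric-ID lookup described above).
    "ExampleNoLeadingDigit": r"[A-Za-z][A-Za-z0-9_.-]*",
}


def is_legal_username(site, username):
    """Return True if the username fully matches the site's pattern.

    Sites without a pattern accept any username.
    """
    pattern = SITE_PATTERNS.get(site)
    if pattern is None:
        return True
    return re.fullmatch(pattern, username) is not None
```

With these placeholder rules, a username such as fghfgn.tiojydf would be skipped for the first site instead of producing a hit.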
I'm having lots of false positives.
I don't know how you guys check if the user exists or not but when I manually check the found URLs a good portion says "user not found 404".
Here are some examples: https://www.investing.com/ https://opencollective.com/ https://www.tiktok.com/ https://www.wikipedia.org/
tiktok would be a very important one to get fixed....🤔
@rodrigograca31 TikTok was removed a while ago. Are you using an older version of Sherlock? https://github.com/sherlock-project/sherlock/blob/master/removed_sites.md#tiktok
Oh... True... I git cloned the repo 3 months ago... I should update... My bad.
I was about to ask why TikTok was removed, but I gave it a try and it does not seem easy to figure out whether a user exists or not.
EDIT: Actually, I'm not sure if this will be useful, but doing a wget on an existing user returns a page with JSON that includes a metaParams object/string in the code... (regex could detect that.)
@rodrigograca31 Regex would be nice. That means that I'd have to change the code a little in sherlock.py, because at the moment we check if the errorMsg is in r.text. Instead, we could do a re.findall(REGEX, r.text).
I'll try to add that to sherlock.py and see if everything works properly. But it might be a while before I get started because I'm pretty busy.
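A rough sketch of the difference between the two checks, not the actual sherlock.py code. The function names, the sample page, and the pattern are made up for illustration; only the idea of testing errorMsg against r.text versus using re.findall comes from the comment above.

```python
import re


def account_missing_substring(page_text, error_msg):
    """Substring approach: the error message must appear verbatim."""
    return error_msg in page_text


def account_missing_regex(page_text, error_regex):
    """Regex approach: any match of the pattern marks the account as missing."""
    return len(re.findall(error_regex, page_text)) > 0


# Hypothetical error page, for illustration only.
page = "<h1>User   not found</h1>"

# The substring test misses the page because of the extra spaces.
print(account_missing_substring(page, "User not found"))
# The regex tolerates variable whitespace.
print(account_missing_regex(page, r"User\s+not\s+found"))
```

The same regex path could also be pointed at markers that only appear on real profiles, such as the metaParams string mentioned earlier.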
Polarsteps seems to report false positives most of the time too.
P.S: It always redirects to /user-not-found when the user is a false positive. Maybe it can help in patching this specifically.
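One way this redirect could be used, as a sketch only: compare the path of the final URL (after redirects are followed) with the not-found path. The function name is hypothetical, and it assumes the caller already has the final URL, for example from response.url in the requests library.

```python
from urllib.parse import urlparse


def redirected_to_not_found(final_url, not_found_path="/user-not-found"):
    """Return True if the final URL's path is the site's not-found page."""
    # Ignore a trailing slash so both /user-not-found and /user-not-found/ match.
    path = urlparse(final_url).path.rstrip("/")
    return path == not_found_path
```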
@roopeshvs If I remember correctly, the checking of the redirect URL does not actually work. https://github.com/sherlock-project/sherlock/blob/master/sherlock/sherlock.py#L356
It's been a very long time since I've properly looked at the source code, so I'm not entirely sure what is going on. But I'm sure that if I take a look at it when I get some time, I'll get a better understanding of what's going on.
@roopeshvs
Polarsteps seem to report false positives most of the time too.
I did some research and found out that we can use this endpoint to check usernames:
https://www.polarsteps.com/validation/unique
With this data
field=users.username
value={{USERNAME}}
The only problem is that it is a POST request, and Sherlock currently does not do POST requests. So I'll have to implement that, along with a way for Sherlock to tell from data.json that a site needs a POST request.
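A sketch of how a site entry could describe such a request. The keys request_method and request_payload, the url_probe name, and the "{}" placeholder are illustrative assumptions, not fields confirmed to exist in data.json. It also assumes the endpoint accepts form-encoded data, which the thread does not state. The request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical site entry; the key names are assumptions for illustration.
SITE_ENTRY = {
    "url_probe": "https://www.polarsteps.com/validation/unique",
    "request_method": "POST",
    "request_payload": {"field": "users.username", "value": "{}"},
}


def build_request(entry, username):
    """Build (but do not send) the request described by a site entry."""
    method = entry.get("request_method", "GET")
    payload = entry.get("request_payload")
    body = None
    if payload is not None:
        # Substitute the username into every "{}" placeholder.
        filled = {key: value.replace("{}", username) for key, value in payload.items()}
        body = urlencode(filled).encode("utf-8")
    return Request(entry["url_probe"], data=body, method=method)


req = build_request(SITE_ENTRY, "alice")
print(req.get_method(), req.data)
```

Entries without a request_method key would fall back to GET, so existing sites would not need to change.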
@sdushantha Found a GET API from Polarsteps that would suit us better.
https://api.polarsteps.com/users/byusername/USERNAME
Also, the previously claimed username is a mistake, it didn't exist. :(
Metacritic can be fixed by using their API endpoints, but again, I will need to add the ability to do POST requests:
import requests

data = {
    'check_username': '1',
    'userName': 'username'
}
response = requests.post('https://www.metacritic.com/signup', data=data)

Output:

{
    "viewer": {},
    "mixpanelToken": "6e219fd5dbf2cb77082a6cebb50b01a5",
    "mixpanelDistinctId": "123.12.314.14",
    "omnitureDebug": 0,
    "errors": {
        "username": "The username you have entered is not available."
    }
}
@sdushantha Metacritic is working fine. The only case that is wrong is when we have a dot(.) in the username which is an illegal character for username in the site! Just change Regex and we are good, I guess.
also: freelance.habr
and: tracr.co
I get these false positives: https://500px.com/ https://cash.me/ https://www.clozemaster.com/ https://www.colourlovers.com/ https://www.wikipedia.org/
@GrbavaCigla What version of Sherlock are you using? Some of the sites you mentioned had been removed in the past due to false positives. Also, Sherlock now automatically fetches the site list from GitHub instead of using the local one.
Please try using the latest version of Sherlock and let me know if you still get the false positives you mentioned above.
I can confirm false positives with the latest version for:
https://www.clozemaster.com/ (uses the status_code method, but nonexistent accounts get a 302 redirect to /dashboard)
https://4pda.ru (displays an error for a nonexistent account, but Sherlock gives me a false positive)
@enodr I have now fixed the false positive for Clozemaster in 87483b5.
Regarding 4pda, I'm not getting any false positives:
I figured what the issue is with 4pda: I am landing on an anti-robot page because my IP is flagged for whatever reason on their site. The current rule for 4pda is to match an error message if the account is not found. It would be more reliable to invert the logic and check for a regex only if the account is found.
@enodr
check for a regex only if the account is found.
That would be a great idea, but it is something we would need to add to Sherlock. I currently don't have much time to work on it. But when I do, I'll work on it.
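A small sketch of the inverted logic suggested above: report an account as found only when a pattern that appears on real profiles matches. The function name, the pattern, and the sample pages are hypothetical; Sherlock does not necessarily expose anything like this.

```python
import re


def classify(page_text, claimed_regex):
    """Report 'claimed' only when positive evidence of a profile is present."""
    if re.search(claimed_regex, page_text):
        return "claimed"
    # Error pages, captchas and anti-robot pages all end up here.
    return "not found or unknown"


# Hypothetical pattern and pages, for illustration only.
CLAIMED = r'class="profile-name"'
profile_page = '<div class="profile-name">greenxxx</div>'
robot_page = "<h1>Please prove you are not a robot</h1>"

print(classify(profile_page, CLAIMED))
print(classify(robot_page, CLAIMED))
```

The trade-off is that a flagged IP now produces a miss rather than a false positive, so a real account could go unreported.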
I found a more reliable way for 4pda: https://4pda.ru/forum/index.php?act=auth&action=chkname&login=greenxxx This URL returns a JSON array with 3 elements. If the first element is 0, the username exists; if it is 1, it does not. Can Sherlock handle just this test condition (check that the response is JSON and that item[0] == 0)?
@enodr We can do a simple check for an error message, where the error message is [1,false,0]. I removed 4pda yesterday, but since we found a solution, we can add it back. I'll do it later today.
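For comparison, a sketch of the stricter JSON check that was asked about, as an alternative to matching the literal error string. The function name is made up. Only the meaning of the first element (0 exists, 1 does not) comes from the thread, so the other elements are ignored.

```python
import json


def username_exists(response_text):
    """Return True/False from the first array element, or None if unclear."""
    try:
        items = json.loads(response_text)
    except json.JSONDecodeError:
        # Not JSON at all, e.g. an anti-robot HTML page.
        return None
    if not isinstance(items, list) or not items:
        return None
    first = items[0]
    # JSON true/false become Python bools, which would compare equal to 1/0.
    if isinstance(first, bool) or not isinstance(first, int):
        return None
    if first == 0:
        return True
    if first == 1:
        return False
    return None
```

Returning None for anything unexpected keeps a captcha or error page from being counted as either a hit or a miss.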
@enodr I have fixed 4pda now in ddecc14
@nohupt Not sure why you are getting false positives. Sherlock seems to give me the correct response:
$ python3 sherlock -l --site "tracr.co" testing123boyeeeee
[*] Checking username testing123boyeeeee on:
[-] tracr.co: Not Found!
$ python3 sherlock -l --site "freelance.habr" testing123boyeeeee
[*] Checking username testing123boyeeeee on:
[-] Freelance.habr: Not Found!
I will be closing this issue because it looks like all the sites that have been mentioned in this issue have been dealt with.
False positives for any username
Here's an example output for a "random" username:
GPSies moved its website. CapFriendly is especially weird. It seems to generate random details for nonexistent users. The other ones say that the user doesn't exist.
False positives for usernames with .
False positives for usernames with _
Hostnames can't contain underscores, by the way. Aptoide redirects to the homepage.
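For sites that place the username in the hostname, a pre-check along these lines could rule such usernames out before any request is made. This is a sketch under the usual hostname label rules (letters, digits and hyphens, 1 to 63 characters, no leading or trailing hyphen); the function name is hypothetical.

```python
import re

# One hostname label: letters, digits and hyphens, 1 to 63 characters,
# not starting or ending with a hyphen.
LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def usable_as_subdomain(username):
    """Return True if the username is a valid single hostname label."""
    return LABEL_RE.fullmatch(username) is not None
```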
False positives for usernames with -
Others
Yandex sometimes redirects to a captcha, causing a false positive.