Closed l3str4nge closed 4 years ago
Hi Mateusz,
That's exactly what I had in mind, searching for usernames that match the keywords from sources like:
1) LinkedIn (social_media/linkedin.py)
2) Instagram (social_media/instagram.py)
3) Facebook (social_media/facebook.py)
There are Python libraries for interacting with these platforms. I have started working on Instagram; do you want to pick up either of the other two?
Sure!
Shout if you need anything
Hi Mateusz,
Social networks are well defended against scraping, so extensive searches are likely to get blocked or blacklisted, and they would not be very effective anyway.
For this, I'm thinking about applying a "Levenshtein automaton": given a keyword "w" and a distance "n", it generates every variant of the word within that edit distance and saves them to a list.
openSquat will then check, for each keyword in the list, whether an associated account exists, such as:
https://facebook.com/keyword
https://linkedin.com/in/keyword
https://instagram.com/keyword
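To make the idea concrete, here is a minimal sketch of the candidate generation step. It is a brute-force enumeration of insertions, deletions and substitutions, not a true Levenshtein automaton, and it makes no network requests. The function names, the `ALPHABET` of lowercase letters and digits, and the `PROFILE_URLS` table are assumptions for illustration only; the URL patterns are the three listed above.

```python
import string

# Assumed character set for usernames; each platform has its own rules.
ALPHABET = string.ascii_lowercase + string.digits

# Profile URL patterns taken from the examples in this thread.
PROFILE_URLS = {
    "facebook": "https://facebook.com/{}",
    "linkedin": "https://linkedin.com/in/{}",
    "instagram": "https://instagram.com/{}",
}


def one_edit(word):
    """Return all strings one insertion, deletion or substitution away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    substitutes = {a + c + b[1:] for a, b in splits if b for c in ALPHABET}
    inserts = {a + c + b for a, b in splits for c in ALPHABET}
    result = deletes | substitutes | inserts
    result.discard(word)  # substituting a character with itself gives the word back
    result.discard("")    # an empty username is never a valid candidate
    return result


def variants_within(word, distance):
    """Return all strings within `distance` edits of word, excluding word itself."""
    seen = {word}
    frontier = {word}
    for _ in range(distance):
        # Expand only the newest layer and drop anything already generated.
        frontier = {v for w in frontier for v in one_edit(w)} - seen
        seen |= frontier
    seen.discard(word)
    return seen


def candidate_urls(keyword, distance=1):
    """Map each platform to the sorted profile URLs of every keyword variant."""
    variants = sorted(variants_within(keyword, distance))
    return {
        platform: [pattern.format(v) for v in variants]
        for platform, pattern in PROFILE_URLS.items()
    }
```

The list grows quickly with keyword length and distance, which matters given how aggressively these sites rate-limit; anything beyond a distance of 1 probably needs some filtering before any lookups are made.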
Yeah, that sounds like a good idea to me. We need to build the automaton mechanism and then an abstraction for social media, so that new platforms (Twitter, for example) can be added easily in the future.
I'm already looking into a Facebook library for Python.
Since you've already started on Instagram, I could implement the automaton and prepare the social media abstraction.
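One possible shape for that abstraction, as a rough sketch only: a small base class with one subclass per platform, so adding a new platform means adding one class. The names `SocialPlatform`, `profile_url`, `candidate_urls` and `PLATFORMS` are hypothetical and not part of openSquat. The Twitter URL pattern is an assumption added to show the extension point. How to check whether an account actually exists is deliberately left out, since that depends on each platform's library and terms.

```python
from abc import ABC, abstractmethod


class SocialPlatform(ABC):
    """Hypothetical base class; each platform supplies its own URL scheme."""

    name = ""

    @abstractmethod
    def profile_url(self, username):
        """Return the public profile URL for the given username."""

    def candidate_urls(self, usernames):
        """Build profile URLs for a list of candidate usernames."""
        return [self.profile_url(u) for u in usernames]


class Facebook(SocialPlatform):
    name = "facebook"

    def profile_url(self, username):
        return f"https://facebook.com/{username}"


class LinkedIn(SocialPlatform):
    name = "linkedin"

    def profile_url(self, username):
        return f"https://linkedin.com/in/{username}"


class Instagram(SocialPlatform):
    name = "instagram"

    def profile_url(self, username):
        return f"https://instagram.com/{username}"


class Twitter(SocialPlatform):
    # Assumed URL pattern, shown only as an example of adding a platform later.
    name = "twitter"

    def profile_url(self, username):
        return f"https://twitter.com/{username}"


# Simple registry keyed by platform name.
PLATFORMS = {p.name: p for p in (Facebook(), LinkedIn(), Instagram(), Twitter())}
```

With this layout the automaton only has to produce a list of usernames, and each platform turns that list into URLs via `candidate_urls`.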
Hello!
I've not done much on IG. I was mostly exploring how I can effectively check whether a user exists, and I think I should be able to get that working by today.
I've read up on the Levenshtein automaton but haven't actually started coding. Finding a Python library will probably be the fastest way to get this going.
But for long strings (e.g. Facebook), generating every variant even at an edit distance of 1 might be too expensive. To keep the algorithm simple and avoid being blocked by social networks, I'm thinking of only substituting the vowels (a e i o u), as they are usually more susceptible to being exploited by fraudsters.
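A minimal sketch of the vowel-only idea, assuming lowercase keywords and exactly one vowel swapped per variant (the function name `vowel_variants` is hypothetical):

```python
VOWELS = "aeiou"


def vowel_variants(word):
    """Return the variants of word that replace exactly one vowel with another vowel."""
    variants = set()
    for i, ch in enumerate(word):
        if ch in VOWELS:
            for v in VOWELS:
                if v != ch:
                    # Swap the vowel at position i and keep the rest unchanged.
                    variants.add(word[:i] + v + word[i + 1:])
    return sorted(variants)
```

For "facebook" this produces 16 candidates (four vowels, four replacements each), compared with several hundred for a full one-edit enumeration over letters and digits.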
I'm still thinking about how to solve it.
I want to make sure openSquat doesn't violate any service policy:
"We prohibit crawling, scraping, caching or otherwise accessing any content on the Service via automated means, including but not limited to, user profiles and photos (except as may be the result of standard search engine protocols or technologies used by a search engine with Instagram's express consent)."
In this particular case, I don't think we are using "automated means", since the user will always have to run the queries manually.
@mateuszz0000
I have not been coding as I've been sick (serious shoulder problem), but I'm back! My Instagram code does not work anymore (website changes), so thankfully I had not pushed the changes to master.
Hello, I have a question about the roadmap issue regarding social media squatting. I could work on this, but I need some tips :) How can we detect social media squatting? The only thing that comes to my mind is to somehow detect usernames the same way we detect domains.