MishMash hackathon is India’s largest online diversity hackathon. The focus is to give you, regardless of your background, gender, sexual orientation, ethnicity, age, skill set, or viewpoint, an opportunity to showcase your talent. The hackathon is live from 6:00 PM, 23rd March to 11:55 PM, 1st April, 2020.
SCH3M3_SH3LL - Social Scraper - Machine Intelligence / Social Impact #58
The proposed idea is to identify child predators and cyber harassers who act with malicious intent on social media.
Today, suspect profiles are detected manually from child-grooming and cyber-harassment behavior patterns on social media platforms, which drains time and resources. To resolve this, a new automated system is employed to identify cyber predators/offenders using machine intelligence.
The system can analyze social media platforms such as Instagram, Twitter, Facebook, and LinkedIn, as well as other outlets, looking for the same suspect across them. If the suspect does not use the same user ID on different platforms, reverse image search is used to identify the suspect.
A set of user IDs (user_id) is used as the key to collect each user's personal information and post information (post ID, comments, timestamp, location, captions) from multiple social platforms using OSINT (Open Source INTelligence) and the Beautifulsoup Python package. The collected post data is then analyzed for malevolent content using machine learning and the Pandas Python library.
Based on the statistical analysis, suspects are categorized by their behavior (including polite harassment). Users whose suspect level is greater than the threshold value are scrutinized and monitored for further analysis. The suspected user's post media (image, audio, and video) is retrieved and analyzed using the IGPL Python package, Urllib, and artificial intelligence with an NSFW (Not Safe For Work) library to determine whether the user falls under the suspect/predator category.
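The threshold step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the team's implementation: the pitch does not say how the suspect level is computed or what the threshold is, so the `UserScore` class, the "fraction of flagged posts" score, and the `SUSPECT_THRESHOLD` value of 0.7 are all hypothetical. The pitch names Pandas for the analysis; the sketch uses only the standard library to stay self-contained.

```python
from dataclasses import dataclass

# Hypothetical cut-off; the pitch does not state the real threshold value.
SUSPECT_THRESHOLD = 0.7


@dataclass
class UserScore:
    """Per-user summary of the content analysis (hypothetical structure)."""
    user_id: str
    flagged_posts: int  # posts the content analysis marked as malevolent
    total_posts: int    # posts analyzed for this user

    @property
    def suspect_level(self) -> float:
        """Fraction of analyzed posts that were flagged (0.0 if none analyzed)."""
        if self.total_posts == 0:
            return 0.0
        return self.flagged_posts / self.total_posts


def users_to_review(scores, threshold=SUSPECT_THRESHOLD):
    """Return IDs of users whose suspect level is strictly greater than
    the threshold, ordered from highest to lowest suspect level."""
    over = [s for s in scores if s.suspect_level > threshold]
    over.sort(key=lambda s: s.suspect_level, reverse=True)
    return [s.user_id for s in over]


if __name__ == "__main__":
    sample = [
        UserScore("user_a", flagged_posts=8, total_posts=10),
        UserScore("user_b", flagged_posts=1, total_posts=20),
        UserScore("user_c", flagged_posts=9, total_posts=10),
    ]
    print(users_to_review(sample))
```

A score like this only selects accounts for closer scrutiny and monitoring, as the pitch describes; it is not a classification on its own.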
Finally, the child-grooming patterns followed and the statistical results generated are analyzed, and the person concerned is classified as a predator and reported to the law enforcement authorities.
🔦 Any other specific thing you want to highlight?
Web Crawler (scrapes all post details from various outlets)
CLI (Command Line Interface - easy to use)
Simultaneous Scraping (various platforms at the same time)
Image Recognition (scrapes media from an account and analyzes it)
Location (where the post was taken and where it was posted)
Hashtags and Mentions (hashtags and account mentions)
NSFW (checks whether the post is Not Safe For Work)
Efficient (gathers all possible accounts on various platforms)
Diverse (can handle more than one account at a time)
AutoMail (final suspect reports are mailed automatically)
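As an illustration of the "Hashtags and Mentions" item above, here is a minimal sketch of pulling hashtags and account mentions out of a post caption with regular expressions. The function name `extract_tags` and both patterns are assumptions made for this example; the project's actual parsing rules are not given, and real platforms each have their own rules for valid handles.

```python
import re

# Assumed patterns: a tag or handle must not be glued to a preceding word
# character, so e-mail addresses such as a@b.com are not treated as mentions.
HASHTAG_RE = re.compile(r"(?<!\w)#(\w+)")
MENTION_RE = re.compile(r"(?<!\w)@([A-Za-z0-9_.]+)")


def extract_tags(caption):
    """Return lower-cased hashtags and mentions found in a caption."""
    hashtags = [h.lower() for h in HASHTAG_RE.findall(caption)]
    # Drop trailing dots picked up from sentence punctuation ("@user.").
    mentions = [
        m.rstrip(".").lower()
        for m in MENTION_RE.findall(caption)
        if m.strip(".")
    ]
    return {"hashtags": hashtags, "mentions": mentions}


if __name__ == "__main__":
    print(extract_tags("Great day with @Some.User at the #Beach! #sunset"))
```

Lower-casing both lists makes it easier to count the same tag or account across posts and platforms.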
✅ Checklist
Before you post the issue:
[ :white_check_mark: ] You have followed the issue title format.
[ :white_check_mark: ] You have mentioned the correct labels.
[ :white_check_mark: ] You have provided all the information correctly.
Project information
Theme: Machine Learning / Social Impact
Project Name: Social Scraper
Short Project Description: Social Scraper is a Python tool for detecting child predators/cyber harassers on social media.
Team Name: SCH3M3_SH3LL
Team Members:
Demo Link:
Repository Link(s):
Presentation Link:
Raw File: