austinoboyle / scrape-linkedin-selenium

`scrape_linkedin` is a Python package that lets you scrape personal LinkedIn profiles and company pages, turning the data into structured JSON.
MIT License
447 stars 162 forks

Certifications return empty [] #95

Open anntdiv opened 3 years ago

anntdiv commented 3 years ago

Hello, thanks for the new update to the personal_info section. I found that the 'certifications' attribute returns an empty list [].
Test URL: https://www.linkedin.com/in/an-nguyen-9b3248122/
Results: {'personal_info': {'name': 'An Nguyen', 'headline': 'Data Scientist/Machine Learning Engineer', 'company': 'PERSOL PROCESS & TECHNOLOGY CO., LTD.', 'school': 'National Chiao Tung University', 'location': 'Vietnam', 'summary': '6 years working in various fields: AI, Machine Learning, Data Science, Web, Backend. Interested in science fiction and technology review.', 'image': 'https://media-exp1.licdn.com/dms/image/C5603AQEUAnRKaH7ryg/profile-displayphoto-shrink_400_400/0/1542605734842?e=1628726400&v=beta&t=bMiScpuXsmzGK_sveiEzVA0lK5SweSli2ktvsYy_eR0', 'followers': '1,612', 'email': 'an.thanh.nguyen.vn@gmail.com', 'phone': None, 'connected': None, 'websites': []}, 'experiences': {'jobs': [{'title': 'AI Technical Lead', 'company': 'PERSOL PROCESS & TECHNOLOGY CO., LTD.\n Full-time', 'date_range': 'Jul 2019 – Aug 2020', 'location': 'Ly Chin Thang, District 3', 'description': '- Face Recognition for Work Check-in- Profile and Job Recommendation for employees in Job Search Industry - Computer Vision tasks:+ Gender (male or female) Classification+ Age Prediction via photo+ Face Detection in image+ Big Five Personality Prediction via selfie photo (Openness - Conscientiousness -Extraversion - Agreeableness - Neuroticism)- Applicant Profile Ranking from textual similarity to recorded profiles- Automatic Scoring Applicant profile via self-introduction video by speech-to-text, speech synthesis, face micro-gestures, topics extracted in textual speech content.Use AWS (Amazon Web Service) SageMaker for Data Analysis and Deep Learning, Hand-on and Build NVIDIA GPU 2080Ti server Ubuntu 18.0 for Deep Learning projects\n \n\n\n\n see less', 'li_company_url': 'https://www.linkedin.com/company/persol-process-technology/'}, {'title': 'Machine Learning Engineer', 'company': 'Mainspring Technology', 
'date_range': 'Jun 2018 – Sep 2018', 'location': 'Jakarta, Indonesia', 'description': '- NLP system for categorizing news into groups like sport/nature/entertainment...- Employ Computer Vision in detect breast cancer in X-ray images', 'li_company_url': 'https://www.linkedin.com/company/mainspring-technology/'}, {'title': 'Senior Data Scientist', 'company': 'VNG Corporation', 'date_range': 'Dec 2017 – Mar 2018', 'location': 'Ho Chi Minh City', 'description': '- Object Detection: Model YOLOv2 (You Only Look Once) detecting people/car/table... in images/videos- Video Analysis: CV for classifying videos', 'li_company_url': 'https://www.linkedin.com/company/vng-corporation/'}, {'title': 'Machine Learning Freelancer', 'company': 'Freelance', 'date_range': 'Aug 2017 – Nov 2017', 'location': 'Ho Chi Minh City, Vietnam', 'description': '- Meaning Search: Search a service by meaning of input instead of keywords- Sentiment Classification: Classify customer reviews into Negative/Neutral/Positive. Develop at a Social Listening agency.', 'li_company_url': ''}, {'title': 'Data Scientist', 'company': 'Fpt Telecom', 'date_range': 'May 2016 – Mar 2017', 'location': 'Ho Chi Minh City, Vietnam', 'description': '- Churn Prediction: Predict whether Internet customers continue to subcribe next month.- Cyber-security: Machine Learning system to detect potentially malicious websites/domains. Data points is huge, growing both in size and complexity, especially in the era of IoT. 
Investigate graph-analytics, graph-database, distributed computation.', 'li_company_url': 'https://www.linkedin.com/company/fpt-telecom-hcm/'}, {'title': 'Programmer', 'company': 'CARDANO Lab', 'date_range': 'Apr 2016 – May 2016', 'location': 'Vietnam', 'description': '- R&D in Blockchain- RESTFul API supporting crypto wallet of Bitcoin, Litecoin, Dogecoin, Etherum', 'li_company_url': 'https://www.linkedin.com/company/cardanolab/'}, {'title': 'Graduate Student', 'company': 'National Chiao Tung University', 'date_range': 'Feb 2015 – Jan 2016', 'location': 'Hsinchu City, Taiwan', 'description': '- Daya Bay Project: a Data Analysis project for Neutrinos research. Data at the order of Terabytes from distant stars, nuclear plant,... are collected by underground sensors and shared via worldwide collaborations. Use C++/ROOT, run on NERSC.gov', 'li_company_url': 'https://www.linkedin.com/company/national-chiao-tung-university/'}, {'title': 'Software Engineer', 'company': 'TMA Solutions', 'date_range': 'Sep 2015 – Dec 2015', 'location': 'Ho Chi Minh City', 'description': '- Develop/maintain a web application to manage profile, account balance,... for shareholders. Participate in both Back-End and Front-End to integrate new dashboards, fix bugs, maintain system.', 'li_company_url': 'https://www.linkedin.com/company/tma-solutions/'}, {'title': 'Research Engineer', 'company': 'DFM Engineering', 'date_range': 'May 2013 – Apr 2014', 'location': 'Ho Chi Minh City, Vietnam', 'description': '- Develop/maintain softwares in Aero-dynamics simulating flow of air around 3D objects which can be used to design car, airplane, rocket, etc... 
Research papers, implement algorithms, optimize performance.', 'li_company_url': 'https://www.linkedin.com/company/dfm-engineering/'}], 'education': [{'name': 'National Chiao Tung University', 'degree': 'PhD Dropout', 'grades': None, 'field_of_study': 'Theoretical and Mathematical Physics', 'date_range': '2015 – 2016', 'activities': None}, {'name': 'Ho Chi Minh City University of Sciences', 'degree': 'Bachelor of Science', 'grades': None, 'field_of_study': 'Theoretical and Mathematical Physics', 'date_range': '2008 – 2012', 'activities': None}], 'volunteering': []}, 'skills': [{'name': 'Machine Learning', 'endorsements': '4'}, {'name': 'R&D', 'endorsements': '2'}, {'name': 'Full-Stack Development', 'endorsements': '2'}, {'name': 'Computer Vision', 'endorsements': 0}, {'name': 'Cybersecurity', 'endorsements': 0}, {'name': 'Cloud Computing', 'endorsements': 0}, {'name': 'Python', 'endorsements': 0}, {'name': 'JavaScript', 'endorsements': 0}, {'name': 'Mathematica', 'endorsements': 0}, {'name': 'Graph-Analytics', 'endorsements': 0}, {'name': 'Cryptocurrency', 'endorsements': 0}], 'accomplishments': {'publications': ['DNS graph mining for malicious domain detection. Hau Tran, An Nguyen, Phuong Vo, Tu Vu. IEEE 2017.'], 'certifications': [], 'patents': [], 'courses': [], 'projects': [], 'honors': [], 'test_scores': [], 'languages': ['English', 'Vietnamese'], 'organizations': []}, 'interests': ['NASA - National Aeronautics and Space Administration', 'Grab', 'LinkedIn', 'Gojek', 'Airy', 'Diversely.io'], 'recommendations': {'received': [], 'given': []}}
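
A quick way to see which accomplishment fields came back empty is sketched below. The helper name `empty_fields` is hypothetical and not part of `scrape_linkedin`; the dictionary is copied from the 'accomplishments' entry in the results above. An empty list alone does not show whether the profile has no entries or the scraper missed them, so this only narrows down which fields to compare against the live page.

```python
def empty_fields(section):
    """Return the sorted keys of `section` whose value is an empty list."""
    return sorted(key for key, value in section.items() if value == [])


# Copied from the 'accomplishments' entry in the results above.
accomplishments = {
    'publications': ['DNS graph mining for malicious domain detection. '
                     'Hau Tran, An Nguyen, Phuong Vo, Tu Vu. IEEE 2017.'],
    'certifications': [],
    'patents': [],
    'courses': [],
    'projects': [],
    'honors': [],
    'test_scores': [],
    'languages': ['English', 'Vietnamese'],
    'organizations': [],
}

# Print every field that was scraped as an empty list.
print(empty_fields(accomplishments))
```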

austinoboyle commented 3 years ago

Hi, unfortunately I do not have time to fix every new issue or change in LinkedIn's CSS structure.

You are welcome to submit a pull request that targets the new structure for certifications, which breaks them out into a separate section from the rest of "Accomplishments". The relevant code is in Profile.py: https://github.com/austinoboyle/scrape-linkedin-selenium/blob/e2b86433cc684d49bf9443b3cc5ae0037551abbb/scrape_linkedin/Profile.py#L172-L179

sahuel commented 2 years ago

I have submitted a pull request for this change. Only a simple class change is required on line 174, to '.pv-profile-section'.
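
To illustrate what a class-based selector such as '.pv-profile-section' matches, here is a minimal sketch that uses only the Python standard library. It is not the code from Profile.py: the names `SectionTextCollector` and `section_texts` are hypothetical, and the sample HTML is invented for the example rather than copied from LinkedIn's markup.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect depth tracking.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class SectionTextCollector(HTMLParser):
    """Collect the text inside every element carrying a given CSS class."""

    def __init__(self, css_class):
        super().__init__()
        self.css_class = css_class
        self.depth = 0       # greater than 0 while inside a matching element
        self.sections = []   # one list of text fragments per matching element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth:
            self.depth += 1              # nested tag inside a match
        elif self.css_class in classes:
            self.depth = 1               # entering a matching element
            self.sections.append([])

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth and text:
            self.sections[-1].append(text)


def section_texts(html, css_class):
    """Return the joined text of each element whose class list has css_class."""
    parser = SectionTextCollector(css_class)
    parser.feed(html)
    return [" ".join(parts) for parts in parser.sections]


# Hypothetical markup for illustration only.
SAMPLE_HTML = """
<section class="pv-profile-section certifications">
  <h3>Example Certificate</h3>
  <p>Example Issuer</p>
</section>
<section class="other-section"><h3>Ignored</h3></section>
"""

print(section_texts(SAMPLE_HTML, "pv-profile-section"))
```

If the class name on the live page changes again, only the string passed as `css_class` would need updating in a sketch like this, which matches the kind of one-line change described in this comment.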

RohitRana19 commented 1 year ago

I'm interested in working on this issue.