Closed DyuthiVivek closed 1 year ago
Hi @nikhil25803, I also used web scraping in my project, Course Differentiator, where I use Beautiful Soup to scrape data from course platforms such as Coding Ninjas, Coursera, etc. I can easily scrape the LinkedIn data. Please assign this task to me. I am a GSSOC'23 contributor.
Sure, go ahead @mvpfrever. Make a different module and class first as per the project structure. Do not add multiple methods at first. Start with your name and bio.
So @mvpfrever - create a separate class with a method that gets the name for a given account ID.
E.g., for the URL https://www.linkedin.com/in/nikhil25803/, my ID is nikhil25803. Create a method .get_name() to scrape the name.
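A minimal sketch of what is being asked for here, using only the standard library. The class name `LinkedIn`, the helper `extract_user_id`, and the injected `fetch` callable are hypothetical and not taken from the project. It also assumes the page title starts with the display name followed by " | ", which has not been verified against real LinkedIn markup.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


def extract_user_id(profile_url):
    """Return the ID segment of a profile URL shaped like .../in/<id>/."""
    parts = [p for p in urlparse(profile_url).path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "in":
        return parts[1]
    raise ValueError(f"not a profile URL: {profile_url!r}")


class _TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        if self._in_title:
            self.title += data


class LinkedIn:
    """Hypothetical class; the project structure may call for another name."""

    def __init__(self, user_id, fetch):
        self.user_id = user_id
        self.profile_url = f"https://www.linkedin.com/in/{user_id}/"
        # fetch is any callable mapping a URL to an HTML string, so the
        # HTTP client (requests, Selenium, ...) can be swapped freely.
        self._fetch = fetch

    def get_name(self):
        """Return the profile name, or None if no title text is found."""
        parser = _TitleParser()
        parser.feed(self._fetch(self.profile_url))
        # Assumption: the title looks like "<name> | LinkedIn".
        name = parser.title.split("|")[0].strip()
        return name or None
```

Passing the fetcher in keeps the parsing logic testable without touching the network; a later `get_bio()` method could reuse the same fetched HTML.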
And @mvpfrever - You work on the scraping of the Bio of the user.
You two can coordinate on this issue so that you do not each create a separate module.
@nikhil25803 sir, do I have to scrape both the name and the bio? One more thing: do we have to scrape the data of a particular organization's employees or of all LinkedIn users? There are so many users on LinkedIn.
Scrape the name only at first @mvpfrever. And we have to scrape the data of a particular user based on the user ID provided. I have shown you the example as well.
@nikhil25803 sir, scraping the name from the link provided by the user is done.
@nikhil25803 Since I have created this issue and have asked for it to be assigned to me first, can I have the first shot at it?
Sure @DyuthiVivek !! Get in touch with @mvpfrever and co-ordinate on this.
Sir, the name is scraped. What next?
@nikhil25803 I am working on scraping the bio, I will be done by the weekend.
Hi, I'm Babar Rasheed (GSSOC'23 contributor). Many websites don't offer an API, so we can use web scraping to access their data in a structured manner. Python libraries such as BeautifulSoup (bs4), Scrapy, and Selenium are generally used for web scraping. I'd like to apply these libraries and use multiprocessing to speed up the scraping. Multiprocessing helps when multiple URLs have to be scraped: the URLs are processed in parallel, which saves time.
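The multiprocessing idea could look roughly like the sketch below. It uses only the standard library; the function names `parse_title`, `scrape_one`, and `scrape_many` are made up for illustration, and the page title stands in for whatever data a real worker would extract.

```python
import re
from multiprocessing import Pool
from urllib.request import urlopen

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def parse_title(html):
    """Return the stripped <title> text, or None if there is no title."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else None


def scrape_one(url):
    """Fetch one URL; return its title, or the error that occurred."""
    try:
        with urlopen(url, timeout=10) as response:
            html = response.read().decode("utf-8", errors="replace")
        return {"url": url, "title": parse_title(html)}
    except (OSError, ValueError) as exc:
        # Network failures are OSError subclasses; malformed URLs raise
        # ValueError. Returning the error keeps one bad URL from
        # aborting the whole batch.
        return {"url": url, "error": str(exc)}


def scrape_many(urls, processes=4):
    """Scrape all URLs in parallel worker processes, preserving order."""
    with Pool(processes=processes) as pool:
        return pool.map(scrape_one, urls)
```

`scrape_many` should be called from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Since fetching pages is mostly waiting on the network, a thread pool may give a similar speed-up with less overhead.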
@DyuthiVivek is bio scraping done?
@mvpfrever I am working on it and will raise my pr as soon as I am done. If there is no dependency for you, you can raise your pr.
Guys @mvpfrever and @DyuthiVivek, any updates?
@nikhil25803 sir, I have already completed my part and am waiting for your call on what to do next.
Make a PR @mvpfrever
@nikhil25803 LinkedIn seems to be resisting scraping by showing a captcha or by requiring a verification code sent via e-mail. This breaks the scraping logic, which works for only some profiles. Any tips on how to avoid verification?
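This does not avoid the verification, but a scraper can at least detect it and fail with a clear error instead of parsing the wrong page. The sketch below is a heuristic only: the path prefixes and text markers are assumptions and would need to be checked against what LinkedIn actually returns, and all names are hypothetical.

```python
from urllib.parse import urlparse

# Assumed markers, not verified against live responses.
CHALLENGE_PATH_PREFIXES = ("/authwall", "/checkpoint", "/login")
CHALLENGE_TEXT_MARKERS = ("captcha", "verification code")


class VerificationRequired(Exception):
    """Raised when the response looks like a login or captcha page."""


def looks_like_challenge(final_url, html):
    """Guess whether the response is a challenge page, not a profile."""
    path = urlparse(final_url).path.lower()
    if path.startswith(CHALLENGE_PATH_PREFIXES):
        return True
    text = html.lower()
    return any(marker in text for marker in CHALLENGE_TEXT_MARKERS)


def ensure_profile_page(final_url, html):
    """Return html unchanged, or raise if it looks like a challenge."""
    if looks_like_challenge(final_url, html):
        raise VerificationRequired(f"verification requested at {final_url}")
    return html
```

Here `final_url` is the URL after any redirects, which most HTTP clients expose on the response object.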
@DyuthiVivek | Maybe that is the issue, I do not have any solution in mind for now. If possible, give it one more try, else close the issue for now.
@nikhil25803 closing the issue.
Hi @nikhil25803, I am a GSSOC'23 contributor
A feature that gets the LinkedIn information of a user could be added. Using web scraping, we can fetch information about a user's profile on LinkedIn, such as: bio, education, experiences, activity, connections, followers, etc.
Kindly assign me this issue.