linkedtales / scrapedin-linkedin-crawler

Crawler for LinkedIn full profiles 2019
Apache License 2.0

Scrapedin Linkedin Crawler

Crawls multiple public LinkedIn profiles, starting from a given set of initial profiles. Unlike other crawlers, Scrapedin Crawler currently works with the new 2019 website. Each crawled profile is scraped with scrapedin, a profile scraper library.

How to use

  1. Clone this repository
  2. Update config.json file with:
    • Your LinkedIn e-mail and password, or store them in the SCRAPEDIN_EMAIL and SCRAPEDIN_PASSWORD environment variables. I recommend not using your primary profile, since it may be blocked.
    • keywords (optional): words used to filter which profiles are crawled next
    • root profiles: the LinkedIn profile URLs the crawler starts from
  3. Ensure you have Node.js >= 7.6 on your machine
  4. npm install to install dependencies
  5. npm start to start crawler
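
The credential lookup described in step 2 could look roughly like the sketch below. It is not taken from the crawler's source: resolveCredentials is a hypothetical helper, the email and password keys in config.json are assumed names, and letting the environment variables win over config.json is also an assumption.

```javascript
// Hypothetical helper, not part of the project: shows one way to combine
// the SCRAPEDIN_* environment variables with values from config.json.
// Assumes config.json exposes `email` and `password` keys (assumed names).
function resolveCredentials(config, env = process.env) {
  // Assumption: environment variables take precedence over config.json.
  const email = env.SCRAPEDIN_EMAIL || config.email
  const password = env.SCRAPEDIN_PASSWORD || config.password

  if (!email || !password) {
    throw new Error(
      'Set SCRAPEDIN_EMAIL and SCRAPEDIN_PASSWORD, or fill in config.json'
    )
  }
  return { email, password }
}
```

Keeping the credentials in environment variables means they never have to be written to config.json, which helps if the file is committed or shared.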

Tips

The profiles are stored in the directory configured in config.json, one file per profile. If you want to do something else (such as saving to a database), just rewrite the src/saveProfile.js function.
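
As an illustration of swapping the storage step, here is a minimal sketch of a replacement for the function in src/saveProfile.js that appends every profile to a single JSON Lines file instead of writing one file per profile. The signature (profileId, profile, outputDir) and the profiles.jsonl file name are assumptions made for this example; check the real function before adapting it.

```javascript
const fs = require('fs')
const path = require('path')

// Hypothetical replacement for src/saveProfile.js.
// Assumed signature: (profileId, profile, outputDir); the real one may differ.
function saveProfile(profileId, profile, outputDir) {
  // Create the output directory if it is missing (single level only).
  if (!fs.existsSync(outputDir)) {
    fs.mkdirSync(outputDir)
  }

  // One JSON document per line, so the file can be processed as a stream.
  const file = path.join(outputDir, 'profiles.jsonl')
  const record = { id: profileId, savedAt: new Date().toISOString(), profile }
  fs.appendFileSync(file, JSON.stringify(record) + '\n')
  return file
}

module.exports = saveProfile
```

A database-backed version would keep the same shape: receive the scraped profile, then hand it to whatever client you use in place of the fs.appendFileSync call.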

Contributing

Please feel free to contribute to this project; just always open an issue before submitting a PR.

License

Apache 2.0