ericfourrier / scrape-linkedin

Scrape a public LinkedIn profile.
MIT License
152 stars, 51 forks

LinkedIn keeps blocking #4

Open ceased-ebc opened 6 years ago

ceased-ebc commented 6 years ago

Any recommendations for overcoming this? @ericfourrier

ceased-ebc commented 6 years ago

Any recommendations on this?

Also, how do I write the output to a file (JSON, HTML, PHP, or whatever)? I use -f and get nothing, and then I get blocked...

afrozhussain commented 6 years ago

You can write the output to JSON like this:

    import json

    from pylinkedin.scraper import LinkedinItem

    # URL, base_folder, and emp_id are assumed to be defined by the caller
    l = LinkedinItem(url=URL.strip())
    l_json = json.dumps(l.to_dict())
    out_filename = base_folder + str(emp_id) + '.json'
    # the with-block closes the file even if the write fails
    with open(out_filename, mode="w") as f:
        f.write(l_json)

ceased-ebc commented 6 years ago

Thanks

rochenka commented 5 years ago

I've also been blocked by LinkedIn. I managed to fix that by using https://proxycrawl.com/scraping-api-avoid-captchas-blocks, which acts as middleware that I could use with this library. It also has a Python package: https://github.com/proxycrawl/proxycrawl-python
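
For anyone wondering what "middleware" means in practice here, below is a minimal sketch of the general idea: instead of requesting the profile URL directly, you request a proxy service's endpoint and pass the profile URL as a query parameter. The endpoint, the `token` and `url` parameter names, and the `build_proxied_url` helper are all hypothetical placeholders, not the real API of any provider mentioned above; check your provider's documentation for the actual values.

```python
from urllib.parse import urlencode

# Hypothetical values: replace with the endpoint and token from your
# proxy provider's documentation. These are NOT a real service's API.
PROXY_API_ENDPOINT = "https://proxy-api.example.com/"
PROXY_API_TOKEN = "YOUR_TOKEN"


def build_proxied_url(target_url, endpoint=PROXY_API_ENDPOINT, token=PROXY_API_TOKEN):
    """Return a URL that asks the proxy service to fetch target_url for us.

    The parameter names "token" and "url" are assumptions for illustration.
    """
    # urlencode percent-encodes the target URL so it survives as a query value
    query = urlencode({"token": token, "url": target_url.strip()})
    return f"{endpoint}?{query}"


proxied = build_proxied_url("https://www.linkedin.com/in/some-profile")
print(proxied)
```

Whether the resulting URL can be passed straight to `LinkedinItem(url=...)`, or whether the HTML has to be fetched separately first, depends on the library and is not confirmed in this thread.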