ericfourrier / scrape-linkedin

Scrape a public LinkedIn profile.
MIT License
152 stars, 51 forks

AttributeError when trying to parse the html file #9

Open popovvasile opened 6 years ago

popovvasile commented 6 years ago

Traceback (most recent call last):
  File "/usr/local/bin/pylinkedin", line 11, in <module>
    load_entry_point('scrape-linkedin==0.1', 'console_scripts', 'pylinkedin')()
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pylinkedin/cli.py", line 37, in scrape
    pprint(linkedin_profile.to_dict())
  File "/usr/local/lib/python2.7/dist-packages/pylinkedin/scraper.py", line 569, in to_dict
    'name': self.name,
  File "/usr/local/lib/python2.7/dist-packages/pylinkedin/scraper.py", line 220, in name
    return extract_one(self.get_xp(self.xp_header, './/h1[@id="name"]/text()'))
  File "/usr/local/lib/python2.7/dist-packages/pylinkedin/scraper.py", line 162, in get_xp
    return clean(origin.xpath(path))
AttributeError: 'NoneType' object has no attribute 'xpath'

rafaelmbsouza commented 6 years ago

From what I've seen, LinkedIn does not always serve the public layout for a profile. If you retrieve profiles repeatedly, it will sometimes put you behind a login wall, so check whether that is happening to you. Tracing the error back, the root cause is that xp_header is None, because the header element could not be found through its XPath. Also, if the page that loads is a logged-in page, the profile markup looks quite different, so you'll have some difficulty retrieving the info you are looking for.
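
To make this failure easier to diagnose, one option is to check for a missing header before querying it, so the error names the likely cause instead of surfacing as an AttributeError. The following is only a minimal sketch, not the library's code: `ProfileNotPublicError` and `extract_name` are hypothetical names, `header` stands in for the scraper's xp_header, and the standard library's ElementTree is used in place of whatever parser pylinkedin relies on.

```python
import xml.etree.ElementTree as ET


class ProfileNotPublicError(Exception):
    """Hypothetical error: the expected public-profile markup is missing."""


def extract_name(header):
    """Return the profile name from a header element, or raise a clear error.

    `header` stands in for the scraper's xp_header, which the traceback
    shows can be None.
    """
    if header is None:
        raise ProfileNotPublicError(
            "profile header not found; the page may be a login wall "
            "or a logged-in layout rather than the public profile"
        )
    # ElementTree has no text() step, so find the element and read .text.
    h1 = header.find(".//h1[@id='name']")
    if h1 is None or not h1.text:
        return None
    return h1.text.strip()


if __name__ == "__main__":
    # Tiny stand-in document; a real profile page is far larger.
    page = ET.fromstring('<div><h1 id="name"> Jane Doe </h1></div>')
    print(extract_name(page))
    try:
        extract_name(None)
    except ProfileNotPublicError as exc:
        print("could not parse:", exc)
```

With a guard like this, a login-wall response produces a message pointing at the page layout, which is quicker to act on than a NoneType error deep inside get_xp.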