Closed: ASz-IT closed this issue 6 years ago
@ASzz This feature is published in 2.0.0
I haven't added output or input to json/csv yet, but it scrapes perfectly.
I also changed the name. Since this is not just a user scraper but a full LinkedIn scraper, I felt it was better to rename it from linkedin_user_scraper to linkedin_scraper. Version 2.0.0 is published under both linkedin_scraper and linkedin_user_scraper; however, new bug fixes and features will only go into linkedin_scraper.
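Since JSON/CSV output is not built in yet, here is a minimal sketch of what exporting scraped fields could look like. The `CompanyRecord` fields are assumptions for illustration, not the library's actual attributes:

```python
import csv
import json
from dataclasses import dataclass, asdict

# Hypothetical container for scraped fields; the real library's
# attributes may differ.
@dataclass
class CompanyRecord:
    name: str
    website: str
    headquarters: str

def to_json(record: CompanyRecord, path: str) -> None:
    """Write one company's fields to a JSON file."""
    with open(path, "w") as f:
        json.dump(asdict(record), f, indent=2)

def to_csv(records: list[CompanyRecord], path: str) -> None:
    """Write many companies to a CSV file, one row per company."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "website", "headquarters"])
        writer.writeheader()
        for r in records:
            writer.writerow(asdict(r))
```

Keeping the export step separate from the scraping step like this would let the same records feed either format.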
That's great news @joeyism! Thanks so much for your hard work :)
Unfortunately I still have an issue with logging in to LinkedIn (I'm from Poland; maybe here you're required to log in to see anything). I added a couple of lines of code and successfully logged in to the site, but after that I still have issues.
When I try to use your library for Person or Company, I always get errors like this:
Try doing this, in the exact order:

1. Run ipython
2. In ipython, run the following code (you can modify it if you need to specify your driver):
   from linkedin_scraper import Company
   company = Company("https://ca.linkedin.com/company/google", scrape=False)
3. Log in to LinkedIn
4. Log out of LinkedIn
5. In the same ipython session, run:
   company.scrape(close_on_complete=False)

Does that work, or does it throw an error?
@joeyism Hi Joey, I've got the same error. I tried your recommendation above (log in, log out, then run company.scrape(close_on_complete=False)), but ipython / jupyter launches a separate Chrome window, so LinkedIn asks me to sign in again, which results in the same:
Traceback (most recent call last):
File "C:\Users\...\Anaconda3\envs\scrapers\lib\site-packages\IPython\core\interactiveshell.py", line 2910, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-4-ce4450a2f698>", line 1, in <module>
company.scrape(close_on_complete=False)
File "C:\Users\...\Anaconda3\envs\scrapers\lib\site-packages\linkedin_scraper\company.py", line 80, in scrape
self.name = driver.find_element_by_class_name("name").text
File "C:\Users\...\AppData\Roaming\Python\Python36\site-packages\selenium\webdriver\remote\webdriver.py", line 555, in find_element_by_class_name
return self.find_element(by=By.CLASS_NAME, value=name)
File "C:\Users\...\AppData\Roaming\Python\Python36\site-packages\selenium\webdriver\remote\webdriver.py", line 955, in find_element
'value': value})['value']
File "C:\Users\...\AppData\Roaming\Python\Python36\site-packages\selenium\webdriver\remote\webdriver.py", line 312, in execute
self.error_handler.check_response(response)
File "C:\Users\...\AppData\Roaming\Python\Python36\site-packages\selenium\webdriver\remote\errorhandler.py", line 237, in check_response
raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"class name","selector":"name"}
(Session info: chrome=63.0.3239.132)
(Driver info: chromedriver=2.35.528161 (5b82f2d2aae0ca24b877009200ced9065a772e73),platform=Windows NT 10.0.16299 x86_64)
Hi @touringkg
It seems like LinkedIn has changed its policy, so even having a cookie isn't enough. They probably did this to prevent scraping. Let me investigate a solution.
@touringkg You can scrape without logging out. Try skipping the logout step.
Hi @joeyism
I'm fairly new to programming/Python (6 months in, taking classes at MIT while getting an MBA), but I got the scraper working. How difficult would it be to add the capability to pull all employee names? Essentially I'm looking to monitor the change in employees over time by keeping track of specific names. I'm trying to build it myself, but obviously you are much more adept, particularly when it comes to parsing the website data.
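Once the names can be pulled, monitoring change over time reduces to a set difference between two snapshots. A minimal sketch, assuming the names are available as plain strings (the scraping step itself is out of scope here):

```python
def diff_employees(previous: set[str], current: set[str]) -> dict[str, set[str]]:
    """Compare two snapshots of employee names and report who joined,
    who left, and who stayed."""
    return {
        "joined": current - previous,   # names new in the latest snapshot
        "left": previous - current,     # names no longer present
        "stayed": previous & current,   # names in both snapshots
    }

changes = diff_employees({"Ann Smith", "Bob Lee"}, {"Bob Lee", "Cara Diaz"})
# changes["joined"] == {"Cara Diaz"}, changes["left"] == {"Ann Smith"}
```

Saving each snapshot with a date and diffing consecutive ones would give the change history the comment describes.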
Hi @kpking7 Do you want this done while you are logged in, or while you are logged out?
Logged out if possible.
@kpking7 I don't actually know if logged out is possible. Logged in will take time to implement.
Well that's fine as well. I'd love to help build it but am less sure where to start.
@kpking7 Scraping employees is done automatically from version 2.2.0 on. If you update your linkedin_scraper, you should see it.
Thanks so much. Will give it a try in the next couple of days. Appreciate the work!
Hi, I would like to ask you to improve your library to collect information about companies. The list of companies should be read from an external CSV file. As output we should get info about each company; below is an image with the required sections marked:
If possible, it would also be great to get the list of employed people (names and surnames only). Output should be saved in JSON format, one file per company list. Is it possible? What do you think, @joeyism?
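The CSV-in/JSON-out workflow requested above can be sketched without the scraping step itself. The `scrape_company` function below is a stand-in for whatever the library would provide, and the CSV column name `url` is an assumed layout; everything else is standard-library I/O:

```python
import csv
import json

def scrape_company(url: str) -> dict:
    """Stand-in for the real scraping step; the field names here are
    assumptions, not the library's actual output."""
    return {"url": url, "name": None, "employees": []}

def companies_to_json(csv_path: str, json_path: str) -> int:
    """Read company URLs from a CSV with a 'url' column (an assumed
    layout) and write all scraped records to one JSON file per list.
    Returns the number of companies processed."""
    with open(csv_path, newline="") as f:
        records = [scrape_company(row["url"]) for row in csv.DictReader(f)]
    with open(json_path, "w") as out:
        json.dump(records, out, indent=2)
    return len(records)
```

Swapping `scrape_company` for a real call into the library would complete the pipeline the comment asks for.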