cullenwatson / StaffSpy

Staff scraper library for LinkedIn - obtain experiences, schools, skills & more
MIT License

not working in India #25

Closed vedantp2905 closed 1 month ago

vedantp2905 commented 1 month ago

for this code:

    from staffspy import scrape_staff
    from pathlib import Path

    # store the session next to this script
    session_file = Path(__file__).resolve().parent / "session.pkl"

    staff = scrape_staff(
        company_name="openai",
        search_term="software engineer",  # optional
        extra_profile_data=True,  # fetch all past experiences, schools, & skills
        max_results=50,  # can go up to 1000
        session_file=str(session_file),  # save login cookies to only log in once (lasts a week or so)
        log_level=1,
    )
    filename = "staff.csv"
    staff.to_csv(filename, index=False)

Getting error:

    Traceback (most recent call last):
      File "c:\Users\91917\Desktop\WeekendAI\Linkedin\app.py", line 5, in <module>
        staff = scrape_staff(
                ^^^^^^^^^^^^^
      File "C:\Users\91917\AppData\Local\Programs\Python\Python311\Lib\site-packages\staffspy\__init__.py", line 25, in scrape_staff
        staff = li.scrape_staff(
                ^^^^^^^^^^^^^^^^
      File "C:\Users\91917\AppData\Local\Programs\Python\Python311\Lib\site-packages\staffspy\linkedin\linkedin.py", line 186, in scrape_staff
      ib\site-packages\staffspy\linkedin\linkedin.py", line 55, in get_company_id
        raise Exception(
    Exception: ('Failed to find company openai', 500, '{"status":500}')

cullenwatson commented 1 month ago

works for me on multiple accounts. are you in a country other than the US?

cullenwatson commented 1 month ago

also try some other companies like google and see if you get the same error

vedantp2905 commented 1 month ago

Yup, I'm currently in India. Would a VPN help? I did try google, but I got the same error.

cullenwatson commented 1 month ago

ok, must be different endpoints. i'll look into it

cullenwatson commented 1 month ago

can you try again with the latest code? it now falls back to searching for the company if that lookup fails. and is your url https://in.linkedin.com when you use linkedin, or www.linkedin.com?

vedantp2905 commented 1 month ago

It's working now! I believe it is linkedin.com, but profiles take me to linkedin.com/in

Also, is there any way I can find all employees of a given company? Each call can fetch at most 1000 records, but OpenAI has around 3k employees. So how can I prevent duplicates while scraping data?

cullenwatson commented 1 month ago

try finding the top locations and roles of openai, e.g. san francisco, london, new york city. then for each location, try different search terms, e.g. accountant, software, finance. you just have to search exhaustively and be creative until you find them all.
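
A rough sketch of that approach, including the de-duplication asked about above: run one search per (location, search term) pair and keep each profile only once. Everything here is illustrative and not StaffSpy's own API. `collect_unique`, `fake_fetch`, and the `profile_id` field name are hypothetical, and `fetch` stands in for whatever function actually performs a single scrape.

```python
from itertools import product
from typing import Callable, Iterable


def collect_unique(
    fetch: Callable[[str, str], Iterable[dict]],
    locations: Iterable[str],
    search_terms: Iterable[str],
    key: str = "profile_id",  # assumed name of a unique per-profile field
) -> list[dict]:
    """Run fetch for every (location, term) pair and keep each profile once."""
    seen: set = set()
    unique: list[dict] = []
    for location, term in product(locations, search_terms):
        for record in fetch(location, term):
            profile_key = record.get(key)
            if profile_key is None or profile_key in seen:
                continue  # skip rows without a key and rows already collected
            seen.add(profile_key)
            unique.append(record)
    return unique


# Stand-in for a real scrape, used only to show overlap between searches.
def fake_fetch(location: str, term: str) -> list[dict]:
    data = {
        ("san francisco", "software"): [{"profile_id": "a1"}, {"profile_id": "b2"}],
        ("san francisco", "finance"): [{"profile_id": "b2"}, {"profile_id": "c3"}],
        ("london", "software"): [{"profile_id": "a1"}, {"profile_id": "d4"}],
    }
    return data.get((location, term), [])


staff = collect_unique(fake_fetch, ["san francisco", "london"], ["software", "finance"])
print([row["profile_id"] for row in staff])  # ['a1', 'b2', 'c3', 'd4']
```

With StaffSpy, `fetch` could wrap `scrape_staff` with the given search term and whatever location filter your installed version supports, converting the returned DataFrame with `to_dict("records")`. Check the DataFrame's columns for a field that uniquely identifies a profile before choosing `key`.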