spinlud / py-linkedin-jobs-scraper

MIT License

status 403 Forbidden #54

Closed stevemendis closed 1 year ago

stevemendis commented 1 year ago

I set LI_AT and ran the Python file, but I am running into this error:

INFO:li:scraper:('Using strategy AuthenticatedStrategy',)
INFO:li:scraper:('Starting new query', "Query(query=Engineer options=QueryOptions(limit=5 locations=['United States'] filters=QueryFilters(relevance=RelevanceFilters.RECENT time=TimeFilters.MONTH type=[<TypeFilters.FULL_TIME: 'F'>, <TypeFilters.INTERNSHIP: 'I'>] experience=[<ExperienceLevelFilters.INTERNSHIP: '1'>, <ExperienceLevelFilters.MID_SENIOR: '4'>] on_site_or_remote=[<OnSiteOrRemoteFilters.REMOTE: '2'>]) optimize=False apply_link=True skip_promoted_jobs=True))")
INFO:li:scraper:('Chrome debugger url', 'http://localhost:51165')
ERROR:li:scraper:('[Engineer][United States]', WebSocketBadStatusException('Handshake status 403 Forbidden'))

Traceback (most recent call last):
  File "/Users/stevemendis/opt/anaconda3/lib/python3.9/site-packages/linkedin_jobs_scraper/linkedin_scraper.py", line 282, in __run
    cdp.start()
  File "/Users/stevemendis/opt/anaconda3/lib/python3.9/site-packages/linkedin_jobs_scraper/chrome_cdp/cdp.py", line 110, in start
    self._ws = websocket.create_connection(
  File "/Users/stevemendis/opt/anaconda3/lib/python3.9/site-packages/websocket/_core.py", line 594, in create_connection
    websock.connect(url, *options)
  File "/Users/stevemendis/opt/anaconda3/lib/python3.9/site-packages/websocket/_core.py", line 253, in connect
    self.handshake_response = handshake(self.sock, addrs, **options)
  File "/Users/stevemendis/opt/anaconda3/lib/python3.9/site-packages/websocket/_handshake.py", line 79, in handshake
    status, resp = _get_resp_headers(sock)
  File "/Users/stevemendis/opt/anaconda3/lib/python3.9/site-packages/websocket/_handshake.py", line 165, in _get_resp_headers
    raise WebSocketBadStatusException("Handshake status %d %s", status, status_message, resp_headers)
websocket._exceptions.WebSocketBadStatusException: Handshake status 403 Forbidden

prisikarm commented 1 year ago

@stevemendis try adding:

from selenium import webdriver

options = webdriver.ChromeOptions()
options.add_argument("--remote-allow-origins=*")

Then, inside the scraper = LinkedinScraper(...) call, set chrome_options=options.

calvinomiguel commented 1 year ago

With the suggestion above (options.add_argument("--remote-allow-origins=*") passed as chrome_options to LinkedinScraper), the scraper runs, but not in headless mode. How do I enable headless mode?

calvinomiguel commented 1 year ago

Okay never mind, I added:

options.add_argument("--headless=true")
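Putting the two flags from this thread together, a minimal sketch might look like the following. The helper name apply_chrome_flags is hypothetical (it is not part of linkedin_jobs_scraper or Selenium); it only assumes an object with an add_argument method, such as Selenium's webdriver.ChromeOptions. The flag values are copied from the comments above, and the commented usage assumes LinkedinScraper accepts chrome_options as described in this thread.

```python
def apply_chrome_flags(options, headless=True):
    """Add the Chrome flags discussed in this thread to an options object.

    `options` is assumed to expose add_argument(str), as Selenium's
    webdriver.ChromeOptions does. This helper is illustrative only.
    """
    # Flag suggested above for the "Handshake status 403 Forbidden" error.
    options.add_argument("--remote-allow-origins=*")
    if headless:
        # Headless flag exactly as reported to work in this thread.
        options.add_argument("--headless=true")
    return options


# Assumed usage (requires selenium and linkedin_jobs_scraper to be installed):
#
#   from selenium import webdriver
#   from linkedin_jobs_scraper import LinkedinScraper
#
#   options = apply_chrome_flags(webdriver.ChromeOptions())
#   scraper = LinkedinScraper(chrome_options=options)
```

Keeping the flags in one place makes it easy to switch headless mode off while debugging by calling apply_chrome_flags(options, headless=False).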

spinlud commented 1 year ago

Added as a default option in 2.0.7: https://github.com/spinlud/py-linkedin-jobs-scraper/blob/master/linkedin_jobs_scraper/utils/chrome_driver.py#L39

Thanks @stevemendis! 🍻