spinlud / py-linkedin-jobs-scraper

"Timed out receiving message from renderer" #40

Open pogarek opened 1 year ago

pogarek commented 1 year ago

Hi,

Any idea how to avoid the "Timed out receiving message from renderer" error from Selenium while scraping with this AMAZING script?

Around 50% of downloads time out. When I set "Headless = False", I can see that the jobs are listed, but the tab still shows the loading spinner, as if something is preventing the page from finishing loading. I've tried Chrome 104 and 105, and also tried various ChromeDriver options; nothing helped.

Any ideas how to fix it?

[0828/124933.842:INFO:CONSOLE(15640)] "Unable to get user settings while calling loading container tag [object XMLHttpRequest]", source: https://static-exp1.licdn.com/sc/h/8jc8ql3b2opadbdqlfn9lirut (15640)
ERROR:li:scraper:('[my search query][EMEA]', TimeoutException('timeout: Timed out receiving message from renderer: 119.717\n  (Session info: headless chrome=105.0.5195.52)', None, None))
Traceback (most recent call last):
  File "C:\Users\xxxxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\linkedin_jobs_scraper\linkedin_scraper.py", line 286, in __run
    self._strategy.run(
  File "C:\Users\xxxxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\linkedin_jobs_scraper\strategies\authenticated_strategy.py", line 545, in run
    paginate_result = AuthenticatedStrategy.__paginate(driver, search_url, tag, offset)
  File "C:\Users\xxxxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\linkedin_jobs_scraper\strategies\authenticated_strategy.py", line 127, in __paginate
    driver.get(url)
  File "C:\Users\xxxxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\selenium\webdriver\remote\webdriver.py", line 333, in get
    self.execute(Command.GET, {'url': url})
  File "C:\Users\xxxxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\xxxxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: timeout: Timed out receiving message from renderer: 119.717
  (Session info: headless chrome=105.0.5195.52)

[0828/124933.863:INFO:CONSOLE(2216)] "[object Object]", source: https://static-exp1.licdn.com/sc/h/7z9jqmmzw6aba0xifibyci06h (2216)
[ON_ERROR] Message: timeout: Timed out receiving message from renderer: 119.717
  (Session info: headless chrome=105.0.5195.52)

Traceback (most recent call last):
  File "C:\Users\xxxxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\linkedin_jobs_scraper\linkedin_scraper.py", line 286, in __run
    self._strategy.run(
  File "C:\Users\xxxxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\linkedin_jobs_scraper\strategies\authenticated_strategy.py", line 545, in run
    paginate_result = AuthenticatedStrategy.__paginate(driver, search_url, tag, offset)
  File "C:\Users\xxxxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\linkedin_jobs_scraper\strategies\authenticated_strategy.py", line 127, in __paginate
    driver.get(url)
  File "C:\Users\xxxxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\selenium\webdriver\remote\webdriver.py", line 333, in get
    self.execute(Command.GET, {'url': url})
  File "C:\Users\xxxxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\xxxxxxx\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: timeout: Timed out receiving message from renderer: 119.717
  (Session info: headless chrome=105.0.5195.52)

[0828/124933.902:INFO:CONSOLE(0)] "Uncaught (in promise) AbortError: Aborted", source: https://www.linkedin.com/jobs/search/?currentJobId=YYYYYYYYYY&f_TPR=rMyTPR&keywords=mySearchCriteria&location=EMEA&sortBy=DD&start=50 (0)
[0828/124933.904:INFO:CONSOLE(0)] "Uncaught (in promise) AbortError: Aborted", source: https://www.linkedin.com/jobs/search/?currentJobId=YYYYYYYYYY&f_TPR=rMyTPR&keywords=mySearchCriteria&location=EMEA&sortBy=DD&start=50 (0)
[0828/124933.904:INFO:CONSOLE(0)] "Uncaught (in promise) AbortError: Aborted", source: https://www.linkedin.com/jobs/search/?currentJobId=YYYYYYYYYY&f_TPR=rMyTPR&keywords=mySearchCriteria&location=EMEA&sortBy=DD&start=50 (0)
[0828/124933.905:INFO:CONSOLE(0)] "Uncaught (in promise) AbortError: Aborted", source: https://www.linkedin.com/jobs/search/?currentJobId=YYYYYYYYYY&f_TPR=rMyTPR&keywords=mySearchCriteria&location=EMEA&sortBy=DD&start=50 (0)
WARNING:urllib3.connectionpool:Connection pool is full, discarding connection: 127.0.0.1. Connection pool size: 1
WARNING:li:scraper:('[my search query][EMEA]', 'Error in response', 'https://www.linkedin.com/jobs/search/?currentJobId=YYYYYYYYYY&f_TPR=rMyTPR&keywords=mySearchCriteria&location=EMEA&sortBy=DD&start=50', 'request_id=5624.2052 status=404 type=XHR mime_type=application/vnd.linkedin.normalized+json+2.1 url=https://www.linkedin.com/voyager/api/voyagerMessagingDashAwayStatus')
[ON_END]

The same issue appears on a freshly installed Ubuntu in WSL (Windows Subsystem for Linux).
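Since only about half of the page loads fail, one possible stopgap is to retry the failing call with a short backoff. The following is a minimal, untested-against-LinkedIn sketch of that idea using only the standard library. The helper name `call_with_retries`, its parameters, and the `flaky_page_load` stand-in are hypothetical and not part of py-linkedin-jobs-scraper or Selenium. It does not address why the renderer stops responding.

```python
import time


def call_with_retries(action, retry_on=(TimeoutError,), attempts=3,
                      base_delay=1.0, sleep=time.sleep):
    """Call action(); if it raises one of retry_on, wait and try again.

    Hypothetical helper for illustration. Re-raises the last exception
    once all attempts have been used.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_page_load():
        # Stand-in for a page load that times out on the first two tries.
        calls["n"] += 1
        if calls["n"] < 3:
            raise TimeoutError("Timed out receiving message from renderer")
        return "loaded"

    print(call_with_retries(flaky_page_load, base_delay=0.01))  # prints "loaded"
```

With Selenium, `retry_on` would be `(selenium.common.exceptions.TimeoutException,)`, the exception shown in the traceback above. Note that `driver.get(url)` is called inside the library (`authenticated_strategy.py`), so where such a wrapper could be hooked in is an open question here. Separately, the WebDriver `pageLoadStrategy` capability (`normal`, `eager`, `none`) controls how long `driver.get` waits for the page; whether it helps in this case has not been verified.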