omkarcloud / botasaurus

The All in One Framework to build Awesome Scrapers.
https://www.omkar.cloud/botasaurus/
MIT License

Fail screenshot saving #13

Closed: prikazchikof closed this issue 10 months ago

prikazchikof commented 10 months ago

Description

The Google Maps scraper fails when given many queries (around 10k or more).

Steps to Reproduce

The work machine runs Windows Server with 4 GB of RAM (enough for 16 threads in my testing).

  1. Load many queries (I'm loading just links to websites)
  2. Run and wait

Actual behavior:

Error:

Failed to save screenshot
Closing Browser
Traceback (most recent call last):
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\bose\base_task.py", line 192, in run_task
    close_driver(driver)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\bose\base_task.py", line 181, in close_driver
    driver.close()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\bose\bose_driver.py", line 335, in close
    return super().close()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 551, in close
    self.execute(Command.CLOSE)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 429, in execute
    self.error_handler.check_response(response)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 243, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: disconnected: Unable to receive message from renderer
  (failed to check if window was closed: disconnected: not connected to DevTools)
  (Session info: chrome=116.0.5845.141)
Stacktrace:
 GetHandleVerifier [0x005B37C3+48947]
 (No symbol) [0x00548551]
 (No symbol) [0x0044C92D]
 (No symbol) [0x0043E26E]
 (No symbol) [0x0043D09F]
 (No symbol) [0x0043D678]
 (No symbol) [0x0043C695]
 (No symbol) [0x00435811]
 (No symbol) [0x00435AC4]
 (No symbol) [0x0049D688]
 (No symbol) [0x00495053]
 (No symbol) [0x004716C7]
 (No symbol) [0x0047284D]
 GetHandleVerifier [0x007FFDF9+2458985]
 GetHandleVerifier [0x0084744F+2751423]
 GetHandleVerifier [0x00841361+2726609]
 GetHandleVerifier [0x00630680+560624]
 (No symbol) [0x0055238C]
 (No symbol) [0x0054E268]
 (No symbol) [0x0054E392]
 (No symbol) [0x005410B7]
 BaseThreadInitThunk [0x745962C4+36]
 RtlSubscribeWnfStateChangeNotification [0x77191B69+1081]
 RtlSubscribeWnfStateChangeNotification [0x77191B34+1028]
During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Admin\Desktop\google-maps-scraper-master\main.py", line 19, in <module>
    launch_tasks(*tasks_to_be_run)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\bose\launch_tasks.py", line 54, in launch_tasks
    current_output = task.begin_task(current_data, task_config)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\bose\base_task.py", line 219, in begin_task
    final = run_task(False, 0)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\bose\base_task.py", line 214, in run_task
    close_driver(driver)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\bose\base_task.py", line 181, in close_driver
    driver.close()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\bose\bose_driver.py", line 335, in close
    return super().close()
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 551, in close
    self.execute(Command.CLOSE)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\selenium\webdriver\remote\webdriver.py", line 429, in execute
    self.error_handler.check_response(response)
  File "C:\Users\Admin\AppData\Local\Programs\Python\Python311\Lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 243, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: disconnected: not connected to DevTools
  (failed to check if window was closed: disconnected: not connected to DevTools)
  (Session info: chrome=116.0.5845.141)
Stacktrace:
 GetHandleVerifier [0x005B37C3+48947]
 (No symbol) [0x00548551]
 (No symbol) [0x0044C92D]
 (No symbol) [0x0043D249]
 (No symbol) [0x0043D79A]
 (No symbol) [0x0043D738]
 (No symbol) [0x004326FD]
 (No symbol) [0x00432F8D]
 (No symbol) [0x0049D288]
 (No symbol) [0x00495053]
 (No symbol) [0x004716C7]
 (No symbol) [0x0047284D]
 GetHandleVerifier [0x007FFDF9+2458985]
 GetHandleVerifier [0x0084744F+2751423]
 GetHandleVerifier [0x00841361+2726609]
 GetHandleVerifier [0x00630680+560624]
 (No symbol) [0x0055238C]
 (No symbol) [0x0054E268]
 (No symbol) [0x0054E392]
 (No symbol) [0x005410B7]
 BaseThreadInitThunk [0x745962C4+36]
 RtlSubscribeWnfStateChangeNotification [0x77191B69+1081]
 RtlSubscribeWnfStateChangeNotification [0x77191B34+1028]
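
Both tracebacks above end in `driver.close()` raising `WebDriverException` because Chrome had already disconnected, and the second one escapes from the cleanup path and stops the whole run. A minimal sketch of a more tolerant cleanup step follows. The helper name `safe_close` is hypothetical and is not part of bose or Selenium; it only assumes the driver object exposes Selenium's `close()` and `quit()` methods.

```python
try:
    from selenium.common.exceptions import WebDriverException
except ImportError:
    # Fallback so the sketch still runs where Selenium is not installed.
    class WebDriverException(Exception):
        pass


def safe_close(driver) -> bool:
    """Close the window, then end the session, without ever raising
    WebDriverException. Returns True only if both steps succeeded.

    Hypothetical helper: bose's own close_driver may behave differently.
    """
    clean = True
    try:
        driver.close()
    except WebDriverException as exc:
        # The browser is probably gone already (e.g. "not connected to DevTools").
        clean = False
        print(f"close() failed: {type(exc).__name__}")
    try:
        # quit() ends the session so a dead browser does not leave it dangling.
        driver.quit()
    except WebDriverException as exc:
        clean = False
        print(f"quit() failed: {type(exc).__name__}")
    return clean
```

This only keeps a dead browser from crashing the cleanup step; it does not explain why Chrome disconnects in the first place. Other exception types (for example, connection errors if the driver process itself has exited) are deliberately not caught here.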

Reproduces how often:

Every time I run it, but only after 30 minutes or more of running.
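
Because the failure only shows up after 30 minutes or more, one possible way to limit the damage while the root cause is unknown is to process the queries in small batches and record which ones have finished, so a crash costs one batch rather than the whole run. The sketch below is a generic illustration, not something the scraper provides: `run_in_batches`, `scrape_batch`, and the checkpoint file are all hypothetical names.

```python
import json
from pathlib import Path
from typing import Callable, List, Set


def run_in_batches(
    queries: List[str],
    scrape_batch: Callable[[List[str]], object],  # hypothetical: scrapes one batch
    batch_size: int,
    checkpoint: Path,
) -> Set[str]:
    """Run scrape_batch over queries in chunks, skipping queries that an
    earlier run already recorded in the checkpoint file."""
    done: Set[str] = set()
    if checkpoint.exists():
        done = set(json.loads(checkpoint.read_text()))

    pending = [q for q in queries if q not in done]
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        try:
            scrape_batch(batch)
        except Exception as exc:
            # A failed batch is left out of the checkpoint so a later run retries it.
            print(f"Batch starting at index {start} failed: {exc!r}")
            continue
        done.update(batch)
        checkpoint.write_text(json.dumps(sorted(done)))
    return done
```

Running the same call again after a crash would then retry only the queries that are missing from the checkpoint file.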

Additional context

My log looks like this:

[7080:3484:0907/152340.096:ERROR:gles2_cmd_decoder_passthrough.cc(946)] ContextResult::kFatalFailure: fail_if_major_perf_caveat + swiftshader
[7080:3484:0907/152340.107:ERROR:gles2_cmd_decoder_passthrough.cc(946)] ContextResult::kFatalFailure: fail_if_major_perf_caveat + swiftshader
Done: V and B Le Mans
[7080:3484:0907/152342.742:ERROR:gles2_cmd_decoder_passthrough.cc(946)] ContextResult::kFatalFailure: fail_if_major_perf_caveat + swiftshader
[7080:3484:0907/152342.762:ERROR:gles2_cmd_decoder_passthrough.cc(946)] ContextResult::kFatalFailure: fail_if_major_perf_caveat + swiftshader
Done: V and B La Roche Nord
Filtered 5 links from 5.
View written JSON file at output/vandb-fr-in-france.json
View written CSV file at output/vandb-fr-in-france.csv
Closing Browser
Closed Browser
View Final Screenshot at tasks/1112/final.png
View written JSON file at output/all.json
Creating Driver with window_size=1920,1080 and user_agent=Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36

DevTools listening on ws://127.0.0.1:63583/devtools/browser/54835e3a-595b-4cce-8ce0-c9d1f0639475
Launched Browser
[6804:3312:0907/152354.671:ERROR:gles2_cmd_decoder_passthrough.cc(946)] ContextResult::kFatalFailure: fail_if_major_perf_caveat + swiftshader
[6804:3312:0907/152354.717:ERROR:gles2_cmd_decoder_passthrough.cc(946)] ContextResult::kFatalFailure: fail_if_major_perf_caveat + swiftshader
Fetched 5 links.
Creating Driver with window_size=1920,1080 and user_agent=Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/105.0.0.0 Safari/537.36

Chetan11-dev commented 10 months ago

Could you run it in Docker as described at https://github.com/omkarcloud/google-maps-scraper and let me know whether the error still occurs?