omkarcloud / botasaurus

The All in One Framework to build Awesome Scrapers.
https://www.omkar.cloud/botasaurus/
MIT License
1.16k stars, 104 forks

unknown error: session deleted because of page crash #55

Closed Voharin closed 1 month ago

Voharin commented 5 months ago

When I run the docker service on a real server and make a request, I get the following error. However, when I run it on my local computer, this error does not appear and I see that it works properly.

v12tj Waiting 10 seconds before connecting to Chrome...
v12tj 10.0.0.2 - - [05/Feb/2024 13:56:03] "GET /scrape?url=https://someurl.com HTTP/1.1" 500 -
v12tj INFO:werkzeug:10.0.0.2 - - [05/Feb/2024 13:56:03] "GET /scrape?url=https://semizotomotivburdur.sahibinden.com HTTP/1.1" 500 -
v12tj Traceback (most recent call last):
v12tj   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1488, in __call__
v12tj     return self.wsgi_app(environ, start_response)
v12tj   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1466, in wsgi_app
v12tj     response = self.handle_exception(e)
v12tj   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1463, in wsgi_app
v12tj     response = self.full_dispatch_request()
v12tj   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 872, in full_dispatch_request
v12tj     rv = self.handle_user_exception(e)
v12tj   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 870, in full_dispatch_request
v12tj     rv = self.dispatch_request()
v12tj   File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 855, in dispatch_request
v12tj     return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args) # type: ignore[no-any-return]
v12tj   File "/app/main.py", line 20, in scrape
v12tj     result = parser(dealer_url)
v12tj   File "/app/boto_scraper.py", line 91, in parser
v12tj     return scrape_dealer_page()
v12tj   File "/usr/local/lib/python3.9/site-packages/botasaurus/decorators.py", line 633, in wrapper_browser
v12tj     current_result = run_task(data_item, False, 0)
v12tj   File "/usr/local/lib/python3.9/site-packages/botasaurus/decorators.py", line 484, in run_task
v12tj     driver = create_driver(
v12tj   File "/usr/local/lib/python3.9/site-packages/botasaurus/create_stealth_driver.py", line 263, in run
v12tj     return do_create_stealth_driver(
v12tj   File "/usr/local/lib/python3.9/site-packages/botasaurus/create_stealth_driver.py", line 234, in do_create_stealth_driver
v12tj     bypass_detection(remote_driver, raise_exception)
v12tj   File "/usr/local/lib/python3.9/site-packages/botasaurus/create_stealth_driver.py", line 193, in bypass_detection
v12tj     wait_till_cloudflare_leaves(driver, previous_ray_id, raise_exception)
v12tj   File "/usr/local/lib/python3.9/site-packages/botasaurus/create_stealth_driver.py", line 109, in wait_till_cloudflare_leaves
v12tj     current_ray_id = get_rayid(driver)
v12tj   File "/usr/local/lib/python3.9/site-packages/botasaurus/create_stealth_driver.py", line 91, in get_rayid
v12tj     ray = driver.text(".ray-id code")
v12tj   File "/usr/local/lib/python3.9/site-packages/botasaurus/anti_detect_driver.py", line 172, in text
v12tj     return el.text
v12tj   File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webelement.py", line 84, in text
v12tj     return self._execute(Command.GET_ELEMENT_TEXT)['value']
v12tj   File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webelement.py", line 396, in _execute
v12tj     return self._parent.execute(command, params)
v12tj   File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/webdriver.py", line 429, in execute
v12tj     self.error_handler.check_response(response)
v12tj   File "/usr/local/lib/python3.9/site-packages/selenium/webdriver/remote/errorhandler.py", line 243, in check_response
v12tj     raise exception_class(message, screen, stacktrace)
v12tj selenium.common.exceptions.WebDriverException: Message: unknown error: session deleted because of page crash
v12tj from unknown error: cannot determine loading status
v12tj from tab crashed
v12tj   (Session info: chrome=120.0.6099.109)
v12tj Stacktrace:
v12tj #0 0x55ae79f57f83
v12tj #1 0x55ae79c10b2b
v12tj #2 0x55ae79bf816d
v12tj #3 0x55ae79bf7882
v12tj #4 0x55ae79bf6586
v12tj #5 0x55ae79bf644a
v12tj #6 0x55ae79bf47e1
v12tj #7 0x55ae79bf518a
v12tj #8 0x55ae79c0607c
v12tj #9 0x55ae79c1e7c1
v12tj #10 0x55ae79c246bb
v12tj #11 0x55ae79bf592d
v12tj #12 0x55ae79c1e459
v12tj #13 0x55ae79ca9204
v12tj #14 0x55ae79c89e53
v12tj #15 0x55ae79c51dd4
v12tj #16 0x55ae79c531de
v12tj #17 0x55ae79f1c531
v12tj #18 0x55ae79f20455
v12tj #19 0x55ae79f08f55
v12tj #20 0x55ae79f210ef
v12tj #21 0x55ae79eec99f
v12tj #22 0x55ae79f45008
v12tj #23 0x55ae79f451d7
v12tj #24 0x55ae79f57124
v12tj #25 0x7feac06e3044

// my docker-compose.yaml

version: "3"
services:
  botoscrape:
    restart: "no"
    container_name: botasaurus
    shm_size: 2gb
    build:
      dockerfile: Dockerfile
      context: .
    volumes:
      - ./output:/app/output
      - ./tasks:/app/tasks
      - ./profiles:/app/profiles
      - ./profiles.json:/app/profiles.json
      - ./local_storage.json:/app/local_storage.json
    ports:
      - "8191:9090"
    # command: python -u main.py
    command: ["python", "-u", "main.py"]
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:9090/"]
      interval: 1m
      timeout: 10s
      retries: 3

// Dockerfile

FROM chetan1111/botasaurus:latest
ENV PYTHONUNBUFFERED=1

COPY requirements.txt .

RUN python -m pip install -r requirements.txt

RUN mkdir app
WORKDIR /app
COPY . /app
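
The compose file above requests `shm_size: 2gb`. Chrome tab crashes in containers are often linked to a small `/dev/shm`, so one way to confirm that the setting actually reached the container is to measure that mount from inside it. This is a minimal sketch, assuming a Linux container; the `shm_report` helper and the 1 GiB threshold are illustrative and not part of botasaurus:

```python
import os
import shutil


def shm_report(path="/dev/shm", min_bytes=1 << 30):
    """Return (total_bytes, free_bytes, ok) for the filesystem holding `path`.

    `ok` is True when the total size is at least `min_bytes` (an
    illustrative threshold). Returns (0, 0, False) if `path` is missing.
    """
    if not os.path.isdir(path):
        return 0, 0, False
    usage = shutil.disk_usage(path)
    return usage.total, usage.free, usage.total >= min_bytes


if __name__ == "__main__":
    total, free, ok = shm_report()
    print(f"/dev/shm total={total / 2**20:.0f} MiB "
          f"free={free / 2**20:.0f} MiB ok={ok}")
```

Saved in the build context under a hypothetical name such as `shm_check.py`, it could be run inside the running service with `docker-compose exec botoscrape python shm_check.py`.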

Chetan11-dev commented 4 months ago

It should work in Docker. Check whether the crash is due to low memory by trying a bigger server. You can also test it in Docker as follows:

docker-compose build && docker-compose up

Voharin commented 4 months ago

I already tried with Docker on a server with no resource limits; see my docker-compose.yaml and Dockerfile. I noticed that everything works when running on an Intel chip, but not on an M1 or similar.
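
The Intel versus M1 difference reported above suggests checking which CPU architecture the container is actually running as. This thread does not establish the cause, so the following is only a diagnostic sketch; `describe_arch` is a hypothetical helper, not a botasaurus API:

```python
import platform

# Common spellings of the two architecture families.
_ARM = {"arm64", "aarch64"}
_X86 = {"x86_64", "amd64"}


def describe_arch(machine=None):
    """Classify a machine string as 'arm64', 'x86_64', or 'other'.

    With no argument, the current interpreter's platform.machine() is used.
    """
    name = (machine or platform.machine()).lower()
    if name in _ARM:
        return "arm64"
    if name in _X86:
        return "x86_64"
    return "other"


if __name__ == "__main__":
    print("architecture:", describe_arch())
```

Running the same script on the host and inside the container, then comparing the two outputs, shows whether the image runs under emulation, which may be worth ruling out.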

Chetan11-dev commented 4 months ago

Kindly run this command:

python -m pip install botasaurus --upgrade

Also, could you retry with https://github.com/omkarcloud/botasaurus-starter and share the output of:

docker-compose build && docker-compose up

leonidcan commented 3 months ago

It works in docker but CF detects and blocks :(

leonidcan commented 3 months ago

Kindly run this command:

python -m pip install botasaurus --upgrade

Also, could you retry with https://github.com/omkarcloud/botasaurus-starter and share the output of:

docker-compose build && docker-compose up

Attaching to botasaurus-starter-bot-1-1
Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error mounting "/run/desktop/mnt/host/c/Work/botasaurus-starter/db.sqlite3" to rootfs at "/app/db.sqlite3": mount /run/desktop/mnt/host/c/Work/botasaurus-starter/db.sqlite3:/app/db.sqlite3 (via /proc/self/fd/9), flags: 0x5000: not a directory: unknown: Are you trying to mount a directory onto a file (or vice-versa)? Check if the specified host path exists and is the expected type
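
The daemon error above says a directory is being mounted onto a file (or vice versa) for `db.sqlite3`. A common cause is that the host-side file is missing when the container starts, in which case Docker may create a directory at that path. This is a minimal pre-flight sketch, assuming the file mounts are listed by hand; `check_file_mounts` and the path list are illustrative:

```python
import os


def check_file_mounts(paths):
    """Return human-readable problems for host paths that are supposed
    to be regular files in a bind mount."""
    problems = []
    for path in paths:
        if os.path.isdir(path):
            problems.append(f"{path}: is a directory, expected a file")
        elif not os.path.isfile(path):
            problems.append(f"{path}: does not exist")
    return problems


if __name__ == "__main__":
    # Host-side path taken from the error message in this thread.
    for line in check_file_mounts(["db.sqlite3"]) or ["all file mounts look fine"]:
        print(line)
```

Run from the project directory before `docker-compose up`, it reports any path that would trigger this kind of mount mismatch.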

Chetan11-dev commented 3 months ago

Fixed. Kindly re-clone and run "docker-compose build && docker-compose up".