dgtlmoon / changedetection.io

The best and simplest free open source web page change detection, website watcher, restock monitor and notification service. Designed for simplicity: monitor which websites had a text change, for free. Also covers website defacement monitoring and price change notification.
https://changedetection.io
Apache License 2.0
17.3k stars, 965 forks

Playwright fails with Protocol error (Browser.getVersion): undefined #586

Status: Closed (ginkel closed this issue 2 years ago)

ginkel commented 2 years ago

Describe the bug

After setting up a sidecar playwright-chrome Docker container as documented in the sample docker-compose.yml, I am unable to get Playwright-based fetching to work. When processing a request, the following error is logged:

Protocol error (Browser.getVersion): undefined
=========================== logs ===========================
<ws connecting> ws://playwright-chrome:3000/playwright
<ws connected> ws://playwright-chrome:3000/playwright
============================================================

The playwright-chrome container shows the following log:

2022-05-07T14:47:49.270Z browserless:job HR4O6LIYMTQTGOYSXDYFS1U7ME6ALCZT: /playwright: Inbound WebSocket request.
2022-05-07T14:47:49.280Z browserless:hardware Checking overload status: CPU 13% Memory 41%
2022-05-07T14:47:49.282Z browserless:job HR4O6LIYMTQTGOYSXDYFS1U7ME6ALCZT: Adding new job to queue.
2022-05-07T14:47:49.283Z browserless:server Starting new job
2022-05-07T14:47:49.283Z browserless:system Generating fresh chrome browser
2022-05-07T14:47:49.284Z browserless:job HR4O6LIYMTQTGOYSXDYFS1U7ME6ALCZT: Getting browser.
2022-05-07T14:47:49.287Z browserless:chrome-helper Launching Chrome with args: {
  "args": [
    "--no-sandbox",
    "--enable-logging",
    "--v1=1",
    "--disable-dev-shm-usage",
    "--no-first-run",
    "--remote-debugging-port=36865"
  ],
  "blockAds": false,
  "dumpio": false,
  "headless": true,
  "stealth": false,
  "ignoreDefaultArgs": false,
  "ignoreHTTPSErrors": false,
  "pauseOnConnect": false,
  "playwright": true,
  "meta": {
    "protocol": null,
    "slashes": null,
    "auth": null,
    "host": null,
    "port": null,
    "hostname": null,
    "hash": null,
    "search": null,
    "query": {},
    "pathname": "/playwright",
    "path": "/playwright",
    "href": "/playwright"
  },
  "executablePath": "/usr/bin/google-chrome",
  "handleSIGINT": false,
  "handleSIGTERM": false,
  "handleSIGHUP": false
}
2022-05-07T14:47:50.020Z browserless:chrome-helper Chrome PID: 21
2022-05-07T14:47:50.021Z browserless:chrome-helper Finding prior pages
2022-05-07T14:47:50.022Z browserless:system Chrome launched 739ms
2022-05-07T14:47:50.022Z browserless:system Got chrome instance
2022-05-07T14:47:50.023Z browserless:job HR4O6LIYMTQTGOYSXDYFS1U7ME6ALCZT: Starting session.
2022-05-07T14:47:50.023Z browserless:job HR4O6LIYMTQTGOYSXDYFS1U7ME6ALCZT: Proxying request to /playwright route: ws://127.0.0.1:40167/e23d1b172e79eb5d36314f8ef9d8d913.
2022-05-07T14:47:50.052Z browserless:server HR4O6LIYMTQTGOYSXDYFS1U7ME6ALCZT: Recording successful stat and cleaning up.
2022-05-07T14:47:50.053Z browserless:job HR4O6LIYMTQTGOYSXDYFS1U7ME6ALCZT: Cleaning up job
2022-05-07T14:47:50.053Z browserless:job HR4O6LIYMTQTGOYSXDYFS1U7ME6ALCZT: Browser not needed, closing
2022-05-07T14:47:50.053Z browserless:chrome-helper Shutting down browser with close command
2022-05-07T14:47:50.053Z browserless:job HR4O6LIYMTQTGOYSXDYFS1U7ME6ALCZT: Browser cleanup complete.
2022-05-07T14:47:50.054Z browserless:server Current workload complete.
2022-05-07T14:47:50.054Z browserless:chrome-helper Sending SIGKILL signal to browser process 21
2022-05-07T14:47:50.065Z browserless:chrome-helper Removing temp data-dir /tmp/browserless-data-dir-adGsFN
2022-05-07T14:47:50.072Z browserless:chrome-helper Temp dir /tmp/browserless-data-dir-adGsFN removed successfully

changedetection.io logs:

ERROR:changedetectionio:Exception reached processing watch UUID: 4e8bd330-9b4c-49e3-bd48-a4e39c05e746 - Protocol error (Browser.getVersion): undefined
=========================== logs ===========================
<ws connecting> ws://playwright-chrome:3000/playwright
<ws connected> ws://playwright-chrome:3000/playwright
============================================================
WARNING:asyncio:Loop <_UnixSelectorEventLoop running=False closed=True debug=False> that handles pid 38 is closed

Docker setup:

CONTAINER ID   IMAGE                                                        COMMAND                  CREATED          STATUS                      PORTS                                                                                               NAMES
9938792ec7f9   browserless/chrome:latest                                    "./start.sh"             9 minutes ago    Up 9 minutes                3000/tcp                                                                                            playwright-chrome
5465cb5d9670   ghcr.io/dgtlmoon/changedetection.io:latest                   "python ./changedete…"   11 minutes ago   Up 11 minutes               5000/tcp                                                                                            changedetection

PLAYWRIGHT_DRIVER_URL=ws://playwright-chrome:3000/playwright

Version: v0.39.12 (ghcr.io/dgtlmoon/changedetection.io:latest)

To Reproduce

Steps to reproduce the behavior:

  1. Set up a new watch using Playwright Chromium/Javascript
  2. Wait for the fetch to happen and fail

Expected behavior

The watched page is successfully fetched.

dgtlmoon commented 2 years ago

Try just:

PLAYWRIGHT_DRIVER_URL=ws://playwright-chrome:3000

That is, drop the trailing /playwright path.
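For anyone scripting their environment setup, the suggestion above amounts to removing the path component from the WebSocket URL. A minimal sketch in Python, using only the standard library: the helper name `strip_ws_path` is hypothetical and not part of changedetection.io, and keeping the query string is an assumption made so that any connection options in the URL survive.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_ws_path(driver_url: str) -> str:
    """Return driver_url without its path (hypothetical helper).

    The scheme, host:port, and any query string are kept; the path
    and fragment are dropped.
    """
    parts = urlsplit(driver_url)
    return urlunsplit((parts.scheme, parts.netloc, "", parts.query, ""))


if __name__ == "__main__":
    # The value from the original report, with the /playwright suffix.
    print(strip_ws_path("ws://playwright-chrome:3000/playwright"))
    # prints: ws://playwright-chrome:3000
```

The result is the value to assign to PLAYWRIGHT_DRIVER_URL. Whether this resolves the error in a given setup is not confirmed by this sketch; it only mirrors the edit suggested in the reply.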