hoarder-app / hoarder

A self-hostable bookmark-everything app (links, notes and images) with AI-based automatic tagging and full text search
https://hoarder.app
GNU Affero General Public License v3.0

Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot): Unable to capture screenshot #186

Closed: scubanarc closed this issue 1 month ago

scubanarc commented 1 month ago

I followed the upgrade instructions to get version 0.14 (release) and edited my .env file to include:

CRAWLER_FULL_PAGE_SCREENSHOT=true

Crawls now fail. Here are the Docker logs from hoarder-workers-1 during a single crawl event:

2024-05-28T17:22:27.107Z info: [Crawler][47] Will crawl "https://www.simplyrecipes.com/recipes/carrot_top_pesto/" for link with id "v5h6o0jb40ba6z1st7kldvzx"
2024-05-28T17:22:27.643Z info: [Crawler][47] Successfully navigated to "https://www.simplyrecipes.com/recipes/carrot_top_pesto/". Waiting for the page to load ...
2024-05-28T17:22:29.000Z info: [Crawler][47] Finished waiting for the page to load.
2024-05-28T17:22:29.604Z error: [Crawler][47] Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot): Unable to capture screenshot
2024-05-28T17:22:30.656Z info: [Crawler][47] Will crawl "https://www.simplyrecipes.com/recipes/carrot_top_pesto/" for link with id "v5h6o0jb40ba6z1st7kldvzx"
2024-05-28T17:22:31.171Z info: [Crawler][47] Successfully navigated to "https://www.simplyrecipes.com/recipes/carrot_top_pesto/". Waiting for the page to load ...
2024-05-28T17:22:32.519Z info: [Crawler][47] Finished waiting for the page to load.
2024-05-28T17:22:33.116Z error: [Crawler][47] Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot): Unable to capture screenshot
2024-05-28T17:22:35.166Z info: [Crawler][47] Will crawl "https://www.simplyrecipes.com/recipes/carrot_top_pesto/" for link with id "v5h6o0jb40ba6z1st7kldvzx"
2024-05-28T17:22:35.673Z info: [Crawler][47] Successfully navigated to "https://www.simplyrecipes.com/recipes/carrot_top_pesto/". Waiting for the page to load ...
2024-05-28T17:22:37.067Z info: [Crawler][47] Finished waiting for the page to load.
2024-05-28T17:22:37.615Z info: [Crawler] The puppeteer browser got disconnected. Will attempt to launch it again.
2024-05-28T17:22:37.615Z info: [Crawler] Connecting to existing browser instance: http://chrome:9222
2024-05-28T17:22:37.616Z info: [Crawler] Successfully resolved IP address, new address: http://172.26.0.5:9222/
2024-05-28T17:22:37.618Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs
2024-05-28T17:22:42.617Z info: [Crawler] Connecting to existing browser instance: http://chrome:9222
2024-05-28T17:22:42.618Z info: [Crawler] Successfully resolved IP address, new address: http://172.26.0.5:9222/
2024-05-28T17:23:35.170Z error: [Crawler][47] Crawling job failed: Error: Timed-out after 60 secs
2024-05-28T17:23:39.199Z info: [Crawler][47] Will crawl "https://www.simplyrecipes.com/recipes/carrot_top_pesto/" for link with id "v5h6o0jb40ba6z1st7kldvzx"
2024-05-28T17:23:39.737Z info: [Crawler][47] Successfully navigated to "https://www.simplyrecipes.com/recipes/carrot_top_pesto/". Waiting for the page to load ...
2024-05-28T17:23:41.030Z info: [Crawler][47] Finished waiting for the page to load.
2024-05-28T17:23:41.600Z error: [Crawler][47] Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot): Unable to capture screenshot
2024-05-28T17:23:49.621Z info: [Crawler][47] Will crawl "https://www.simplyrecipes.com/recipes/carrot_top_pesto/" for link with id "v5h6o0jb40ba6z1st7kldvzx"
2024-05-28T17:23:50.122Z info: [Crawler][47] Successfully navigated to "https://www.simplyrecipes.com/recipes/carrot_top_pesto/". Waiting for the page to load ...
2024-05-28T17:23:51.426Z info: [Crawler][47] Finished waiting for the page to load.
2024-05-28T17:23:51.959Z error: [Crawler][47] Crawling job failed: ProtocolError: Protocol error (Page.captureScreenshot): Unable to capture screenshot
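
For anyone triaging a similar log, here is a rough Python sketch that tallies the error entries so the repeating failure stands out from the retries. It is not part of Hoarder: the function name `summarize_errors` is made up for illustration, and the regex assumes entries look like the ones above (an ISO timestamp, then `info:` or `error:`, then the message), whether they arrive one per line or run together.

```python
import re
from collections import Counter

# Assumed entry shape, based only on the log excerpt in this thread:
#   <ISO timestamp> <info|error>: <message>
# The message runs until the next timestamp or the end of the text.
TS = r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z"
ENTRY_RE = re.compile(
    rf"(?P<ts>{TS}) (?P<level>info|error): (?P<msg>.*?)(?=\s*{TS} |\s*$)",
    re.DOTALL,
)


def summarize_errors(log_text):
    """Return a Counter mapping each distinct error message to its count."""
    errors = Counter()
    for match in ENTRY_RE.finditer(log_text):
        if match.group("level") == "error":
            errors[match.group("msg").strip()] += 1
    return errors
```

Run over the excerpt above, this would group the repeated Page.captureScreenshot failures under one key, separate from the 60-second timeout and the failed browser reconnect.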

MohamedBassem commented 1 month ago

@scubanarc Did you also add the Chrome flag that's mentioned in the release notes? In my experiments, that was the fix for this problem.

scubanarc commented 1 month ago

Ah, I missed that one. I added it and my problems went away.

Thanks for the fast reply!

MohamedBassem commented 1 month ago

perfect!