Open geosmart opened 2 days ago
why did you change it from
MEILI_ADDR: http://meilisearch:7700
BROWSER_WEB_URL: http://chrome:9222
to your current setup?
I will deploy Meilisearch separately, so I changed it to an IP.
Now http://meilisearch:17700
is working, and http://192.168.68.100:17700
is also working.
But the Chrome crawler is not working: it gets the IP but cannot connect to the browser instance, and I don't know why.
hoarder-web-1 | 2024-10-21T18:57:58.011Z info: [Crawler] Connecting to existing browser instance: http://192.168.68.100:9222
hoarder-web-1 | 2024-10-21T18:57:58.012Z info: [Crawler] Successfully resolved IP address, new address: http://192.168.68.100:9222/
hoarder-web-1 | 2024-10-21T18:57:59.582Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs
hoarder-web-1 | 2024-10-21T18:58:04.583Z info: [Crawler] Connecting to existing browser instance: http://192.168.68.100:9222
hoarder-web-1 | 2024-10-21T18:58:04.583Z info: [Crawler] Successfully resolved IP address, new address: http://192.168.68.100:9222/
hoarder-web-1 | 2024-10-21T18:58:06.147Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs
hoarder-web-1 | 2024-10-21T18:58:11.148Z info: [Crawler] Connecting to existing browser instance: http://192.168.68.100:9222
hoarder-web-1 | 2024-10-21T18:58:11.148Z info: [Crawler] Successfully resolved IP address, new address: http://192.168.68.100:9222/
logger.info(
`[Crawler] Connecting to existing browser instance: ${serverConfig.crawler.browserWebUrl}`,
);
const webUrl = new URL(serverConfig.crawler.browserWebUrl);
// We need to resolve the ip address as a workaround for https://github.com/puppeteer/puppeteer/issues/2242
const { address: address } = await dns.promises.lookup(webUrl.hostname);
webUrl.hostname = address;
logger.info(
`[Crawler] Successfully resolved IP address, new address: ${webUrl.toString()}`,
);
// error here
return puppeteer.connect({
browserURL: webUrl.toString(),
defaultViewport,
});
Why can't puppeteer.connect connect to http://192.168.68.100:9222?
@MohamedBassem I found that my chrome container logs some errors:
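One way to narrow this down: when puppeteer.connect is given a browserURL, it first requests /json/version from that address to discover the WebSocket endpoint. Below is a minimal sketch of that same request, so it can be run from inside the web container on its own. The names versionEndpoint, probeBrowser and FetchLike are hypothetical helpers for illustration, not part of hoarder or puppeteer, and the fetch implementation is passed in as a parameter.

```typescript
// Minimal shape of the fetch function this sketch needs (assumption: Node 18+ global fetch fits it).
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Hypothetical helper: builds the DevTools discovery URL for a given browser web URL.
function versionEndpoint(browserWebUrl: string): string {
  return new URL("/json/version", browserWebUrl).toString();
}

// Hypothetical helper: asks the browser for its WebSocket debugger URL.
// Throws if the endpoint answers with an error status or without the expected field.
async function probeBrowser(
  browserWebUrl: string,
  fetchImpl: FetchLike,
): Promise<string> {
  const res = await fetchImpl(versionEndpoint(browserWebUrl));
  if (!res.ok) {
    throw new Error(`DevTools endpoint answered HTTP ${res.status}`);
  }
  const body = (await res.json()) as { webSocketDebuggerUrl?: string };
  if (!body.webSocketDebuggerUrl) {
    throw new Error("No webSocketDebuggerUrl in the /json/version response");
  }
  return body.webSocketDebuggerUrl;
}
```

Calling probeBrowser("http://192.168.68.100:9222", fetch) from the web container should either return a ws:// URL or fail with a more specific error (connection refused, timeout, HTTP status) than the crawler's retry message. If it fails there, one thing worth checking is whether port 9222 of the chrome container is actually published on the host, the way the Meilisearch port is.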
docker logs hoarder-chrome-1
[1022/034016.962076:ERROR:bus.cc(407)] Failed to connect to the bus: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
[1022/034017.253707:ERROR:bus.cc(407)] Failed to connect to the bus: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
[1022/034017.253862:ERROR:bus.cc(407)] Failed to connect to the bus: Failed to connect to socket /var/run/dbus/system_bus_socket: No such file or directory
[1022/034017.306863:WARNING:dns_config_service_linux.cc(427)] Failed to read DnsConfig.
[1022/034019.172004:INFO:policy_logger.cc(145)] :components/policy/core/common/config_dir_policy_loader.cc(118) Skipping mandatory platform policies because no policy file was found at: /etc/chromium/policies/managed
[1022/034019.172056:INFO:policy_logger.cc(145)] :components/policy/core/common/config_dir_policy_loader.cc(118) Skipping recommended platform policies because no policy file was found at: /etc/chromium/policies/recommended
[1022/034019.352144:WARNING:dns_config_service_linux.cc(427)] Failed to read DnsConfig.
DevTools listening on ws://0.0.0.0:9222/devtools/browser/dcf87fc8-ed86-4bcb-a020-cafe51606133
[1022/034019.440406:WARNING:bluez_dbus_manager.cc(248)] Floss manager not present, cannot set Floss enable/disable.
[1022/034020.315840:WARNING:sandbox_linux.cc(418)] InitializeSandbox() called with multiple threads in process gpu-process.
[1022/035519.442999:INFO:policy_logger.cc(145)] :components/policy/core/common/config_dir_policy_loader.cc(118) Skipping mandatory platform policies because no policy file was found at: /etc/chromium/policies/managed
Is this what makes hoarder fail to connect to the browser instance?
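For what it's worth, the "DevTools listening on ws://0.0.0.0:9222/..." line suggests Chrome did start and bind the debugging port despite the bus errors, which are commonly seen when Chrome runs in a container without a system D-Bus. A small sketch for pulling that line out of the container logs follows; findDevToolsEndpoint is a hypothetical helper name, and the check only tells you that Chrome is listening inside its container, not that the port is reachable from elsewhere.

```typescript
// Hypothetical helper: returns the DevTools WebSocket URL announced in Chrome's log output,
// or null if no "DevTools listening on ..." line is present.
function findDevToolsEndpoint(logText: string): string | null {
  const match = logText.match(/^DevTools listening on (ws:\/\/\S+)$/m);
  return match?.[1] ?? null;
}
```

Feeding it the output of docker logs hoarder-chrome-1 should return the ws:// address if Chrome came up; a null result would point at the browser itself rather than at networking.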
Describe the Bug
hoarder-web-1 | 2024-10-20T13:17:11.013Z info: Workers version: 0.18.0
hoarder-web-1 | 2024-10-20T13:17:11.032Z info: [Crawler] Connecting to existing browser instance: http://192.168.68.100:9222
hoarder-web-1 | 2024-10-20T13:17:11.033Z info: [Crawler] Successfully resolved IP address, new address: http://192.168.68.100:9222/
hoarder-web-1 | (node:140) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
hoarder-web-1 | (Use `node --trace-deprecation ...` to show where the warning was created)
hoarder-web-1 | 2024-10-20T13:17:12.714Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs
hoarder-web-1 | 2024-10-20T13:17:12.715Z info: Starting crawler worker ...
hoarder-web-1 | 2024-10-20T13:17:12.716Z info: Starting inference worker ...
hoarder-web-1 | 2024-10-20T13:17:12.716Z info: Starting search indexing worker ...
hoarder-web-1 | 2024-10-20T13:17:12.717Z info: Starting tidy assets worker ...
hoarder-web-1 | 2024-10-20T13:17:17.716Z info: [Crawler] Connecting to existing browser instance: http://192.168.68.100:9222
hoarder-web-1 | 2024-10-20T13:17:17.717Z info: [Crawler] Successfully resolved IP address, new address: http://192.168.68.100:9222/
hoarder-web-1 | 2024-10-20T13:17:19.378Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs
hoarder-web-1 | 2024-10-20T13:17:24.380Z info: [Crawler] Connecting to existing browser instance: http://192.168.68.100:9222
hoarder-web-1 | 2024-10-20T13:17:24.380Z info: [Crawler] Successfully resolved IP address, new address: http://192.168.68.100:9222/
hoarder-web-1 | 2024-10-20T13:17:25.947Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs
hoarder-web-1 | 2024-10-20T13:17:30.948Z info: [Crawler] Connecting to existing browser instance: http://192.168.68.100:9222
hoarder-web-1 | 2024-10-20T13:17:30.948Z info: [Crawler] Successfully resolved IP address, new address: http://192.168.68.100:9222/
hoarder-web-1 | 2024-10-20T13:17:32.615Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs
hoarder-web-1 | 2024-10-20T13:17:36.265Z info: [Crawler][4] Will crawl "https://docs.hoarder.app/configuration" for link with id "vrmnjh84tvaby79xbbsl6l1c"
hoarder-web-1 | 2024-10-20T13:17:36.265Z info: [Crawler][4] Attempting to determine the content-type for the url https://docs.hoarder.app/configuration
hoarder-web-1 | 2024-10-20T13:17:37.616Z info: [Crawler] Connecting to existing browser instance: http://192.168.68.100:9222
hoarder-web-1 | 2024-10-20T13:17:37.616Z info: [Crawler] Successfully resolved IP address, new address: http://192.168.68.100:9222/
hoarder-web-1 | 2024-10-20T13:17:39.223Z error: [Crawler] Failed to connect to the browser instance, will retry in 5 secs
Steps to Reproduce
.env
Expected Behaviour
The host IP 192.168.68.100 resolves fine inside the web container.
Why does the crawler log "[Crawler] Failed to connect to the browser instance, will retry in 5 secs"?
Screenshots or Additional Context
No response
Device Details
No response
Exact Hoarder Version
v0.18.0