aerokube / selenoid

Selenium Hub successor running browsers within containers. Scalable, immutable, self hosted Selenium-Grid on any platform with single binary.
https://aerokube.com/selenoid/latest/
Apache License 2.0

UnreachableBrowserException: Error communicating with the remote browser. It may have died #1221

Closed v3g3t4x closed 7 months ago

v3g3t4x commented 2 years ago

Hi, I am using Selenoid 1.10.3, Java client 3.14.0, and JVM 1.8. Tested on Chrome 62, 77, and 90.

Every day I run a hundred parallel tests without issues, driving them through RemoteWebDriver against a remote Selenoid grid. Sometimes my application raises this error: org.openqa.selenium.remote.UnreachableBrowserException: Error communicating with the remote browser. It may have died. Caused by: java.net.SocketException: Connection timed out (Read failed)

This issue occurs intermittently (not always) when I invoke this part of the code:
JavascriptExecutor javascript = (JavascriptExecutor) this.localDriver;
statusValue = (String) javascript.executeScript("return document.readyState");
log.info("Status javascript:" + statusValue);

It looks like a hang at the network level while executing javascript.executeScript("return document.readyState"). After 10 minutes the exception below is raised, and afterwards the test continues without problems because the container has not died. Meanwhile, during those 10 minutes, if I send a GET/POST request with Postman to Selenoid for that session, for example to retrieve a screenshot, it works fine.

FULL STACK:
org.openqa.selenium.remote.UnreachableBrowserException: Error communicating with the remote browser. It may have died.
Build info: version: '3.14.0', revision: 'aacccce0', time: '2018-08-02T20:19:58.91Z'
System info: host: 'atfprocessor-163-pqg9s', ip: '10.252.53.245', os.name: 'Linux', os.arch: 'amd64', os.version: '3.10.0-1062.1.2.el7.x86_64', java.version: '1.8.0_242'
Driver info: driver.version: RemoteWebDriver
Capabilities {acceptInsecureCerts: false, acceptSslCerts: false, applicationCacheEnabled: false, browserConnectionEnabled: false, browserName: chrome, chrome: {chromedriverVersion: 77.0.3865.40 (f484704e052e0..., userDataDir: /tmp/.com.google.Chrome.QvDEQY}, cssSelectorsEnabled: true, databaseEnabled: false, goog:chromeOptions: {debuggerAddress: localhost:38756}, handlesAlerts: true, hasTouchScreen: false, javascriptEnabled: true, locationContextEnabled: true, mobileEmulationEnabled: false, nativeEvents: true, networkConnectionEnabled: false, pageLoadStrategy: normal, platform: LINUX, platformName: LINUX, proxy: Proxy(), rotatable: false, setWindowRect: true, strictFileInteractability: false, takesHeapSnapshot: true, takesScreenshot: true, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unexpectedAlertBehaviour: ignore, unhandledPromptBehavior: ignore, version: 77.0.3865.75, webStorageEnabled: true}
Session ID: e2d9d9628c43d72234fa4ad6a0e95874
    at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:569) ~[selenium-remote-driver-3.14.0.jar:?]
    at org.openqa.selenium.remote.RemoteWebDriver.executeScript(RemoteWebDriver.java:485) ~[selenium-remote-driver-3.14.0.jar:?]

and after, as the latest exception:
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_242]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_242]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
Caused by: java.net.SocketException: Connection timed out (Read failed)
    at java.net.SocketInputStream.socketRead0(Native Method) ~[?:1.8.0_242]
    at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[?:1.8.0_242]
    at java.net.SocketInputStream.read(SocketInputStream.java:171) ~[?:1.8.0_242]
    at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_242]
    at org.apache.http.impl.io.SessionInputBufferImpl.streamRead(SessionInputBufferImpl.java:137) ~[httpcore-4.4.6.jar:4.4.6]
    at org.apache.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:153) ~[httpcore-4.4.6.jar:4.4.6]
    at org.apache.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:282) ~[httpcore-4.4.6.jar:4.4.6]
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:138) ~[httpclient-4.5.3.jar:4.5.3]
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:56) ~[httpclient-4.5.3.jar:4.5.3]
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:259) ~[httpcore-4.4.6.jar:4.4.6]
    at org.apache.http.impl.DefaultBHttpClientConnection.receiveResponseHeader(DefaultBHttpClientConnection.java:163) ~[httpcore-4.4.6.jar:4.4.6]
    at org.apache.http.impl.conn.CPoolProxy.receiveResponseHeader(CPoolProxy.java:165) ~[httpclient-4.5.3.jar:4.5.3]
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:273) ~[httpcore-4.4.6.jar:4.4.6]
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125) ~[httpcore-4.4.6.jar:4.4.6]
    at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:272) ~[httpclient-4.5.3.jar:4.5.3]
    at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:185) ~[httpclient-4.5.3.jar:4.5.3]
    at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89) ~[httpclient-4.5.3.jar:4.5.3]
    at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:111) ~[httpclient-4.5.3.jar:4.5.3]
    at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185) ~[httpclient-4.5.3.jar:4.5.3]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:72) ~[httpclient-4.5.3.jar:4.5.3]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56) ~[httpclient-4.5.3.jar:4.5.3]
    at org.openqa.selenium.remote.internal.ApacheHttpClient.fallBackExecute(ApacheHttpClient.java:155) ~[selenium-remote-driver-3.14.0.jar:?]
    at org.openqa.selenium.remote.internal.ApacheHttpClient.execute(ApacheHttpClient.java:97) ~[selenium-remote-driver-3.14.0.jar:?]
    at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:155) ~[selenium-remote-driver-3.14.0.jar:?]
    at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:548) ~[selenium-remote-driver-3.14.0.jar:?]
    ... 12 more
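
The trace shows the test thread blocked in a socket read until the connection timed out at the OS level. As a possible mitigation while the root cause is investigated, the call could be bounded on the client side. The following is a minimal sketch, not something taken from Selenoid or Selenium: the class `BoundedCall` and the method `callWithTimeout` are hypothetical names, and the helper only limits how long the caller waits. It does not explain or fix the hang itself.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper: bounds how long the test thread waits for one command. */
final class BoundedCall {

    private BoundedCall() {
    }

    /**
     * Runs the call on a worker thread and waits at most `limit` for its result.
     * Throws TimeoutException if the call has not returned in time.
     */
    static <T> T callWithTimeout(Callable<T> call, Duration limit)
            throws TimeoutException, ExecutionException, InterruptedException {
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = worker.submit(call);
        try {
            return future.get(limit.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Best effort only: a thread blocked in a socket read may ignore the interrupt.
            future.cancel(true);
            throw e;
        } finally {
            worker.shutdownNow();
        }
    }
}
```

With the snippet from this report, the call would look like `BoundedCall.callWithTimeout(() -> (String) javascript.executeScript("return document.readyState"), Duration.ofSeconds(60))`, where the 60-second limit is an arbitrary value chosen for illustration. On timeout the test can fail fast or retry instead of waiting 10 minutes, although the worker thread may stay blocked until the socket itself gives up.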

vania-pooh commented 2 years ago

@v3g3t4x probably you are just overloading your instance. Try to lower parallel execution limit.

v3g3t4x commented 2 years ago

> @v3g3t4x probably you are just overloading your instance. Try to lower parallel execution limit.

No, this happens with only 1 test, and it is the same with 8 in parallel. I can add one more piece of information: this seems to happen with one site only. What could cause this behavior in a website? Tests against all other websites work fine, but when I navigate this CRM site the hang happens. During the hang, if I invoke an execute script or a screenshot request with Postman through the API, on the same Selenoid and with the same session, Selenoid responds without issue while the test case stays hung.

vania-pooh commented 2 years ago

@v3g3t4x probably your browser is just crashing. Try to increase shmSize parameter in browsers.json. https://aerokube.com/selenoid/latest/#_other_optional_fields

v3g3t4x commented 2 years ago

But after the 10-minute hang, the test continues to work on the same Selenoid. The browser is up and running.

vania-pooh commented 2 years ago

@v3g3t4x anyway take a look at system metrics of your host machine. Probably disk is overloaded.

v3g3t4x commented 2 years ago

In that case, why does a single request sent via Postman get a response without issue?

vania-pooh commented 2 years ago

@v3g3t4x every Selenium session consists of dozens of such requests. Every action is a separate HTTP request.
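
To illustrate the point that each action is its own HTTP request, here is a rough sketch of what a single "execute script" command looks like on the wire, similar to what can be sent by hand from Postman. It assumes the W3C WebDriver endpoint `/session/{id}/execute/sync`; older JSON Wire Protocol sessions use `/session/{id}/execute` instead. The class `SessionRequest`, its method names, and any host name passed to it are hypothetical, and the explicit read timeout is an illustration rather than what the Selenium Java client does internally.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of one WebDriver command sent as a plain HTTP request. */
final class SessionRequest {

    private SessionRequest() {
    }

    /** URL of the W3C "execute script" command for an existing session. */
    static String executeScriptUrl(String hubUrl, String sessionId) {
        String base = hubUrl.endsWith("/") ? hubUrl.substring(0, hubUrl.length() - 1) : hubUrl;
        return base + "/session/" + sessionId + "/execute/sync";
    }

    /** JSON body for a script without arguments; escapes backslashes and quotes only. */
    static String executeScriptBody(String script) {
        String escaped = script.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"script\":\"" + escaped + "\",\"args\":[]}";
    }

    /** Sends the command with explicit timeouts and returns the HTTP status code. */
    static int send(String url, String body, int timeoutMillis) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(timeoutMillis);
        conn.setReadTimeout(timeoutMillis); // fail fast instead of waiting on a silent socket
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json; charset=utf-8");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }
        try {
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

A test run issues many such requests in sequence over the client's connection pool, so one manual request succeeding does not rule out a problem on a particular pooled connection or at a particular moment.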

v3g3t4x commented 2 years ago

> @v3g3t4x every Selenium session consists of dozens of such requests. Every action is a separate HTTP request.

I know. I restarted the server and all Selenoid nodes. The error still occurs... there is no resource issue.

vania-pooh commented 2 years ago

@v3g3t4x also make sure you have recent Docker version.

v3g3t4x commented 2 years ago

> @v3g3t4x also make sure you have recent Docker version.

But it works 99% of the time... I don't think it is related to a Docker issue.

vania-pooh commented 2 years ago

@v3g3t4x anyway we need more diagnostics from your side. Could also be a firewall issue where some resource is not loading completely.

github-actions[bot] commented 8 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.