Closed: benzman81 closed this issue 9 months ago
oh, forgot. We use aerokube/ggr:1.7.1 and aerokube/selenoid:1.11.0.
Hello, the message "client disconnected" means that a timeout occurred on the client side. A timeout may also occur on a balancer or nginx behind ggr. Please check and increase the timeouts. In big clusters there are usually many nginx servers placed behind ggr, and the misconfiguration can be on any one of them.
Alexander Andryashin.
Thu, 23 Nov 2023, 19:25 Markus Krüger @.***>:
oh, forgot. We use aerokube/ggr:1.7.1 and aerokube/selenoid:1.11.0.
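To illustrate the client-side timeout described above, here is a minimal sketch that uses only the JDK (no Selenium, ggr, or Selenoid). It starts a local server that answers slowly and sends a request with the JDK `HttpClient` and a per-request timeout. The class name `ReadTimeoutDemo`, the helper `timesOut`, and the delays are hypothetical and chosen for illustration; this is not how ggr or Selenium are implemented, only the general mechanism by which a client gives up and the server side then sees a disconnected client.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.net.http.HttpTimeoutException;
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ReadTimeoutDemo {

    /**
     * Starts a local server that waits serverDelayMillis before answering,
     * then sends one request that allows clientTimeoutMillis for the response.
     * Returns true if the client gave up with a timeout.
     */
    static boolean timesOut(long serverDelayMillis, long clientTimeoutMillis) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress("127.0.0.1", 0), 0);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        server.setExecutor(pool);
        server.createContext("/", exchange -> {
            try {
                // Simulate a slow upstream (for example a long-running command).
                Thread.sleep(serverDelayMillis);
                byte[] body = "ok".getBytes();
                exchange.sendResponseHeaders(200, body.length);
                exchange.getResponseBody().write(body);
            } catch (InterruptedException | IOException e) {
                // The client has already gone away or the demo is shutting down.
            } finally {
                exchange.close();
            }
        });
        server.start();
        try {
            URI uri = URI.create("http://127.0.0.1:" + server.getAddress().getPort() + "/");
            HttpRequest request = HttpRequest.newBuilder(uri)
                    .timeout(Duration.ofMillis(clientTimeoutMillis)) // client-side limit
                    .build();
            HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
            return false; // the response arrived in time
        } catch (HttpTimeoutException e) {
            return true; // the client stopped waiting before the server answered
        } finally {
            server.stop(0);
            pool.shutdownNow(); // interrupts a handler that is still sleeping
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("slow server, short timeout: timed out = " + timesOut(1500, 300));
        System.out.println("fast server, long timeout:  timed out = " + timesOut(100, 5000));
    }
}
```

In a real grid the same effect can come from any hop between the test and the browser, which is why every timeout along the path has to be at least as long as the slowest expected response.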
@aandryashin there are no load balancers or anything similar involved. I have increased the timeout now and we will see.
OK, so now we don't have the issue anymore, but we needed to set the readTimeout to 60 minutes for our tests. Without the new JDK HTTP client, or without using ggr, a readTimeout of 10 minutes was sufficient.
Hi, we use your great tools ggr and selenoid in a grid. We have now switched from selenium-java 4.8.2 to 4.15.0 and get a lot of these errors:
org.openqa.selenium.remote.UnreachableBrowserException: Error communicating with the remote browser. It may have died.
with the cause:
The log of our GGR shows:
And on the VMs running Selenoid:
The tests work up to selenium-java 4.13.0, which still uses netty as its HTTP client. The issue also occurs in 4.13.0 when you switch to the new JDK HTTP client via a system property. See here: https://www.selenium.dev/blog/2022/using-java11-httpclient/
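For readers who want to reproduce the switch on 4.13.0, here is a minimal sketch of the system property mentioned above. The property name `webdriver.http.factory` and the value `jdk-http-client` are taken from the linked Selenium blog post; the class and method names are hypothetical. It assumes the `selenium-http-jdk-client` artifact is on the classpath, and the property has to be set before the first driver is created.

```java
class JdkClientSwitch {

    // Property name and value as described in the linked Selenium blog post.
    static final String FACTORY_PROPERTY = "webdriver.http.factory";
    static final String JDK_CLIENT = "jdk-http-client";

    /** Selects the JDK HTTP client; call this before creating any driver. */
    static void useJdkHttpClient() {
        System.setProperty(FACTORY_PROPERTY, JDK_CLIENT);
    }

    public static void main(String[] args) {
        useJdkHttpClient();
        System.out.println(FACTORY_PROPERTY + "=" + System.getProperty(FACTORY_PROPERTY));
    }
}
```

The same setting can be passed on the command line as `-Dwebdriver.http.factory=jdk-http-client`, which avoids touching test code.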
I already looked at their code, and it seems to have something to do with readTimeout. We set it to 10 minutes, which was fine before and was only supposed to be a timeout for waiting in the queue of the grid. At least, that's what we documented in our git commit ;-) We set the timeout like this:
The error does not always happen, but it occurs about 50 times in roughly 6000 parallel tests. I have one test where I can reproduce it in 2 out of 4 runs. That's how I tracked it down to this readTimeout and to ggr. I suspect ggr and opened the bug here because, if we use the selenoid URL directly (it is usually behind ggr), the issue no longer occurs.
Sadly, I could not strip this down to a minimal code sample.
Any idea?