Closed kakliniew closed 1 month ago
@kakliniew, thank you for creating this issue. We will troubleshoot it as soon as we can.
Triage this issue by using labels.
If information is missing, add a helpful comment and then I-issue-template
label.
If the issue is a question, add the I-question
label.
If the issue is valid but there is no time to troubleshoot it, consider adding the help wanted
label.
If the issue requires changes or fixes from an external project (e.g., ChromeDriver, GeckoDriver, MSEdgeDriver, W3C),
add the applicable G-*
label, and it will provide the correct link and auto-close the
issue.
After troubleshooting the issue, please add the R-awaiting answer
label.
Thank you!
Hi, @kakliniew. Please follow the issue template, we need more information to reproduce the issue.
Either a complete code snippet and URL/HTML (if more than one file is needed, provide a GitHub repo and instructions to run the code), the specific versions used, or a more detailed description to help us understand the issue.
Note: If you cannot share your code and URL/HTML, any complete code snippet and URL/HTML that reproduces the issue is good enough.
Reply to this issue when all information is provided, thank you.
@diemol https://github.com/kakliniew/reproduce-selenium-edge-node here repo that allowed me to reproduced similar error. selenium-grid-argo.yaml - selenium-grid chart is the latest, only changed file is this argo-app with different values. I deployed this argo app in my minikube and added dns to ect/hosts. serenity-cucumber-starter - test repo - I used template project, only adjusted serenity.properties, serenity.version(build.gradle) and extended one test. You have to change serenity.properties (your selenium grid url) and run test using:
./gradlew test -D.cucumber.filter.tags=@green
During execution I receive the same error and session is stopped: Test is failed, but stacktrace seems different (but here tech stack is a little bit different too):
If you need more details, please let us know. I will try other browsers/versioons on Monday.
For the example project, the problem was solved by adding flags: Unfortunately for my base project, this did not help. I keep digging
webdriver {
capabilities {
"ms:edgeOptions" {
args = ["remote-allow-orgins=*", "test-type", "--remote-allow-origins=*","no-sandbox", "ignore-certificate-errors",
"incognito","disable-infobars", "disable-gpu", "disable-default-apps", "disable-popup-blocking", "disable-dev-shm-usage",
"--disable-dev-shm-usage", "disable-extensions", "disable-web-security", "disable-translate", "disable-logging"
]
}
}
}
Does the issue only happen on your Kubernetes environment?
I dont know. I'm stuck in a place. GOAWAY is not random - it always throws an error in the same place for a given test. I tried different browsers and still the same thing. Very strange that I have scenario outline and it differs only "user", and for some users it goes much further than for others.
I tried with serenity 4.0.15 (first version with selenium 4.14.1 so jdkhttp by default) and it fails. Changed to 4.0.14 (selenium 4.13.0) and everything is fine. Added org.seleniumhq.selenium:selenium-http-jdk-client:4.13.0 and System.setProperty("webdriver.http.factory", "jdk-http-client") and it fails again. So this problem only occurs when using jdkhttpclient. I will try to reproduce it in my example project.
Element: [[RemoteWebDriver: MicrosoftEdge on linux ] -> xpath: //button[normalize-space(text())='Submit']]
Session ID:
at app//org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:629)
at app//org.openqa.selenium.remote.RemoteWebElement.execute(RemoteWebElement.java:224)
at app//org.openqa.selenium.remote.RemoteWebElement.getTagName(RemoteWebElement.java:135)
at app//net.serenitybdd.core.pages.WebElementFacadeImpl.getTagName(WebElementFacadeImpl.java:1298)
Caused by:
java.io.UncheckedIOException: java.io.IOException: /: GOAWAY received
at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:431)
at org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
at org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:55)
at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute(JdkHttpClient.java:347)
Caused by:
java.io.IOException: /: GOAWAY received
at java.net.http/jdk.internal.net.http.Http2Connection.handleGoAway(Http2Connection.java:999)
at java.net.http/jdk.internal.net.http.Http2Connection.handleConnectionFrame(Http2Connection.java:867)
at java.net.http/jdk.internal.net.http.Http2Connection.processFrame(Http2Connection.java:738)
at java.net.http/jdk.internal.net.http.frame.FramesDecoder.decode(FramesDecoder.java:155)
at java.net.http/jdk.internal.net.http.Http2Connection$FramesController.processReceivedData(Http2Connection.java:232)
at java.net.http/jdk.internal.net.http.Http2Connection.asyncReceive(Http2Connection.java:663)
Very very strange... I'm still trying. We have old method "waitUntilPageLoaded()" which was sleeping for 2000ms and then
WebDriverWait wait = new WebDriverWait(getDriver(), Duration.ofSeconds(10));
wait.until((ExpectedCondition<Boolean>) driver -> !checkModalOpen() && !checkIfModalWindowIsPresent());
checkIfModalWindowIsPresent() is execution of JS scripts - for some reason twice the same script (using JavascriptExecutor). When I removed duplicated execution then results we worse. When I added addtional third execution then results were reversed (for users for whom it failed early steps, it fail in later steps, for users for whom it failed in later steps it began to fail early steps). I removed whole waitUntilPageLoaded() and now it fails in the beggining in fourth step for all the users (of course still GOAWAY).
And strange steps time doesnt addup to execution time (but maybe thats bug)
I tried to reproduce the problem in the new repo, but I am not able to.
Any ideas?
I found one function that, if commented out, makes the test go on. I don't know if whole test works yet, because it crashes on one assertion and I don't have the energy to fix it today, but when not commented out it crashes much earlier due to GOAWAY. When I look at the function, there is nothing special about it, I will analyse it tomorrow
I am wondering about the http2 related error message, the selenium server should not use http2.
You could try to set version http1.1 in the ClientConfig to see what happens and what error is raised.
The main element of the mentioned function is findElements(), and then loop through the elements and âgetAtrributeâ for each element. I think the problem are some kind of limits. https://stackoverflow.com/questions/55087292/how-to-handle-http-2-goaway-with-java-net-httpclient
I tried to initializing webdriver in following way:
EdgeOptions options = new EdgeOptions();
WebDriver driver;
try{
driver = new RemoteWebDriver(new URL("SELENIUM_HUB_URL"), options);
} catch (MalformedURLException e) {
throw new RuntimeException(e);
}
driver.manage().window().setPosition(new Point(0,0));
driver.manage().window().maximize();
driver.get("BASE_PAGE");
page.setDriver(driver);
Mentioned page is: https://github.com/serenity-bdd/serenity-core/blob/main/serenity-core/src/main/java/net/serenitybdd/core/pages/PageObject.java This initialization works but I get the GOAWAY error for some users (the difference is that some have additional steps after logging in - dealing with popup)
@joerg1985 when I try to initialize webdriver like:
EdgeOptions options = new EdgeOptions().addArguments("--remote-allow-origins=*");
WebDriver driver;
try{
ClientConfig config = ClientConfig
.defaultConfig()
.baseUrl(new URL("SELENIUM_GRID_URL")) // here grid url should be specified
.version(HttpClient.Version.HTTP_1_1.name());
driver = RemoteWebDriver.builder().oneOf(options).config(config)
.build();
} catch (MalformedURLException e) {
throw new RuntimeException(e);
}
Then I get:
org.openqa.selenium.remote.http.ConnectionFailedException: JdkWebSocket initial request execution error
Build info: version: '4.18.1', revision: 'b1d3319b48'
System info: os.name: 'Windows 11', os.arch: 'amd64', os.version: '10.0', java.version: '11.0.22'
Driver info: driver.version: unknown
at app//org.openqa.selenium.remote.http.jdk.JdkHttpClient.openSocket(JdkHttpClient.java:237)
at app//org.openqa.selenium.devtools.Connection.<init>(Connection.java:87)
at app//org.openqa.selenium.devtools.SeleniumCdpConnection.<init>(SeleniumCdpConnection.java:36)
at app//org.openqa.selenium.devtools.SeleniumCdpConnection.lambda$create$3(SeleniumCdpConnection.java:103)
Caused by:
java.net.http.HttpConnectTimeoutException: HTTP connect timed out
at java.net.http/jdk.internal.net.http.MultiExchange.toTimeoutException(MultiExchange.java:512)
at java.net.http/jdk.internal.net.http.MultiExchange.getExceptionalCF(MultiExchange.java:459)
at java.net.http/jdk.internal.net.http.MultiExchange.lambda$responseAsyncImpl$7(MultiExchange.java:386)
Caused by:
java.net.ConnectException: HTTP connect timed out
at java.net.http/jdk.internal.net.http.MultiExchange.toTimeoutException(MultiExchange.java:513)
... 10 more
I tried with HTTP_2 too, but same result. I couldnt reproduce this problem using serenity-cucumber-starter project, even when I used some command in loop 10k times.
Deployed new selenium grid with new helm chart and when I try to initialize the same way as above, then I get:
org.openqa.selenium.remote.http.ConnectionFailedException: JdkWebSocket initial request execution error
Build info: version: '4.18.1', revision: 'b1d3319b48'
System info: os.name: 'Windows 11', os.arch: 'amd64', os.version: '10.0', java.version: '11.0.22'
Driver info: driver.version: unknown
at app//org.openqa.selenium.remote.http.jdk.JdkHttpClient.openSocket(JdkHttpClient.java:250)
at app//org.openqa.selenium.devtools.Connection.<init>(Connection.java:87)
at app//org.openqa.selenium.devtools.SeleniumCdpConnection.<init>(SeleniumCdpConnection.java:36)
at app//org.openqa.selenium.devtools.SeleniumCdpConnection.lambda$create$3(SeleniumCdpConnection.java:103)
at java.base@11.0.22/java.util.Optional.map(Optional.java:265)
at app//org.openqa.selenium.devtools.SeleniumCdpConnection.create(SeleniumCdpConnection.java:103)
at app//org.openqa.selenium.devtools.SeleniumCdpConnection.create(SeleniumCdpConnection.java:49)
at app//org.openqa.selenium.devtools.DevToolsProvider.getImplementation(DevToolsProvider.java:50)
at app//org.openqa.selenium.devtools.DevToolsProvider.getImplementation(DevToolsProvider.java:29)
at app//org.openqa.selenium.remote.Augmenter.augment(Augmenter.java:207)
at app//org.openqa.selenium.remote.Augmenter.augment(Augmenter.java:178)
at app//org.openqa.selenium.remote.RemoteWebDriverBuilder.build(RemoteWebDriverBuilder.java:369)
at app//starter.StepDefinitions.setUp(StepDefinitions.java:53)
Caused by:
java.net.http.WebSocketHandshakeException
at java.net.http/jdk.internal.net.http.websocket.OpeningHandshake.resultFrom(OpeningHandshake.java:224)
at java.base/java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:1072)
at java.base/java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:506)
at java.base/java.util.concurrent.CompletableFuture.postFire(CompletableFuture.java:610)
at java.base/java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:840)
at java.base/java.util.concurrent.CompletableFuture$Completion.exec(CompletableFuture.java:479)
at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:183)
Caused by:
jdk.internal.net.http.websocket.CheckFailedException: Unexpected HTTP response status code 308
at java.net.http/jdk.internal.net.http.websocket.OpeningHandshake.checkFailed(OpeningHandshake.java:341)
at java.net.http/jdk.internal.net.http.websocket.OpeningHandshake.handleResponse(OpeningHandshake.java:250)
at java.net.http/jdk.internal.net.http.websocket.OpeningHandshake.resultFrom(OpeningHandshake.java:220)
... 10 more
I thought old helm chart might be a problem, because in my local cluster when I downgraded then I got this HTTP Connect timed out. I received strange warning in selenium-hub (Edit: that was 4.15.0, 4.22.0 is fine):
WARNING: Extension file or directory does not exist: /opt/selenium/selenium-http-jdk-client.jar
Enabled debug logs for selenium hub/selenium edge node and during 3 tests I found 1500x
DEBUG [RequestConverter.channelRead0] - Incoming message: DefaultHttpRequest(decodeResult: success, version: HTTP/1.1)
POST /se/grid/node/session HTTP/1.1
Connection: Upgrade, HTTP2-Settings
Content-Length: 252
Host:
HTTP2-Settings:
Upgrade: h2c
traceparent:
User-Agent: selenium/4.22.0 (java unix)
The same debug info is available even when I run tests using selenium 4.12.0
I'm still unable to create webdriver with clientconfig version http 1_1.
After optimizing some functions (reduce the number of http operations) test is passing about 6 out of 7 times.
This makes me think that the source of the problem is some nginx limits or something.
https://www.reddit.com/r/java/comments/1986h01/comment/ki5gczk
I was able to resolve the problem:
Caused by:
jdk.internal.net.http.websocket.CheckFailedException: Unexpected HTTP response status code 308
By just removing TLS host for selenium-grid-ingress. After that change I was able to initialize driver with HTTP_1_1, and tests pass. It solves the effect but not the root cause I guess.
@VietND96 have you seen this?
It would be nice to have flag added like:
selenium.httpclient.version=HTTP_1_1
HTTP2 is by default according to docs
[HttpClient.Builder] version
Requests a specific HTTP protocol version where possible.
If this method is not invoked prior to building, then newly built clients will prefer [HTTP/2].
It seems that this problem with jdkhttpclient HTTP/2, Nginx GOAWAY is widely known and you need to handle it or use HTTP/1.1 (as described in the source reddit, but also in other places).
Maybe a system property is not a bad idea.
As my understanding, this case has the ingress proxy involves
The error
Caused by: java.net.http.WebSocketHandshakeException ... Caused by: jdk.internal.net.http.websocket.CheckFailedException: Unexpected HTTP response status code 308
and you mentioned that
By just removing TLS host for selenium-grid-ingress. After that change I was able to initialize driver with HTTP_1_1, and tests pass.`
I think this relates to Server-side HTTPS enforcement through redirect - By default the controller redirects (308) to HTTPS if TLS is enabled for that ingress.
It means the TLS host for selenium-grid-ingress
only set when the grid server back-end is enabled HTTPS, otherwise the redirect will be incorrect. In helm chart README, I also tried to input something for secure connection in K8s, I think if we configure the proxy correctly, it will work
In ingress-nginx docs, they also mentioned Proxy HTTP version that the Nginx reverse proxy will use to communicate with the backend. By default this is set to "1.1". So without redirect issue, HTTP version doesn't matter
@VietND96 thanks for your answer. I wanted to set it up, but faced some problems. I will try again and let you know
@VietND96 I configured TLS. Hub/Nodes are up and running. I used external secret. During test execution I received once again GOAWAY and there is a message in edge node's logs:
16:16:53.229 WARN [ProxyNodeWebsockets$ForwardingListener.onError] - Error proxying websocket commandjava.net.SocketException: Connection reset
at java.base/sun.nio.ch.SocketChannelImpl.throwConnectionReset(Unknown Source)
at java.base/sun.nio.ch.SocketChannelImpl.read(Unknown Source)
at java.net.http/jdk.internal.net.http.SocketTube.readAvailable(Unknown Source)
at java.net.http/jdk.internal.net.http.SocketTube$InternalReadPublisher$InternalReadSubscription.read(Unknown Source)
at java.net.http/jdk.internal.net.http.SocketTube$SocketFlowTask.run(Unknown Source)
at java.net.http/jdk.internal.net.http.common.SequentialScheduler$SchedulableTask.run(Unknown Source)
at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(Unknown Source)
at java.net.http/jdk.internal.net.http.common.SequentialScheduler.runOrSchedule(Unknown Source)
at java.net.http/jdk.internal.net.http.SocketTube$InternalReadPublisher$InternalReadSubscription.signalReadable(Unknown Source)
at java.net.http/jdk.internal.net.http.SocketTube$InternalReadPublisher$ReadEvent.signalEvent(Unknown Source)
at java.net.http/jdk.internal.net.http.SocketTube$SocketFlowEvent.handle(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl$SelectorManager.handleEvent(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl$SelectorManager.lambda$run$3(Unknown Source)
at java.base/java.util.ArrayList.forEach(Unknown Source)
at java.net.http/jdk.internal.net.http.HttpClientImpl$SelectorManager.run(Unknown Source)
Initialized driver with HTTP1_1 works fine.
Yes, initialized the driver with HTTP1_1 in client config as I know it was supported via https://github.com/SeleniumHQ/selenium/pull/12919
Besides that, I also tried to find some interesting info for your reference.
In OpenJDK docs Introduction to the Java HTTP Client, I saw these statements
By default the client will send requests using HTTP/2. Requests sent to servers that do not yet support HTTP/2 will automatically be downgraded to HTTP/1.1
As I saw above, a comment also mentioned that the grid server side is not using HTTP/2, and suppose the client side is sending HTTP/2. That statement describes the correct behavior it should be.
I also found https://stackoverflow.com/questions/67018337/is-there-any-way-to-handle-http-2-goaway-received-ioexception-in-httpclient-java
Here, I saw the workaround set http version via client was mentioned. However, also have a point mentioned that I can understand something like GOAWAY
is an IOException
thrown by client HTTP/2, and when switching to client HTTP/1.1, it could go to a situation that would continue to get "random" IOExceptions
with a different message from the server using HTTP 1.1
Next, I found this https://stackoverflow.com/questions/55087292/how-to-handle-http-2-goaway-with-java-net-httpclient with these statements
I tried to reproduce this problem with a local nginx server with the default HTTP/2 configuration ... And after approximately 1000 requests, I get a GOAWAY error as expected:
I feel this is close to your setup, ingress nginx involves. I tried to search config key http2_max_requests
and saw this docs.
In recent versions of nginx, it changed to keepalive_requests with default is 1000.
If ingress nginx deployed via helm chart, you can refer that chart values, config key controller.config
with list objects can be set. So, can you try this also?
I tried setting these limits (keepalive_requests) for the local cluster (minikube) and started my local tests, but it didn't help - I couldn't reproduce the error. I checked the nginx-controller logs locally and in my âembeddedâ cluster today. The logs show that locally the communication is using http1.1 (even when I try to initialise the driver with version 2.0). On the other hand, in the âembeddedâ cluster the communication is with 2.0 (therefore it makes sense that this limit is reached and the goaway is thrown).
"POST /session/<session_id>/element HTTP/2.0" 200 28 "-" "selenium/4.18.1 (java windows)" <- <Embedded>
vs
"POST /session/<session_id>/element HTTP/1.1" 200 126 "-" "selenium/4.18.1 (java windows)" <- <local>
I can't really modify the nginx configuration in the âembedded clusterâ, because it handles not only selenium grid and I can cause problems. In my local cluster I have not configured http2 for nginx, and its by default off, hence the problem in reproducing.
HttpClient.Builder versionâ(HttpClient.Version version)
If this method is not invoked prior to building, then newly built clients will prefer HTTP/2.
JdkHttpClient by default sets http2, hence my problems when I don't explicitly specify it to use http1.1. Therefore system property would be a good solution.
selenium.httpclient.version=HTTP_1_1
@VietND96 today I accidentally discovered that if I use the url for selenium grid, but protocol http://
instead of https://
, the communication is via http1.1 and GOAWAY does not occur.
This makes a lot of sense, but I didn't think of it before.
Not sure my understanding is correct, based on your setup mentioned before with HTTPS (it requires TLS version as well as HTTP version, which is based on the ingress controller). Probably, your setup exposes http://
with HTTP/1.1 and https://
with TLS+HTTP/2
With ingress use HTTP/2, I think these configs upstream-keepalive-...
here needs to be consider.
For example upstream-keepalive-connections
- When this number is exceeded, the least recently used connections are closed. default: 320. With this description, it seems the message GOAWAY close to this.
JDKHttpClient also mentioned something if H2 (needs to try and see how it helps)
The JDK 11 ConnectionPool checks the
jdk.httpclient.keepalive.timeout
system property when initializing and defaults to 1200 seconds (20 minutes). Since 20 minutes is a long time in modern architecture, JDK20, build 26 reduces the default to 30 seconds. It also allow setting keepalive system property via a JVM argument e.g-Djdk.httpclient.keepalive.timeout=2
Since JDK20, build 25, some additional system properties were introduced specifically for H2 pools:jdk.httpclient.keepalivetimeout.h2
â set this property to control the keepalive timeout for H2 connectionsjdk.httpclient.maxstreams
â set this property to control the maximum number of H2 streams that are permitted per HTTP connection (defaults to 100).
At this point, setting an use-http2 to "false" looks like a good solution. I did it today and the tests work in the default config (without forcing in the code http.version http_1_1 and using https://
). Now HTTP/1.1 is used for http://
and https://
.
Previously for https://
HTTP/2 and TLS was the default (that's how it works in nginx ingress) and in combination with JdkHttpClient caused problems. I don't know exactly what limit was being exceeded, but that was the cause. If it helps you, I can try to configure HTTP/2 + TLS locally and test.
Thank you so much for all your replies, they have been very helpful and motivating, as at one point I was losing hope a bit!
This issue has been automatically locked since there has not been any recent activity since it was closed. Please open a new issue for related bugs.
What happened?
When I use serenity 4.0.1 (selenium 4.12.0) tests work correctly, but when I try to upgrade (e.g. 4.1.4 - selenium 4.18.1) then I receive this error after few seconds (always in same step) of execution (execution works fine but then suddenly GOAWAY).
I tried debugging, changing selenium grid/hub/node versions. I tried changing selenium. I noticed that the problem occurs when I use selenium with jdkhttpclient (so 4.14 and higher). In addition, when I did a downgrade to edge node 4.15.0 I got an additional stacktrace in the logs
How can we reproduce the issue?
Relevant log output
Operating System
windows/unix
Selenium version
4.18.1
What are the browser(s) and version(s) where you see this issue?
Tried 4.15.0, 4.21.0, 4.22.0
What are the browser driver(s) and version(s) where you see this issue?
all of them but stacktrace in 4.15.0 only
Are you using Selenium Grid?
4.15.0/4.21.0
Connected issue
https://github.com/SeleniumHQ/docker-selenium/issues/2195