[🐛 Bug]: "--headless=new" with Chromedriver 128 in container results in SessionNotCreatedException

1dEraNCeSIv0 commented 2 weeks ago

What happened?

We recently upgraded our Jenkins to the latest version, and most of our (Java-based) Selenium tests started failing in their pipelines with a SessionNotCreatedException caused by a timeout in org.openqa.selenium.remote.http.AddSeleniumUserAgent.

Upon further investigation we found that the following combination of circumstances causes consistent failure:

Use chromium 128 / chromedriver 128
Use the new headless mode as --headless=new
Run from a docker container

I've browsed the issues here to check whether this has been reported before. It looks similar to this issue and might have the same cause.

For now our workaround is to downgrade the chromium / chromedriver version our Jenkins runs with. We could also switch our tests to --headless=old but I see that as a fix of last resort. I'd much rather Selenium and new chromedriver versions work together out of the box, even in new headless mode.
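If the `--headless=old` fallback is ever needed, it can be made switchable from the pipeline so the tests themselves stay untouched. The following is only a minimal sketch: the class `HeadlessFlag` and the system property `test.headless.mode` are hypothetical names, not part of Selenium. Only the two flag values come from this issue.

```java
import java.util.Locale;

/** Hypothetical helper: builds the Chrome headless flag from a pipeline-supplied setting. */
final class HeadlessFlag {

  // Hypothetical property name (an assumption, not a Selenium setting).
  static final String PROPERTY = "test.headless.mode";

  private HeadlessFlag() {}

  /** Maps "new" or "old" (case-insensitive) to the Chrome argument; null or blank means "new". */
  static String forMode(String mode) {
    String normalized =
        (mode == null || mode.isBlank()) ? "new" : mode.trim().toLowerCase(Locale.ROOT);
    switch (normalized) {
      case "new":
        return "--headless=new";
      case "old":
        return "--headless=old";
      default:
        throw new IllegalArgumentException("Unsupported headless mode: " + mode);
    }
  }

  /** Reads the mode from the system property, e.g. -Dtest.headless.mode=old. */
  static String fromSystemProperty() {
    return forMode(System.getProperty(PROPERTY));
  }
}
```

A test would then call `options.addArguments(HeadlessFlag.fromSystemProperty());`. With Gradle, the property has to be forwarded to the forked test JVM for this to take effect.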

How can we reproduce the issue?

See this repository for a minimal reproducing example. For instructions on how to reproduce the issue please see the readme.

Relevant log output

> Task :test

SeleniumTest > headlessNew() FAILED
    org.openqa.selenium.SessionNotCreatedException: Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure.
    Host info: host: '83b1fb817f96', ip: '172.17.0.1'
        at app//org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:545)
        at app//org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:234)
        at app//org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:163)
        at app//org.openqa.selenium.chromium.ChromiumDriver.<init>(ChromiumDriver.java:114)
        at app//org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:88)
        at app//org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:83)
        at app//org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:72)
        at app//SeleniumTest.createChromeDriver(SeleniumTest.java:26)
        at app//SeleniumTest.headlessNew(SeleniumTest.java:9)

        Caused by:
        org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
        Build info: version: '4.23.0', revision: '4df0a231af'
        System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.10.0-18-amd64', java.version: '21.0.4'
        Driver info: driver.version: ChromeDriver
            at app//org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:399)
            at app//org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
            at app//org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:55)
            at app//org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute(JdkHttpClient.java:355)
            at app//org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:89)
            at app//org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:75)
            at app//org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:61)
            at app//org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:162)
            at app//org.openqa.selenium.remote.service.DriverCommandExecutor.invokeExecute(DriverCommandExecutor.java:216)
            at app//org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:174)
            at app//org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:527)
            ... 8 more

            Caused by:
            java.util.concurrent.TimeoutException
                at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
                at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
                at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:382)
                ... 18 more

SeleniumTest > headlessOld() STANDARD_ERROR
    Aug 29, 2024 12:25:03 PM org.openqa.selenium.devtools.CdpVersionFinder findNearestMatch
    WARNING: Unable to find an exact match for CDP version 128, returning the closest version; found: 127; Please update to a Selenium version that supports CDP version 128

Gradle Test Executor 1 finished executing tests.
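The innermost frames above show a timed `CompletableFuture.get` giving up, which suggests the new-session request received no reply within the client's timeout rather than an explicit error. A stdlib-only sketch of that mechanism follows; it does not use Selenium, and the class and method names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical demo of a timed wait on a response that may never arrive. */
final class TimedGetDemo {

  private TimedGetDemo() {}

  /** Returns true if no response arrived within timeoutMillis. */
  static boolean timesOut(CompletableFuture<String> response, long timeoutMillis) {
    try {
      response.get(timeoutMillis, TimeUnit.MILLISECONDS);
      return false; // a response arrived in time
    } catch (TimeoutException e) {
      return true; // same exception type as the root cause in the log
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Request failed", e.getCause());
    }
  }

  public static void main(String[] args) {
    // A future that nobody completes stands in for a browser that never answers.
    CompletableFuture<String> neverAnswered = new CompletableFuture<>();
    System.out.println(timesOut(neverAnswered, 100));
    System.out.println(timesOut(CompletableFuture.completedFuture("ok"), 100));
  }
}
```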

Operating System

Alpine 3.20, Debian 12

Selenium version

4.19.1, 4.23

What are the browser(s) and version(s) where you see this issue?

Chrome 128

What are the browser driver(s) and version(s) where you see this issue?

Chromedriver 128

Are you using Selenium Grid?

No

github-actions[bot] commented 2 weeks ago

@1dEraNCeSIv0, thank you for creating this issue. We will troubleshoot it as soon as we can.

pujagani commented 2 weeks ago

Thank you for sharing the details. I tried to reproduce the issue but was not able to.

Docker command: docker run --rm -it -p 4444:4444 -p 5900:5900 -p 7900:7900 --shm-size 2g selenium/standalone-chromium:latest

Selenium Java code:

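import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeOptions;
import org.openqa.selenium.remote.RemoteWebDriver;

public class ChromeHeadlessv128 {

  public static void main(String[] argv) throws Exception {
    ChromeOptions options = new ChromeOptions();
    // Request Chrome's new headless mode.
    options.addArguments("--headless=new");

    // No server URL is passed, so the default remote address is used; tracing is disabled.
    WebDriver driver = new RemoteWebDriver(options, false);
    driver.get("https://www.google.com/");

    driver.getTitle();
    driver.quit();
  }
}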

Does the error happen every time, or is it intermittent? How can we reproduce this?

pujagani commented 2 weeks ago

I am able to reproduce it if I run multiple sessions in parallel or run multiple sessions sequentially.
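To put a number on how often session creation fails when run sequentially or in parallel, a small harness like the one below could help. It is only a sketch: `SessionAttempts` and `countFailures` are hypothetical names, and the `attempt` passed in is assumed to create a session (for example, the `RemoteWebDriver` code earlier in this thread) and quit it. The harness relies only on the fact that a failed attempt throws a `RuntimeException`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical harness: repeats an attempt and counts how many runs threw. */
final class SessionAttempts {

  private SessionAttempts() {}

  /**
   * Runs attempt `total` times on `threads` worker threads and returns the number of failures.
   * threads = 1 gives sequential runs; threads > 1 gives parallel runs.
   */
  static int countFailures(Runnable attempt, int total, int threads) {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<Boolean>> results = new ArrayList<>();
      for (int i = 0; i < total; i++) {
        Callable<Boolean> task = () -> {
          try {
            attempt.run();
            return true;
          } catch (RuntimeException e) {
            return false; // count the failure instead of aborting the whole run
          }
        };
        results.add(pool.submit(task));
      }
      int failures = 0;
      for (Future<Boolean> result : results) {
        if (!await(result)) {
          failures++;
        }
      }
      return failures;
    } finally {
      pool.shutdown();
    }
  }

  private static boolean await(Future<Boolean> result) {
    try {
      return result.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while waiting", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("Attempt failed unexpectedly", e.getCause());
    }
  }
}
```

Calling `countFailures(attempt, 10, 1)` and then `countFailures(attempt, 10, 4)` would compare sequential and parallel runs of the same attempt.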

1dEraNCeSIv0 commented 2 weeks ago

I tried reproducing it again locally and noticed that the image was broken due to line endings changing upon up- and download. I've also had to allow newer chrome versions as the alpine repo doesn't seem to keep the specific 128 version that was up to date yesterday available. Note that this means that once 129 becomes available the Dockerfile will probably build that into the image instead. But I don't know of any quick way to pin the version.

Long story short, I believe I've fixed the Dockerfile and the following steps should now work again to reproduce the issue using the repository linked above:

Clone the repo
Navigate to project root folder
run docker build -t reproducer . (or any other image name)
run docker run reproducer "/root/gradlew -i test"
wait for the timeout to happen (around 5min)

If there are any issues with the image, please let me know. It should trigger the issue consistently: my failure rate so far is 100% over roughly 10 attempts. Regarding parallelism or running multiple sessions, the demo repo above uses the default settings for all of these, but I'm not sure what those defaults are.

VietND96 commented 1 week ago

@pujagani, can you try to reproduce this again with the image selenium/standalone-chromium:latest (updated on Aug-31 1:00 AM IST)? I guess the problem appears starting with Chromium version 128.0.6613.113.

pujagani commented 6 days ago

@VietND96 Thank you! Let me try it out and provide my findings here.

pujagani commented 6 days ago

I am able to reproduce the issue (using https://github.com/SeleniumHQ/selenium/issues/14457#issuecomment-2320919019), but not consistently. It failed once with "selenium/standalone-chromium:latest" when using "options.addArguments("--headless=new");". Without headless, or when using the old headless mode "options.addArguments("--headless");", it works as expected every time. I am unable to find a pattern here.

pujagani commented 6 days ago

With the demo repo shared, I am able to see the error described in the issue. But those are two different things. I was trying to run tests on my machine pointing to the docker-selenium grid and was not able to reproduce the issue accurately on the last attempt. But the repo is trying to run tests inside the docker container locally without using the Grid. I have a feeling this is not a Selenium issue.

pujagani commented 6 days ago

In the demo repo shared I have made the following updates:

Updated Selenium to the latest version:

testImplementation("org.seleniumhq.selenium:selenium-java:4.24.0")

Updated the chromium and chromedriver versions in the Dockerfile:

RUN apk add chromium>128.0.6613.119-r0 chromium-chromedriver>128.0.6613.119-r0

After this, I no longer see the error. Sharing the output below:

Caching disabled for task ':test' because:
  Build cache is disabled
Task ':test' is not up-to-date because:
  No history is available.
Starting process 'Gradle Test Executor 1'. Working directory: /root Command: /opt/java/openjdk/bin/java -Dorg.gradle.internal.worker.tmpdir=/root/build/tmp/test/work @/root/.gradle/.tmp/gradle-worker-classpath15735176167891063660txt -Xmx512m -Dfile.encoding=UTF-8 -Duser.country=US -Duser.language=en -Duser.variant -ea worker.org.gradle.process.internal.worker.GradleWorkerMain 'Gradle Test Executor 1'
Successfully started process 'Gradle Test Executor 1'

Gradle Test Executor 1 started executing tests.
Gradle Test Executor 1 finished executing tests.

> Task :test
Finished generating test XML results (0.005 secs) into: /root/build/test-results/test
Generating HTML test report...
Finished generating test html results (0.011 secs) into: /root/build/reports/tests/test

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.10/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD SUCCESSFUL in 17s
2 actionable tasks: 1 executed, 1 up-to-date

pujagani commented 6 days ago

@1dEraNCeSIv0 Can you please try it and provide an update?

1dEraNCeSIv0 commented 6 days ago

I've incorporated your changes into the repository, but the result is unchanged. Feel free to check whether I made an error when editing the project.

Caching disabled for task ':test' because:
  Build cache is disabled
Task ':test' is not up-to-date because:
  No history is available.
Starting process 'Gradle Test Executor 1'. Working directory: /root Command: /opt/java/openjdk/bin/java -Dorg.gradle.internal.worker.tmpdir=/root/build/tmp/test/work @/root/.gradle/.tmp/gradle-worker-classpath6100839871221666423txt -Xmx512m -Dfile.encoding=UTF-8 -Duser.country=US -Duser.language=en -Duser.variant -ea worker.org.gradle.process.internal.worker.GradleWorkerMain 'Gradle Test Executor 1'
Successfully started process 'Gradle Test Executor 1'

SeleniumTest > headlessNew() FAILED
    org.openqa.selenium.SessionNotCreatedException: Could not start a new session. Possible causes are invalid address of the remote server or browser start-up failure.
    Host info: host: '42d9f1c3b18f', ip: '172.17.0.2'
        at app//org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:563)
        at app//org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:245)
        at app//org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:174)
        at app//org.openqa.selenium.chromium.ChromiumDriver.<init>(ChromiumDriver.java:114)
        at app//org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:88)
        at app//org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:83)
        at app//org.openqa.selenium.chrome.ChromeDriver.<init>(ChromeDriver.java:72)
        at app//SeleniumTest.createChromeDriver(SeleniumTest.java:26)
        at app//SeleniumTest.headlessNew(SeleniumTest.java:9)

        Caused by:
        org.openqa.selenium.TimeoutException: java.util.concurrent.TimeoutException
        Build info: version: '4.24.0', revision: '748ffc9bc3'
        System info: os.name: 'Linux', os.arch: 'amd64', os.version: '5.15.153.1-microsoft-standard-WSL2', java.version: '21.0.4'
        Driver info: driver.version: ChromeDriver
            at app//org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:418)
            at app//org.openqa.selenium.remote.http.AddSeleniumUserAgent.lambda$apply$0(AddSeleniumUserAgent.java:42)
            at app//org.openqa.selenium.remote.http.Filter.lambda$andFinally$1(Filter.java:55)
            at app//org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute(JdkHttpClient.java:374)
            at app//org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:89)
            at app//org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:75)
            at app//org.openqa.selenium.remote.ProtocolHandshake.createSession(ProtocolHandshake.java:61)
            at app//org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:162)
            at app//org.openqa.selenium.remote.service.DriverCommandExecutor.invokeExecute(DriverCommandExecutor.java:216)
            at app//org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:174)
            at app//org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:545)
            ... 8 more

            Caused by:
            java.util.concurrent.TimeoutException
                at java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1960)
                at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2095)
                at org.openqa.selenium.remote.http.jdk.JdkHttpClient.execute0(JdkHttpClient.java:401)
                ... 18 more

Gradle Test Executor 1 finished executing tests.

> Task :test FAILED

2 tests completed, 1 failed
Finished generating test XML results (0.01 secs) into: /root/build/test-results/test
Generating HTML test report...
Finished generating test html results (0.017 secs) into: /root/build/reports/tests/test

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':test'.
> There were failing tests. See the report at: file:///root/build/reports/tests/test/index.html

* Try:
> Run with --scan to get full insights.

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.
BUILD FAILED in 3m 29s

For more on this, please refer to https://docs.gradle.org/8.10/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.
2 actionable tasks: 1 executed, 1 up-to-date

FWIW, I also checked the specific chromium version installed in the running image, and it's the one you used:

~ # apk list -i | grep chrome
chromium-chromedriver-128.0.6613.119-r0 x86_64 {chromium} (BSD-3-Clause) [installed]
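Since the Dockerfile does not pin the version, a test run could log or assert the installed major version before starting. Below is a rough sketch that extracts the version from a line of `apk list -i` output in the format shown above; `ApkVersion` is a hypothetical helper, and the pattern assumes a purely numeric dotted version followed by an `-rN` suffix. Note that `grep chrome` only matches the chromedriver package; the browser package is named `chromium`.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: parses a package line as printed by `apk list -i`. */
final class ApkVersion {

  // <name>-<dotted numeric version>-r<revision>, e.g. chromium-chromedriver-128.0.6613.119-r0
  private static final Pattern LINE =
      Pattern.compile("^([a-z0-9-]+?)-(\\d+(?:\\.\\d+)+)-r\\d+\\b");

  private ApkVersion() {}

  /** Returns the version without the -rN suffix, or empty if the line does not match. */
  static Optional<String> versionOf(String apkListLine) {
    Matcher m = LINE.matcher(apkListLine.trim());
    return m.find() ? Optional.of(m.group(2)) : Optional.empty();
  }

  /** Returns the major version, e.g. 128, or empty if the line does not match. */
  static Optional<Integer> majorOf(String apkListLine) {
    return versionOf(apkListLine)
        .map(v -> Integer.parseInt(v.substring(0, v.indexOf('.'))));
  }
}
```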

pujagani commented 6 days ago

Thank you for trying. I am not sure how to help further since I am unable to reproduce it consistently on my end.

1dEraNCeSIv0 commented 3 days ago

I've finally found that one of my colleagues can run the tests, and they consistently pass for them as well. I'll dig into that more next week; hopefully I'll be able to narrow down the exact cause of the error. I'll let you know once I find out more.
