'Chrome not reachable' error when running Selenium in AWS Lambda

🐛 Bug Report

We are seeing a chrome not reachable exception while running Selenium UI tests on AWS Lambda.

I also filed a similar bug on the Chromium bug tracker. Unfortunately, there is no good news there yet.

To Reproduce

Step 1: Create an AWS account, then create an AWS Lambda function that runs Selenium, similar to what you can find here

Step 2: Run the following tests:

@Test
public void appleAirPodsTest() {
    navigateToURL("https://www.apple.com/airpods-pro/");

    for(int i = 0; i < 10000; i++) {
        scrollDownBy(0, 10);
    }
}

@Test
public void appleIphone12Test() {
    navigateToURL("https://www.apple.com/iphone-12-pro/");

    for(int i = 0; i < 10000; i++) {
        scrollDownBy(0, 10);
    }
}

The Apple sites were chosen because they are rich, beautiful, interactive, and contain a lot of animations and modern techniques. However, the problem also happens on our internal platform, which is not related to Apple at all, and it is not specific to scrolling. It happens across many different test cases.

Expected behavior

We expected the tests to finish the run, the driver to be terminated, the logs to be printed (published), and everyone to live happily ever after. However, the reality is slightly different: the test "fails" 60-90 seconds into the run.

We get the following message:

org.openqa.selenium.WebDriverException: chrome not reachable
  (Session info: headless chrome=89.0.4389.0)
Build info: version: 'unknown', revision: 'unknown', time: 'unknown'
System info: host: '169.254.110.181', ip: '169.254.110.181', os.name: 'Linux', os.arch: 'amd64', os.version: '4.14.214-164.339.amzn2.x86_64', java.version: '1.8.0_201'
Driver info: org.openqa.selenium.chrome.ChromeDriver
Capabilities {acceptInsecureCerts: false, browserName: chrome, browserVersion: 89.0.4389.0, chrome: {chromedriverVersion: 89.0.4389.23 (61b08ee2c5002..., userDataDir: /tmp/user-data}, goog:chromeOptions: {debuggerAddress: localhost:35123}, javascriptEnabled: true, networkConnectionEnabled: false, pageLoadStrategy: normal, platform: LINUX, platformName: LINUX, proxy: Proxy(), setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify, webauthn:extension:largeBlob: true, webauthn:virtualAuthenticators: true}
Session ID: 149ae90b4fb8c1094acda489b3e86214
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.openqa.selenium.remote.http.W3CHttpResponseCodec.createException(W3CHttpResponseCodec.java:187)
    at org.openqa.selenium.remote.http.W3CHttpResponseCodec.decode(W3CHttpResponseCodec.java:122)
    at org.openqa.selenium.remote.http.W3CHttpResponseCodec.decode(W3CHttpResponseCodec.java:49)
    at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:158)
    at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:83)
    at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:552)
    at org.openqa.selenium.remote.RemoteWebDriver.executeScript(RemoteWebDriver.java:489)
    at com.duda.webdriver.perfTestsSites.AppleIphone12Test.scrollDownBy(AppleIphone12Test.java:17)
    at com.duda.webdriver.perfTestsSites.AppleIphone12Test.appleIphone12Test(AppleIphone12Test.java:25)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:108)
    at org.testng.internal.Invoker.invokeMethod(Invoker.java:661)
    at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:869)
    at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1193)
    at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:126)
    at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109)
    at org.testng.TestRunner.privateRun(TestRunner.java:744)
    at org.testng.TestRunner.run(TestRunner.java:602)
    at org.testng.SuiteRunner.runTest(SuiteRunner.java:380)
    at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:375)
    at org.testng.SuiteRunner.privateRun(SuiteRunner.java:340)
    at org.testng.SuiteRunner.run(SuiteRunner.java:289)
    at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
    at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
    at org.testng.TestNG.runSuitesSequentially(TestNG.java:1301)
    at org.testng.TestNG.runSuitesLocally(TestNG.java:1226)
    at org.testng.TestNG.runSuites(TestNG.java:1144)
    at org.testng.TestNG.run(TestNG.java:1115)
    at com.duda.webdriver.elduderino.core.Main.runTest(Main.java:85)
    at com.duda.webdriver.elduderino.core.Main.machineMain(Main.java:55)
    at com.duda.webdriver.elduderino.core.Main.main(Main.java:24)

Sometimes Chrome also outputs the following message:

[1027/164354.465008:FATAL:memory.cc(40)] Out of memory. size=262144

More info

Not a consistent problem

Interestingly, the problem is not consistent but statistical. For instance, running the tests above 30 times may result in 12 chrome not reachable failures and 18 successes. Some tests crash in 5 of 30 runs, others in 20 of 30.
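To put numbers on this kind of statistical failure, a small harness can rerun a test and count how often the specific error appears. The following is a minimal sketch, not the infrastructure used in this report: the class FlakinessCounter, the method countFailures, and the stand-in task are hypothetical names. A real run would pass a task that starts a driver and executes one test.

```java
import java.util.concurrent.Callable;

final class FlakinessCounter {

    /**
     * Runs the task the given number of times and returns how many runs
     * threw an exception whose message contains the marker text.
     * Exceptions with other messages are ignored by this counter.
     */
    static int countFailures(Callable<?> task, int runs, String marker) {
        int failures = 0;
        for (int i = 0; i < runs; i++) {
            try {
                task.call();
            } catch (Exception e) {
                String message = e.getMessage();
                if (message != null && message.contains(marker)) {
                    failures++;
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Stand-in task (hypothetical): fails on every third call so the
        // harness can be tried without a browser.
        int[] calls = {0};
        Callable<Void> fakeTest = () -> {
            if (++calls[0] % 3 == 0) {
                throw new RuntimeException("chrome not reachable");
            }
            return null;
        };

        int failures = countFailures(fakeTest, 30, "chrome not reachable");
        System.out.println(failures + " of 30 runs failed");
    }
}
```

Matching on the message text rather than the exception type keeps the sketch free of Selenium dependencies; it also means unrelated test failures are not counted as crashes.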

The memory issue

We struggle to understand which memory runs out. At first, we thought it might be the RAM available to the Lambda, so we increased the Lambda's memory to the maximum of 3 GB. It did not help; the same problems remain.

We also noticed that the peak memory usage of the Lambda is approximately 1.1 GB, covering everything that runs on it: Chrome, chromedriver, Java (general), and the Java tests/infra. So we concluded that it is probably not a Lambda memory issue.
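Since the Lambda-level memory number alone does not say which resource runs out, one option is to log a resource snapshot from inside the function before and after each test. The sketch below is only an illustration under assumptions: ResourceSnapshot and snapshot are hypothetical names, it reports only the JVM heap (not memory used by Chrome or chromedriver, which are separate processes), and checking /tmp is a guess based on the Chrome flags pointing the profile and cache there, not a confirmed cause.

```java
import java.io.File;

final class ResourceSnapshot {

    private static final long MB = 1024L * 1024L;

    /**
     * Returns a one-line summary of JVM heap usage and of the free and
     * total space on the file system that contains the given path.
     */
    static String snapshot(String path) {
        Runtime rt = Runtime.getRuntime();
        long heapUsedMb = (rt.totalMemory() - rt.freeMemory()) / MB;
        long heapMaxMb = rt.maxMemory() / MB;

        // getUsableSpace/getTotalSpace return 0 if the path does not exist.
        File dir = new File(path);
        long freeMb = dir.getUsableSpace() / MB;
        long totalMb = dir.getTotalSpace() / MB;

        return String.format(
                "heapUsedMB=%d heapMaxMB=%d path=%s freeMB=%d totalMB=%d",
                heapUsedMb, heapMaxMb, path, freeMb, totalMb);
    }

    public static void main(String[] args) {
        // /tmp is assumed here because the Chrome flags in this report
        // place user data and the disk cache under it.
        System.out.println(snapshot("/tmp"));
    }
}
```

Comparing the snapshot taken right before a crash with one from a passing run might show whether disk space, rather than RAM, is shrinking.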

Local/BrowserStack Runs

We could not reproduce the problem running Chrome in headless or full mode on a "regular" local computer. We tried Linux, Windows, and Mac, with no success. The same goes for BrowserStack: we can run the same test on BrowserStack 30 times and it will pass, with no crashes.

Chrome versions tested and test relations

We first encountered the problem when migrating from Chrome 69 to 74. At first, we didn't pay it much attention because it was very rare.
Seven months ago we migrated to 84, and we began to see it much more often. We noticed that tests with multiple tabs, multiple navigations, or multiple drivers running at the same time have a greater probability of failing. We assumed that those tests are intense and heavy, and that this probably causes the problem, so we refactored part of our "problematic tests" to fit 84. It helped partially. It is worth noting that not all tests involving multiple tabs, multiple navigations, or multiple drivers fail due to this problem.
After migrating from 84 to 87/88, the failure rate stayed roughly the same as on 84. So no news there.
Now we are trying to migrate to 89, and it is much worse: we see these crashes hundreds of times in each run on our CI/CD.

We also thought the problem might be tied to specific tests. However, on each browser version and each run, different types of tests fail. So no consistency there.

Environment

Operating System

Name: Linux
Architecture: amd64
OS Version: 4.14.214-164.339.amzn2.x86_64

Browser

Vendor: Chromium
Version: >= 79

Selenium/Language

Java: 1.8.0_201
Binding version: 3.141.0

Chrome flags

"--headless",
"--disable-gpu",
"--no-sandbox",
"--disable-dev-shm-usage",
"--user-data-dir=/tmp/user-data",
"--enable-logging",
"--v=1",
"--single-process",
"--data-path=/tmp/data-path",
"--homedir=/tmp",
"--disk-cache-dir=/tmp/cache-dir",
"--enable-precise-memory-info"
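
For reference, the flags above could be collected in one place as follows. This is a sketch under assumptions: LambdaChromeFlags and flags() are hypothetical names, and the class uses only the JDK so it compiles without Selenium on the classpath. With the Selenium 3.141.0 Java binding, the resulting list can be passed to ChromeOptions.addArguments(List<String>).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class LambdaChromeFlags {

    /** Returns the Chrome flags listed in this report, in the same order. */
    static List<String> flags() {
        return Collections.unmodifiableList(Arrays.asList(
                "--headless",
                "--disable-gpu",
                "--no-sandbox",
                "--disable-dev-shm-usage",
                "--user-data-dir=/tmp/user-data",
                "--enable-logging",
                "--v=1",
                "--single-process",
                "--data-path=/tmp/data-path",
                "--homedir=/tmp",
                "--disk-cache-dir=/tmp/cache-dir",
                "--enable-precise-memory-info"));
    }

    public static void main(String[] args) {
        // Print the flags one per line; in a test setup they would instead
        // be handed to ChromeOptions.addArguments(flags()).
        for (String flag : flags()) {
            System.out.println(flag);
        }
    }
}
```

Keeping the flags in a single list makes it easier to toggle one flag at a time (for example --single-process) when comparing crash rates between runs.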
