"chrome not reachable" exception while running Selenium tests inside AWS Lambda

avikha commented 3 years ago

🐛 Bug Report

We are seeing a "chrome not reachable" exception while running Selenium UI tests on AWS Lambda.

Has anyone else run into this problem, solved it, or worked around it? Perhaps you have seen similar problems in Docker containers or virtual machines?

If anybody has a suggestion for how to investigate this further, please share. If there is any additional information I can provide, please let me know.

Disclaimer: About five months ago, I filed a similar bug on the Chromium bug tracker. Unfortunately, there has been no progress there, so I am trying my luck here as well.

To Reproduce

Step 1: Create an AWS account, and then create an AWS Lambda function that runs Selenium. For more information, see the docs.

Step 2: Run either of the following tests:

@Test
public void appleAirPodsTest() {
    navigateToURL("https://www.apple.com/airpods-pro/");

    for(int i = 0; i < 10000; i++) {
        scrollDownBy(0, 10);
    }
}

Or

@Test
public void appleIphone12Test() {
    navigateToURL("https://www.apple.com/iphone-12-pro/");

    for(int i = 0; i < 10000; i++) {
        scrollDownBy(0, 10);
    }
}
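
To narrow down when the crash happens, it may help to record how many scroll steps completed and how much time passed before the failure. The following is a minimal sketch, not our actual test code. `ScriptRunner` is a hypothetical stand-in for Selenium's `JavascriptExecutor.executeScript` so that the example runs without a browser, and `ScrollProbe`, `scrollUntilFailure`, and `Result` are illustrative names.

```java
import java.util.concurrent.TimeUnit;

class ScrollProbe {

    /** Hypothetical stand-in for JavascriptExecutor.executeScript(String, Object...). */
    @FunctionalInterface
    interface ScriptRunner {
        Object run(String script, Object... args);
    }

    /** Outcome of one probe run. */
    static final class Result {
        final int completedSteps;
        final long elapsedMillis;
        final RuntimeException failure; // null when every step completed

        Result(int completedSteps, long elapsedMillis, RuntimeException failure) {
            this.completedSteps = completedSteps;
            this.elapsedMillis = elapsedMillis;
            this.failure = failure;
        }
    }

    /** Scrolls up to `steps` times by `dy` pixels and stops at the first exception. */
    static Result scrollUntilFailure(ScriptRunner runner, int steps, int dy) {
        long start = System.nanoTime();
        int done = 0;
        RuntimeException failure = null;
        try {
            for (; done < steps; done++) {
                runner.run("window.scrollBy(arguments[0], arguments[1]);", 0, dy);
            }
        } catch (RuntimeException e) {
            // WebDriverException is a RuntimeException, so a browser crash would land here.
            failure = e;
        }
        long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        return new Result(done, elapsed, failure);
    }

    public static void main(String[] args) {
        // Fake runner that fails on the 4th call, imitating a browser crash.
        int[] calls = {0};
        ScriptRunner flaky = (script, scriptArgs) -> {
            if (++calls[0] > 3) {
                throw new IllegalStateException("chrome not reachable");
            }
            return null;
        };
        Result r = scrollUntilFailure(flaky, 10, 10);
        System.out.println("completed=" + r.completedSteps
                + " elapsedMs=" + r.elapsedMillis
                + " failure=" + r.failure);
    }
}
```

In a real test the runner would presumably delegate to the driver, for example `(script, scriptArgs) -> ((JavascriptExecutor) driver).executeScript(script, scriptArgs)`. Logging the step count per failed run would show whether crashes cluster around a particular scroll position or a particular elapsed time.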

Expected behavior

We expected the tests to run to completion, the driver to be terminated, the logs to be published, and everyone to live happily ever after. The reality is slightly different: the test fails 60-90 seconds into the run.

We get the following message:

org.openqa.selenium.WebDriverException: chrome not reachable
  (Session info: headless chrome=89.0.4389.0)
Build info: version: 'unknown', revision: 'unknown', time: 'unknown'
System info: host: '169.254.110.181', ip: '169.254.110.181', os.name: 'Linux', os.arch: 'amd64', os.version: '4.14.214-164.339.amzn2.x86_64', java.version: '1.8.0_201'
Driver info: org.openqa.selenium.chrome.ChromeDriver
Capabilities {acceptInsecureCerts: false, browserName: chrome, browserVersion: 89.0.4389.0, chrome: {chromedriverVersion: 89.0.4389.23 (61b08ee2c5002..., userDataDir: /tmp/user-data}, goog:chromeOptions: {debuggerAddress: localhost:35123}, javascriptEnabled: true, networkConnectionEnabled: false, pageLoadStrategy: normal, platform: LINUX, platformName: LINUX, proxy: Proxy(), setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify, webauthn:extension:largeBlob: true, webauthn:virtualAuthenticators: true}
Session ID: 149ae90b4fb8c1094acda489b3e86214
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.openqa.selenium.remote.http.W3CHttpResponseCodec.createException(W3CHttpResponseCodec.java:187)
    at org.openqa.selenium.remote.http.W3CHttpResponseCodec.decode(W3CHttpResponseCodec.java:122)
    at org.openqa.selenium.remote.http.W3CHttpResponseCodec.decode(W3CHttpResponseCodec.java:49)
    at org.openqa.selenium.remote.HttpCommandExecutor.execute(HttpCommandExecutor.java:158)
    at org.openqa.selenium.remote.service.DriverCommandExecutor.execute(DriverCommandExecutor.java:83)
    at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:552)
    at org.openqa.selenium.remote.RemoteWebDriver.executeScript(RemoteWebDriver.java:489)
    at com.duda.webdriver.perfTestsSites.AppleIphone12Test.scrollDownBy(AppleIphone12Test.java:17)
    at com.duda.webdriver.perfTestsSites.AppleIphone12Test.appleIphone12Test(AppleIphone12Test.java:25)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.testng.internal.MethodInvocationHelper.invokeMethod(MethodInvocationHelper.java:108)
    at org.testng.internal.Invoker.invokeMethod(Invoker.java:661)
    at org.testng.internal.Invoker.invokeTestMethod(Invoker.java:869)
    at org.testng.internal.Invoker.invokeTestMethods(Invoker.java:1193)
    at org.testng.internal.TestMethodWorker.invokeTestMethods(TestMethodWorker.java:126)
    at org.testng.internal.TestMethodWorker.run(TestMethodWorker.java:109)
    at org.testng.TestRunner.privateRun(TestRunner.java:744)
    at org.testng.TestRunner.run(TestRunner.java:602)
    at org.testng.SuiteRunner.runTest(SuiteRunner.java:380)
    at org.testng.SuiteRunner.runSequentially(SuiteRunner.java:375)
    at org.testng.SuiteRunner.privateRun(SuiteRunner.java:340)
    at org.testng.SuiteRunner.run(SuiteRunner.java:289)
    at org.testng.SuiteRunnerWorker.runSuite(SuiteRunnerWorker.java:52)
    at org.testng.SuiteRunnerWorker.run(SuiteRunnerWorker.java:86)
    at org.testng.TestNG.runSuitesSequentially(TestNG.java:1301)
    at org.testng.TestNG.runSuitesLocally(TestNG.java:1226)
    at org.testng.TestNG.runSuites(TestNG.java:1144)
    at org.testng.TestNG.run(TestNG.java:1115)
    at com.duda.webdriver.elduderino.core.Main.runTest(Main.java:85)
    at com.duda.webdriver.elduderino.core.Main.machineMain(Main.java:55)
    at com.duda.webdriver.elduderino.core.Main.main(Main.java:24)

Sometimes Chrome also outputs the following message:

[1027/164354.465008:FATAL:memory.cc(40)] Out of memory. size=262144

More info

Not consistent problem

Interestingly, the problem is not consistent but statistical. For instance, running the tests above 30 times may result in 12 "chrome not reachable" failures and 18 successes. Other tests crash at different rates, anywhere from 5 out of 30 runs to 20 out of 30.
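
Because the failures are statistical, a rate measured over 30 runs is itself quite uncertain. As a rough aid for judging whether a flag or version change really moved the failure rate, here is a minimal sketch that computes a Wilson score interval for a failure count. The Wilson interval is a standard technique that I am choosing for illustration; it is not something we currently use, and `FailureRate` and `wilsonInterval` are illustrative names.

```java
class FailureRate {

    /**
     * Returns {low, high} of the Wilson score interval for failures/runs.
     * z is the normal quantile, e.g. 1.96 for roughly 95% confidence.
     */
    static double[] wilsonInterval(int failures, int runs, double z) {
        if (runs <= 0 || failures < 0 || failures > runs) {
            throw new IllegalArgumentException("need 0 <= failures <= runs and runs > 0");
        }
        double p = (double) failures / runs;
        double z2 = z * z;
        double denom = 1.0 + z2 / runs;
        double center = (p + z2 / (2.0 * runs)) / denom;
        double half = z * Math.sqrt(p * (1.0 - p) / runs + z2 / (4.0 * runs * runs)) / denom;
        return new double[] {center - half, center + half};
    }

    public static void main(String[] args) {
        // Failure counts mentioned above: 5, 12, and 20 failures out of 30 runs.
        int[] observed = {5, 12, 20};
        for (int failures : observed) {
            double[] ci = wilsonInterval(failures, 30, 1.96);
            System.out.printf("%d/30 failures: %.3f .. %.3f%n", failures, ci[0], ci[1]);
        }
    }
}
```

If the intervals for two configurations overlap heavily, 30 runs are probably not enough to tell them apart.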

The memory issue

We struggle to understand which memory is running out. At first, we thought it might be the RAM allocated to the Lambda function, so we increased it to the maximum of 3 GB. That did not help; we still see the same problems.

We also noticed that the maximum memory usage of the Lambda function is approximately 1.1 GB. That covers everything that runs on it: Chrome, chromedriver, Java (general), and the Java tests/infrastructure. So we concluded that it is probably not a Lambda memory issue.
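
One thing we could log alongside each failure is a snapshot of the resources that might be running out besides Lambda RAM. With `--disable-dev-shm-usage`, Chrome creates its shared memory files in a temporary directory instead of `/dev/shm`, so free space under `/tmp` may matter. This is a guess, not a confirmed cause. The sketch below uses only the Java standard library; `ResourceSnapshot` and the key names are illustrative.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

class ResourceSnapshot {

    /** Collects JVM heap, /tmp space, and (on Linux) OS memory figures. */
    static Map<String, Long> capture() {
        Map<String, Long> out = new LinkedHashMap<>();

        Runtime rt = Runtime.getRuntime();
        out.put("jvm.heap.used.bytes", rt.totalMemory() - rt.freeMemory());
        out.put("jvm.heap.max.bytes", rt.maxMemory());

        // /tmp holds user-data-dir, cache-dir, and data-path in our flag set.
        // getUsableSpace() returns 0 if the path does not exist.
        out.put("tmp.usable.bytes", new File("/tmp").getUsableSpace());

        // /proc/meminfo exists only on Linux; skip it quietly elsewhere.
        Path meminfo = Paths.get("/proc/meminfo");
        if (Files.isReadable(meminfo)) {
            try {
                for (String line : Files.readAllLines(meminfo)) {
                    String[] parts = line.trim().split("\\s+");
                    if (parts.length >= 2
                            && (parts[0].equals("MemTotal:") || parts[0].equals("MemAvailable:"))) {
                        String key = "os." + parts[0].replace(":", "") + ".kB";
                        out.put(key, Long.parseLong(parts[1]));
                    }
                }
            } catch (IOException | NumberFormatException e) {
                // Diagnostics only: ignore unreadable or unexpected content.
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Long> e : capture().entrySet()) {
            System.out.println(e.getKey() + "=" + e.getValue());
        }
    }
}
```

Calling `capture()` before the test and again in the exception handler would show whether `/tmp` or OS-level memory shrinks sharply before Chrome becomes unreachable.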

Local/BrowserStack Runs

We could not reproduce the problem running Chrome, headless or headed, on a "regular" local computer. We tried Linux, Windows, and macOS without success. The same goes for BrowserStack: we can run the same test there 30 times and it passes every time, with no crashes.

Chrome versions tested and test relations

We first encountered the problem when migrating from Chrome 69 to 74. At first, we didn't pay much attention to it because it was very rare.
Seven months ago we migrated to 84, and the failures became much more frequent. We noticed that tests with multiple tabs, multiple navigations, or multiple drivers running at the same time are more likely to fail. We assumed that those tests are intense and heavy and probably trigger the problem, so we refactored some of our "problematic" tests to fit 84. That helped partially. It is worth noting that not all tests involving multiple tabs, navigations, or drivers fail because of this problem.
Migrating from 84 to 87/88 left the failure rate roughly the same as on 84, so no news there.
Now we are trying to migrate to 89, and it is too hard: we are seeing these crashes hundreds of times in each run on our CI/CD.

We also thought that the problem might be related to specific tests. However, on each browser version and in each run, different tests fail, so there is no consistency there.

Environment

Operating System

Name: Linux
Architecture: amd64
OS Version: 4.14.214-164.339.amzn2.x86_64

Browser

Vendor: Chromium
Version: >= 79

Selenium/Language

Java: 1.8.0_201
Binding version: 3.141.0

Chrome flags

"--headless",
"--disable-gpu",
"--no-sandbox",
"--disable-dev-shm-usage",
"--user-data-dir=/tmp/user-data",
"--enable-logging",
"--v=1",
"--single-process",
"--data-path=/tmp/data-path",
"--homedir=/tmp",
"--disk-cache-dir=/tmp/cache-dir",
"--enable-precise-memory-info"

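Since we do not know which of these flags, if any, contributes to the crashes, one way to investigate would be to rerun the same test with one flag removed at a time and compare failure rates. `--single-process` seems like a reasonable first candidate because it runs the renderer in the browser process, but that is only a hunch. The sketch below just builds the argument lists; `ChromeFlags` and `without` are illustrative names, and no Selenium classes are used so that it runs on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class ChromeFlags {

    /** The flag set listed above, unchanged. */
    static final List<String> BASE = Collections.unmodifiableList(Arrays.asList(
            "--headless",
            "--disable-gpu",
            "--no-sandbox",
            "--disable-dev-shm-usage",
            "--user-data-dir=/tmp/user-data",
            "--enable-logging",
            "--v=1",
            "--single-process",
            "--data-path=/tmp/data-path",
            "--homedir=/tmp",
            "--disk-cache-dir=/tmp/cache-dir",
            "--enable-precise-memory-info"));

    /** Returns BASE minus the named flags; a name matches "--x" and "--x=value". */
    static List<String> without(String... names) {
        List<String> result = new ArrayList<>();
        for (String flag : BASE) {
            boolean excluded = false;
            for (String name : names) {
                if (flag.equals(name) || flag.startsWith(name + "=")) {
                    excluded = true;
                    break;
                }
            }
            if (!excluded) {
                result.add(flag);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // In a real run the list would be passed to ChromeOptions.addArguments(List<String>).
        System.out.println(without("--single-process"));
    }
}
```

Each variant would need enough runs to compare failure rates meaningfully, given how much the rate varies between batches.
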
SeleniumHQ / selenium