angular / protractor

E2E test framework for Angular apps
http://www.protractortest.org
MIT License

protractor hangs indefinitely on "[launcher] Running 1 instances of WebDriver" - when running with xvfb #2419

Open borisdiakur opened 9 years ago

borisdiakur commented 9 years ago

At first glance this issue seems to be related to #1764, but I don't see how a network issue can cause the hanging here (directConnect is set to true). So here is the setup:

protractor-config.js

'use strict';

/* globals jasmine, browser */
exports.config = {
    allScriptsTimeout: 300000, // 5 min

    // Spec patterns are relative to this directory.
    specs: [
        'protractor/*.js'
    ],

    capabilities: {
        'browserName': 'chrome',
        version: '',
        platform: 'ANY'
    },

    directConnect: true,

    baseUrl: 'http://localhost:3000/', // gets overwritten

    troubleshoot: true,

    framework: 'jasmine',
    onPrepare: function beforeProtractorRuns() {
        browser.driver.manage().window().setSize(1280, 1024);

        // Disable animations so e2e tests run more quickly
        var disableNgAnimate = function () {
            angular.module('disableNgAnimate', []).run([
                '$animate',
                function ($animate) {
                    $animate.enabled(false);
                }
            ]);
        };
        browser.addMockModule('disableNgAnimate', disableNgAnimate);

        // see: https://github.com/angular/protractor/issues/60
        require('jasmine-reporters');
        var capsPromise = browser.getCapabilities();
        capsPromise.then(function setOutputDir(caps) {
            var browserName = caps.caps_.browserName.toUpperCase();
            var browserVersion = caps.caps_.version;
            var filePrefix = browserName + '-' + browserVersion + '-';
            jasmine.getEnv().addReporter(
                new jasmine.JUnitXmlReporter(
                    'build/reports/testresults',
                    true, //consolidate: save nested describes within the same file as their parent
                    true, //useDotNotation: separate suite names with dots rather than spaces (ie "Class.init" not "Class init")
                    filePrefix
                )
            );
        });
    },

    jasmineNodeOpts: {
        isVerbose: true,
        showColors: true,
        includeStackTrace: true,
        defaultTimeoutInterval: 300000, // 5 min
    }
};

Here is an extract of the essential parts of the Jenkins Job console log:

⋮
13:28:22 DEBUG - Running with --troubleshoot
13:28:22 DEBUG - Protractor version: 2.1.0
13:28:22 DEBUG - Your base url for tests is http://localhost:57505/
13:28:22 Using ChromeDriver directly...
13:28:22 [launcher] Running 1 instances of WebDriver
13:28:22 DEBUG - WebDriver session successfully started with capabilities { caps_: 
13:28:22    { acceptSslCerts: true,
13:28:22      applicationCacheEnabled: false,
13:28:22      browserConnectionEnabled: false,
13:28:22      browserName: 'chrome',
13:28:22      chrome: { userDataDir: '/tmp/.com.google.Chrome.AuA2qr' },
13:28:22      cssSelectorsEnabled: true,
13:28:22      databaseEnabled: false,
13:28:22      handlesAlerts: true,
13:28:22      javascriptEnabled: true,
13:28:22      locationContextEnabled: true,
13:28:22      mobileEmulationEnabled: false,
13:28:22      nativeEvents: true,
13:28:22      platform: 'Linux',
13:28:22      rotatable: false,
13:28:22      takesHeapSnapshot: true,
13:28:22      takesScreenshot: true,
13:28:22      version: '44.0.2403.125',
13:28:22      webStorageEnabled: true } }
13:28:30 Testing foobar
13:28:30   Setup
13:28:30     should display foobar - pass
⋮                                                   <--- more describe-s and it-s
13:28:35 Finished in 12.616 seconds
13:28:35 2 tests, 9 assertions, 0 failures
13:28:35 
13:28:35 [launcher] 0 instance(s) of WebDriver still running
13:28:35 [launcher] chrome #1 passed
⋮                                                   <--- other tests scenarios also run successfully
13:28:47 DEBUG - Running with --troubleshoot
13:28:47 DEBUG - Protractor version: 2.1.0
13:28:47 DEBUG - Your base url for tests is http://localhost:34976/
13:28:47 Using ChromeDriver directly...
13:28:47 [launcher] Running 1 instances of WebDriver <--- hanging from here,
                                                          can't figure out why,
                                                          wish I had logs : [
15:05:23 Xvfb stopping                               <--- canceled build manually,
                                                          see time difference, no timeouts triggered,
                                                          neither from Protractor nor from WebDriver : [
15:05:23 Build has been canceled

Note that it is not always the same test scenario which leads to the hanging. The problem occurs with node 0.10.33 as well as with 0.12.7. Using Ubuntu 12.04.5 LTS (GNU/Linux 3.13.0-37-generic x86_64).

$ file /opt/google/chrome/chrome
/opt/google/chrome/chrome: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.26, BuildID[sha1]=0xe4fc0257b42110fc269782274f1e774718135349, stripped

dseravalli commented 9 years ago

I opened that other issue. In my case, I was running protractor inside of Vagrant and had port 4444 forwarded in my Vagrantfile. protractor silently failed/hung in this scenario

caninemwenja commented 9 years ago

I had the same issue, but it got fixed when I ran webdriver-manager start in the foreground, i.e.

webdriver-manager start & protractor conf.js #hangs

but

webdriver-manager start
protractor conf.js #works

caninemwenja commented 9 years ago

Oddly both styles seem to work now. I'm not sure but I think the nature of the tests may also determine whether protractor hangs or not. I had a test that was infinitely trying to reach a nonexistent location because I hadn't set the baseUrl. Protractor showed the same symptoms.

PaddyMann commented 9 years ago

I'm getting hanging behaviour, running tests with sharding on Codeship CI.

I run the same tests with Firefox and Chrome - I've split these into 2 separate config files but other than the browser setting they're the same.

The process sometimes hangs at the end of the Chrome file, and sometimes at the end of the Firefox one (and, if I'm very lucky, sometimes doesn't hang :))

Consistently seems to be "1 instances" left running (rather than 2 or more).

It's run as a foreground process so no help from @caninemwenja 's comment...

Callmenorm commented 9 years ago

I've got a similar issue, where it hangs at

Using ChromeDriver directly...
[launcher] Running 1 instances of WebDriver

about 50% of the time. It will hang there indefinitely. However, even if it hangs you can start other instances which may or may not succeed in the same manner.

PaddyMann commented 9 years ago

I've experienced the issue both with and without directConnect so that's not part of the issue

Callmenorm commented 9 years ago

Yeah, I've just confirmed that the same thing happens with directConnect and standalone selenium.

jrust commented 9 years ago

Also seeing the problem, but only on our Ubuntu build servers. Haven't seen the issue yet on the Windows servers. Using Google Chrome 44.0.2403.130 and ChromeDriver 2.15.322448.

Callmenorm commented 9 years ago

I don't know how significant it is, but it's also our Ubuntu build servers which are affected. We have Fedora, Ubuntu, and OSX systems, but it's only the Ubuntu servers which hang at the ChromeDriver start up.

caninemwenja commented 9 years ago

Ubuntu for me too. Using Chromium 43.0.2357.130 and ChromeDriver 2.16. It works now though (with no apparent change).

PaddyMann commented 9 years ago

We're on Ubuntu too (using Codeship, and their servers are based on Ubuntu Trusty).

borisdiakur commented 9 years ago

We're on Ubuntu as well (updated issue description with system information "Ubuntu 12.04.5 LTS (GNU/Linux 3.13.0-37-generic x86_64)").

Callmenorm commented 9 years ago

The ubuntu servers also use Xvfb (X Virtual Frame Buffer) to use the browser without a window manager.

NickTomlin commented 9 years ago

Interesting; I've experienced some weird hang-ups on Debian (wheezy) when using Xvfb. They seemed to be due to the browser not being able to attach to the headless display. Is everyone here attempting to run things via Xvfb?

tullmann commented 9 years ago

If you're having problems racing with Xvfb startup try using xdpyinfo to wait for X to be ready:

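# Minimal stand-in for the LOG helper, which this excerpt does not define.
LOG() { echo "$@" >&2; }

MAX=120 # 120 polls x 0.5 s = about 60 seconds
CT=0
while ! xdpyinfo >/dev/null 2>&1; do
    sleep 0.50s
    CT=$(( CT + 1 ))
    if [ "$CT" -ge "$MAX" ]; then
        LOG "FATAL: $0: Gave up waiting for X server $DISPLAY"
        exit 11
    fi
done

LOG "X is available"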

Also, you can get more logs from chromedriver itself. Those often have more details about what went wrong (but often don't have enough useful info, either). Often we have to look into the chrome debug logs, too. See http://stackoverflow.com/questions/31662828/how-to-access-chromedriver-logs-for-protractor-test for suggestions on how to get additional chromedriver logs.

jrust commented 9 years ago

Yup, using xvfb-run on ubuntu and (obviously) not using it on the windows build servers so that sounds like a definite possibility. @tullmann do you use your script in conjunction with xvfb-run or Xvfb?

Callmenorm commented 9 years ago

We have put xvfb in our init scripts, so as far as I know it is running soon after startup. Unless there is something else that needs to be done to make sure that it's "Ready" to accept connections, then I think xvfb is "ready" in my case.

borisdiakur commented 9 years ago

@tullmann We also have xvfb running from the beginning and some of the tests run fine before the hanging. But thanks for the chromedriver logs hint. Will try that as soon as possible.

ArwinLin commented 9 years ago

I'm using Windows Server 2012. If I run webdriver-manager start in the foreground there is no problem. But if I run it in the background, similar "hanging behavior" happens: about 1.5 minutes for each browser. I also tried a standalone selenium server so that I could skip the webdriver-manager start step. The test runs without problems if I start it locally. But if I use psexec from another machine to run the test remotely, it runs in the background and the hanging behavior happens again.

borisdiakur commented 9 years ago

I managed to enable chrome driver logs (using this method).
Here is the result:

[0,014][INFO]: COMMAND InitSession {
   "desiredCapabilities": {
      "browserName": "chrome",
      "count": 1,
      "platform": "ANY",
      "version": ""
   }
}
[0,014][INFO]: Populating Preferences file: {
   "alternate_error_pages": {
      "enabled": false
   },
   "autofill": {
      "enabled": false
   },
   "browser": {
      "check_default_browser": false
   },
   "distribution": {
      "import_bookmarks": false,
      "import_history": false,
      "import_search_engine": false,
      "make_chrome_default_for_user": false,
      "show_welcome_page": false,
      "skip_first_run_ui": true
   },
   "dns_prefetching": {
      "enabled": false
   },
   "profile": {
      "content_settings": {
         "pattern_pairs": {
            "https://*,*": {
               "media-stream": {
                  "audio": "Default",
                  "video": "Default"
               }
            }
         }
      },
      "default_content_settings": {
         "geolocation": 1,
         "mouselock": 1,
         "notifications": 1,
         "popups": 1,
         "ppapi-broker": 1
      },
      "password_manager_enabled": false
   },
   "safebrowsing": {
      "enabled": false
   },
   "search": {
      "suggest_enabled": false
   },
   "translate": {
      "enabled": false
   }
}
[0,014][INFO]: Populating Local State file: {
   "background_mode": {
      "enabled": false
   },
   "ssl": {
      "rev_checking": {
         "enabled": false
      }
   }
}
[0,015][INFO]: Launching chrome: /opt/google/chrome/google-chrome --disable-background-networking --disable-client-side-phishing-detection --disable-component-update --disable-default-apps --disable-hang-monitor --disable-prompt-on-repost --disable-sync --disable-web-resources --enable-logging --ignore-certificate-errors --load-extension=/tmp/.com.google.Chrome.NqAJ9w/internal --log-level=0 --metrics-recording-only --no-first-run --password-store=basic --remote-debugging-port=12199 --safebrowsing-disable-auto-update --safebrowsing-disable-download-protection --test-type=webdriver --use-mock-keychain --user-data-dir=/tmp/.com.google.Chrome.LqJPTu data:,
[0,016][DEBUG]: DevTools request: http://127.0.0.1:12199/json/version
[0,077][DEBUG]: DevTools request failed
[1:1:0909/154445:ERROR:image_metadata_extractor.cc(111)] Couldn't load libexif.
Xlib:  extension "RANDR" missing on display ":2".
[0,127][DEBUG]: DevTools request: http://127.0.0.1:12199/json/version
[0,128][DEBUG]: DevTools request failed
[0,178][DEBUG]: DevTools request: http://127.0.0.1:12199/json/version
[0,179][DEBUG]: DevTools request failed
Xlib:  extension "RANDR" missing on display ":2".
[0,229][DEBUG]: DevTools request: http://127.0.0.1:12199/json/version
[0,230][DEBUG]: DevTools request failed
[58787:58787:0909/154445:ERROR:sandbox_linux.cc(345)] InitializeSandbox() called with multiple threads in process gpu-process
[0,280][DEBUG]: DevTools request: http://127.0.0.1:12199/json/version

In my logs folder I count the occurrences in all log files:

$ grep -nr "ERROR:sandbox_linux.cc(345)] InitializeSandbox() called with multiple threads in process gpu-process" . | wc -l
161
$ grep -nr "ERROR:image_metadata_extractor.cc(111)] Couldn't load libexif." . | wc -l
161

tullmann commented 9 years ago

You can hide/avoid the "multiple threads in process gpu-process" by adding --disable-gpu to the chrome command line. Generally when running under XVFB, you're not leveraging a GPU anyway, so disabling it shouldn't be bad. Anyway, I believe this (and the RANDR and the libexif messages) are harmless messages and just distracting.

It's odd that chromedriver isn't timing out (it looks like you're waiting an hour or more?). You might get more useful information from the chrome logs about what's going on. (The "DevTools request" stuff is from chromedriver trying to establish a basic connection to chrome -- it just polls repeatedly until it gets a connection.) It looks like chromedriver receives no reply, but doesn't time out either ... it might be worth comparing this log to a "normal" case to see what the differences are in your setup.

To get more chrome debug logging, add the following arguments to chrome's arguments (in your protractor config): enable-logging, v=1 and user-data-dir=<somedir>, where <somedir> is a new directory private to this run. (You can leave off user-data-dir and chrome will pick a random directory in /tmp, but it can be annoying to figure out which one...)

borisdiakur commented 8 years ago

Seems like enabling chrome logs fixes a race condition. The build job with chrome logging enabled refuses to hang while the build job without logging still hangs on a regular basis.

jrust commented 8 years ago

Also enabled logging and still had it hang, but the log has perhaps something useful at the end:

(google-chrome:13094): GConf-WARNING **: Client failed to connect to the D-BUS daemon:
//bin/dbus-launch terminated abnormally without any error message
[13132:13132:0929/003546:ERROR:sandbox_linux.cc(345)] InitializeSandbox() called with multiple threads in process gpu-process

The gpu-process issue has been discussed earlier, but could the d-bus message be related?

juliemr commented 8 years ago

I'm hoping that this is fixed with the new version of chromedriver in Protractor 2.3.0 and higher. Can anyone confirm?

jrust commented 8 years ago

@juliemr we've had it running for a few days now and a few hundred builds on 2.4 and haven't seen it hang, so yes, that seems to have fixed it.

juliemr commented 8 years ago

Whee! Closing - please open up a new issue if this crops up again.

borisdiakur commented 8 years ago

I'm sorry for the late reply and also sorry for letting you know that the hanging persists with protractor 2.4. Shall I really open up a new issue even if we still do not know what is really the root cause of the problem?

PaddyMann commented 8 years ago

I've also had hanging since 2.4. Oddly, I had had an extended period without any hanging before it happened a few times last week.

balaarunreddyv1 commented 8 years ago

This issue still occurs on protractor 2.5.1.

jas13 commented 8 years ago

I can confirm that this issue still occurs on the most recent versions of Protractor (used versions corresponding to updates for chromedriver & webdriver).

I've been having this problem running on an Ubuntu CI environment with parallel machines ("containers") for each build. Recently ran with --troubleshoot and observed the following.

Output for a successful container:

DEBUG - Running with --troubleshoot
DEBUG - Protractor version: 1.8.0
DEBUG - Your base url for tests is undefined
Using ChromeDriver directly...
[launcher] Running 1 instances of WebDriver
DEBUG - WebDriver session successfully started with capabilities { caps_: 
...

Output for a hanging container:

DEBUG - Running with --troubleshoot
DEBUG - Protractor version: 1.8.0
DEBUG - Your base url for tests is undefined
Using ChromeDriver directly...

[launcher] Running 1 instances of WebDriver
command protractor protractor.conf.js --troubleshoot --suite=container_suite took more than 10 minutes since last output

It appears that a webdriver session never gets started. @juliemr, can this issue be reopened?

jas13 commented 8 years ago

So we were able to ssh in to both a successful container and a hanging container simultaneously and view running processes. This is the output of running ps auxwf.

Successful container:

ubuntu    17473  2.0  0.0  85024  5988 ?        S    15:51   0:26  |   \_ sshd: ubuntu@pts/0  
ubuntu    33618  0.0  0.0  14820  1508 pts/0    Ss+  16:03   0:00  |       \_ /bin/bash ./circle_scripts/test_override.sh
ubuntu    33934  6.7  0.0 728856 85596 pts/0    Rl+  16:03   0:39  |           \_ node /home/ubuntu/nvm/v0.10.33/bin/protractor protractor.conf.js --suite=container_suite
ubuntu    33938  1.8  0.0 378332 10212 pts/0    Sl+  16:03   0:10  |               \_ /home/ubuntu/nvm/v0.10.33/lib/node_modules/protractor/selenium/chromedriver --port=48603
ubuntu    33941  8.1  0.0 710400 90864 pts/0    Sl+  16:03   0:47  |                   \_ /opt/google/chrome/chrome --disable-setuid-sandbox --disable-background-networking --disable-client-side-phishing-detection --disable-component-update --disable-default-apps --disable-hang-mo
ubuntu    33949  0.0  0.0   9664   620 pts/0    S+   16:03   0:00  |                       \_ cat
ubuntu    33950  0.0  0.0   9664   616 pts/0    S+   16:03   0:00  |                       \_ cat
ubuntu    33953  0.0  0.0 341532 28224 pts/0    S+   16:03   0:00  |                       \_ /opt/google/chrome/chrome --type=zygote --enable-logging --log-level=0 --user-data-dir=/tmp/.com.google.Chrome.1GMsD3
ubuntu    33954  0.0  0.0  28012  1972 pts/0    S+   16:03   0:00  |                       |   \_ /opt/google/chrome/nacl_helper
ubuntu    33958  0.0  0.0 341532  8064 pts/0    S+   16:03   0:00  |                       |   \_ /opt/google/chrome/chrome --type=zygote --enable-logging --log-level=0 --user-data-dir=/tmp/.com.google.Chrome.1GMsD3
ubuntu    34001 36.0  0.1 938400 204744 pts/0   Rl+  16:03   3:29  |                       |       \_ /opt/google/chrome/chrome --type=renderer --enable-logging --log-level=0 --test-type=webdriver --lang=en-US --user-data-dir=/tmp/.com.google.Chrome.1GMsD3 --disable-client-side-ph
ubuntu    34025  0.0  0.0 761540 35692 pts/0    Sl+  16:03   0:00  |                       |       \_ /opt/google/chrome/chrome --type=renderer --enable-logging --log-level=0 --test-type=webdriver --lang=en-US --user-data-dir=/tmp/.com.google.Chrome.1GMsD3 --extension-process --en
ubuntu    33994  0.0  0.0 434308 36752 pts/0    Sl+  16:03   0:00  |                       \_ /opt/google/chrome/chrome --type=gpu-process --channel=33941.0.931696333 --enable-logging --log-level=0 --user-data-dir=/tmp/.com.google.Chrome.1GMsD3 --supports-dual-gpus=false --gpu-dri

Hanging container:

ubuntu     4408  2.2  0.0  85472  7140 ?        S    15:48   0:27  |   \_ sshd: ubuntu@pts/0  
ubuntu    20741  0.0  0.0  14820  1508 pts/0    Ss+  15:59   0:00  |       \_ /bin/bash ./circle_scripts/test_override.sh
ubuntu    21057  1.1  0.0 670188 36728 pts/0    Sl+  15:59   0:06  |           \_ node /home/ubuntu/nvm/v0.10.33/bin/protractor protractor.conf.js --suite=container_suite
ubuntu    21061  0.0  0.0 378208  6560 pts/0    Sl+  15:59   0:00  |               \_ /home/ubuntu/nvm/v0.10.33/lib/node_modules/protractor/selenium/chromedriver --port=56618
ubuntu    21064  0.0  0.0 556412 47968 pts/0    Sl+  15:59   0:00  |                   \_ /opt/google/chrome/chrome --disable-setuid-sandbox --disable-background-networking --disable-client-side-phishing-detection --disable-component-update --disable-default-apps --disable-hang-mo
ubuntu    21072  0.0  0.0   9664   616 pts/0    S+   15:59   0:00  |                       \_ cat
ubuntu    21073  0.0  0.0   9664   620 pts/0    S+   15:59   0:00  |                       \_ cat
ubuntu    21076  0.0  0.0 341532 28224 pts/0    S+   15:59   0:00  |                       \_ /opt/google/chrome/chrome --type=zygote --enable-logging --log-level=0 --user-data-dir=/tmp/.com.google.Chrome.1bhwCh
ubuntu    21077  0.0  0.0  28012  1964 pts/0    S+   15:59   0:00  |                       |   \_ /opt/google/chrome/nacl_helper
ubuntu    21080  0.0  0.0 341532  7764 pts/0    S+   15:59   0:00  |                       |   \_ /opt/google/chrome/chrome --type=zygote --enable-logging --log-level=0 --user-data-dir=/tmp/.com.google.Chrome.1bhwCh
ubuntu    21117  0.0  0.0 434308 36912 pts/0    Sl+  15:59   0:00  |                       \_ /opt/google/chrome/chrome --type=gpu-process --channel=21064.0.776113428 --enable-logging --log-level=0 --user-data-dir=/tmp/.com.google.Chrome.1bhwCh --supports-dual-gpus=false --gpu-dri
ubuntu    21118  0.2  0.0 556412 13244 pts/0    S+   15:59   0:01  |                       \_ /opt/google/chrome/chrome --disable-setuid-sandbox --disable-background-networking --disable-client-side-phishing-detection --disable-component-update --disable-default-apps --disable-han

The only discernible difference is the presence of chrome instances with type=renderer in the successful container.
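The comparison above can also be scripted. The sketch below is only an illustration and not something used by anyone in this thread: it reads a process listing on standard input and reports whether a Chrome process with --type=renderer shows up. The function name has_renderer is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the process listing on stdin contains a
# Chrome renderer process (a command line with --type=renderer).
# The [r] bracket keeps this grep from matching its own command line when it
# is fed directly from ps.
has_renderer() {
    grep -q -e '--type=[r]enderer'
}

# Example usage on a CI machine (commented out so the snippet has no side effects):
# if ps auxww | has_renderer; then echo "renderer started"; else echo "no renderer yet"; fi
```

A missing renderer some time after launch would match the hanging container shown above, though on its own it does not explain why Chrome stalled.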

tullmann commented 8 years ago

If you're running into problems with chrome occasionally hanging at startup, it's probably a Chrome bug. See https://code.google.com/p/chromium/issues/detail?id=309093 where Google ran into this problem in their chrome testing setup, and worked around it in their test infrastructure.

Basically, chrome is multi-threaded and relies on some standard gconf libraries. One of those libraries does a fork+exec to start up "dbus" if it is not already running. Doing a fork+exec in a multi-threaded application is bad because you will occasionally fork while a different thread holds the malloc lock (or another critical lock); the child then deadlocks when it tries to acquire that lock, and everything grinds to a halt. Generally on a desktop, dbus is already running, but in many stripped-down test environments dbus does not get started.

Our work-around is to make sure dbus is running by having the script that launches XVFB run its child processes via dbus-launch --exit-with-session. (We also have a script that polls for X to be ready before proceeding; that seems to have helped, but we're less confident it's strictly necessary.)

Here's a lightly modified version of the xvfb wrapper script we use that starts xvfb, dbus, and waits for X to be ready: https://gist.github.com/tullmann/2d8d38444c5e81a41b6d

And here's the waitForX script that depends on: https://gist.github.com/tullmann/476cc71169295d5c3fe6

Callmenorm commented 8 years ago

@tullmann How do you use those scripts? Do you start up xvfb before every run? Or do you have those as part of startup scripts? Are they just wrappers around Xvfb?

tullmann commented 8 years ago

@Callmenorm We use the bb-xvfb script to wrap each call to the protractor script in a private XVFB instance. So if you normally run protractor foo.js, you can do bb-xvfb protractor foo.js to run it under an XVFB instance. (The script is just a wrapper around xvfb-run.)

If you're starting XVFB (or a real X server) in some other way (well before you get around to starting protractor), you will want to use the "dbus-launch --exit-with-session" and/or waitForX scripts as necessary in your environment.
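As a rough sketch of the wrapper idea described above (this is not the bb-xvfb script from the gists, only an illustration of the same approach), a shell function can run a command under xvfb-run and dbus-launch --exit-with-session. The function name run_in_xvfb_with_dbus and the XVFB_RUN and DBUS_LAUNCH override variables are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical wrapper, assuming xvfb-run and dbus-launch are installed.
# XVFB_RUN and DBUS_LAUNCH are made-up override variables so the binaries can
# be swapped out, e.g. for a dry run.
run_in_xvfb_with_dbus() {
    # -a (--auto-servernum) asks xvfb-run to pick a free display number.
    # dbus-launch --exit-with-session starts a session bus for the command.
    "${XVFB_RUN:-xvfb-run}" -a "${DBUS_LAUNCH:-dbus-launch}" --exit-with-session "$@"
}

# Example (commented out): run Protractor in a private Xvfb + D-Bus session.
# run_in_xvfb_with_dbus protractor foo.js
```

If Xvfb is already running as a service, the xvfb-run part would be dropped and only the dbus-launch part kept.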

borisdiakur commented 8 years ago

Thanks @tullmann! We will give it a try in the next sprint. Update: Yep, the scripts seem to do the trick. Still, protractor should time out in case of a deadlock during dbus startup.
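Until Protractor or ChromeDriver times out on its own, one option is to bound the whole run from the outside. The sketch below assumes the coreutils timeout command is available; the function name run_with_deadline and the 600-second figure are illustrative choices, not something recommended in this thread.

```shell
#!/bin/sh
# Hypothetical guard: run a command, send SIGTERM if it exceeds LIMIT seconds,
# and SIGKILL 10 seconds later if it is still alive.
# timeout exits with status 124 when the time limit was hit.
run_with_deadline() {
    limit=$1
    shift
    timeout -k 10 "$limit" "$@"
}

# Example (commented out): fail the build after 10 minutes instead of hanging.
# run_with_deadline 600 protractor conf.js || echo "protractor failed or timed out" >&2
```

This only turns a silent hang into a failed build; it does not address the deadlock itself, and stray Chrome processes may still need to be cleaned up afterwards.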

PaddyMann commented 8 years ago

Thank you @tullmann !

I have just integrated your scripts as part of our testing on Codeship. It's now run once, and for the first time in ages all of the tests finished :)

I'll monitor it over the coming week and will shout if any issues, but in the meantime: THANK YOU :)

jas13 commented 8 years ago

Likewise @tullmann's scripts have solved our CI builds timing out randomly. Thanks!

Callmenorm commented 8 years ago

@tullmann, you're a hero.

Sabartius commented 8 years ago

We also have problems with hanging Chrome browsers running in Docker containers, which are used as CI agents (TeamCity). When the container is started, it also starts xvfb as a service and runs for several days. Protractor starts Chrome itself with the "--directConnect=true" option and also starts several browsers in a single test. Some builds run smoothly, some hang indefinitely. @tullmann any idea how I can integrate your scripts?

tullmann commented 8 years ago

@Sabartius make sure when you start xvfb that you also start dbus. I don't think my scripts are specifically useful in your scenario, so you'll need to figure out a different way to make sure dbus is running. It's probably as simple as just having your container run dbus-launch in the right place (see the man page for more details).
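One way to do that in a long-lived container, sketched under assumptions: start a session bus once with dbus-launch --sh-syntax and import the variables it prints, so that later Chrome launches find a running bus. The names start_session_bus and DBUS_LAUNCH are hypothetical, and where this belongs in a container's start-up is environment specific.

```shell
#!/bin/sh
# Hypothetical helper. dbus-launch --sh-syntax prints Bourne-shell assignments
# for DBUS_SESSION_BUS_ADDRESS and DBUS_SESSION_BUS_PID; eval imports them
# into the current shell so that child processes inherit them.
start_session_bus() {
    if [ -n "${DBUS_SESSION_BUS_ADDRESS:-}" ]; then
        return 0  # a bus address is already configured; reuse it
    fi
    bus_env=$("${DBUS_LAUNCH:-dbus-launch}" --sh-syntax) || return 1
    eval "$bus_env"
}

# Example (commented out): call once before starting the tests.
# start_session_bus && protractor conf.js
```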

PaddyMann commented 8 years ago

Aww - after 2 weeks without an issue, my last 3 builds have hung.

Haven't yet had time to investigate why. Anyone else seen this @tullmann @jas13 @Callmenorm @borisdiakur ?

PaddyMann commented 8 years ago

Seems to have fixed itself at some point this morning... very strange

amitev commented 8 years ago

@tullmann How do you get (install) dbus-launch? I have installed dbus but there isn't a dbus-launch command there. I'm using debian:jessie based docker container.

tullmann commented 8 years ago

Looks like it's part of the dbus-x11 package:

# apt-cache search dbus-launch
dbus-x11 - simple interprocess messaging system (X11 deps)

dschaller commented 8 years ago

Not sure this is chrome specific. I have the same issue when running against firefox.

amitev commented 8 years ago

It probably isn't because sometimes it hangs even when using remote selenium grid.

jrharshath commented 8 years ago

I'm currently running protractor version 3.0.0, and have been seeing hanging builds specifically when running chrome (with directConnect) on Xvfb. I enabled chromedriver logs and chrome logs using the methods mentioned in this thread earlier.

I see this at the end of chrome's logs:

[9504:9504:0419/185420:ERROR:sandbox_linux.cc(338)] InitializeSandbox() called with multiple threads in process gpu-process
[9504:9504:0419/213227:ERROR:x11_util.cc(82)] X IO error received (X server probably went away)
[9407:9461:0419/213227:WARNING:channel.cc(358)] RawChannel write error

And this at the end of chromedriver's logs:

[9504:9504:0419/185420:ERROR:sandbox_linux.cc(338)] InitializeSandbox() called with multiple threads in process gpu-process
[9504:9504:0419/213227:ERROR:x11_util.cc(82)] X IO error received (X server probably went away)

We start Xvfb at the beginning of our test suite before running four end-to-end test suites - two with Firefox and two with Chrome. Often a build will have already run a few of these suites (maybe even a chrome suite) before hanging. The hanging is not as frequent as others report though - in my last test, one out of ten repeated builds hung.

I'll report back on this thread after trying to run the test suites with --disable-gpu. I'll also try to locate Xvfb logs to see if something went wrong with it.

jrharshath commented 8 years ago

So.. reporting back:

Running with --disable-gpu does not fix the hanging. Also, we're running Xvfb using a simple wrapper around it with pyvirtualdisplay, and getting hold of Xvfb's output would have meant not using that wrapper, so I abandoned that approach.

I went on to run the tests inside a dbus-launch --exit-with-session wrapper, and it all worked out great.

I also came across this bug report: SeleniumHQ/docker-selenium#87. It seems to indicate that simply setting DBUS_SESSION_BUS_ADDRESS to /dev/null should prevent chrome from hanging - I'll test that approach too and report back.

jrharshath commented 8 years ago

It looks like setting DBUS_SESSION_BUS_ADDRESS to /dev/null alone is sufficient to prevent chrome from deadlocking. I ran a whole bunch of iterations of my test suite, and not a single hang.
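In shell terms, that workaround amounts to exporting the variable before Protractor starts, so that ChromeDriver and Chrome inherit it. This is a minimal sketch; conf.js stands in for your own config file.

```shell
#!/bin/sh
# Point the D-Bus session bus address at /dev/null, the workaround reported
# above to keep Chrome from deadlocking at startup.
export DBUS_SESSION_BUS_ADDRESS=/dev/null

# Then start the tests from the same shell (commented out here):
# protractor conf.js
```

On a CI service the same variable can usually be set in the job's environment configuration instead of in a script.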

amitev commented 8 years ago

The solution of @jrharshath worked for me. Thank you very much!