mozilla / geckodriver

WebDriver for Firefox
https://firefox-source-docs.mozilla.org/testing/geckodriver/
Mozilla Public License 2.0

Selenium unable to run multiple geckodriver processes at the same time #670

Closed OndraM closed 7 years ago

OndraM commented 7 years ago

Firefox Version

"latest" (53.0)

Platform

Linux (Travis) Selenium 3.4.0 Geckodriver 0.16.0

Selenium started in standalone mode inside Xvfb.

Steps to reproduce

1) Multiple tests are executed by Selenium server in parallel (on Travis CI).
2) Sometimes Selenium fails with connection refused (even before a new session is created) because of a geckodriver error.

See this build for example: https://travis-ci.org/OndraM/steward-example/builds/224527634 - there are 5 identical jobs. Sometimes Selenium server throws connection refused error and this stacktrace from Geckodriver:

Fatal error: Uncaught Facebook\WebDriver\Exception\UnknownServerException: connection refused
Build info: version: '3.4.0', revision: 'unknown', time: 'unknown'
System info: host: 'testing-docker-dddd96a0-0fd1-490d-b187-a8ff171294f6', ip: '172.17.0.8', os.name: 'Linux', os.arch: 'amd64', os.version: '4.8.12-040812-generic', java.version: '1.8.0_121'
Driver info: driver.version: FirefoxDriver
remote stacktrace: stack backtrace:
   0:           0x4f99ad - backtrace::backtrace::trace::h45ace4059cd74233
   1:           0x4f9e92 - backtrace::capture::Backtrace::new::hb5a725a088a2a2fc
   2:           0x434469 - webdriver::error::WebDriverError::new::h1643cda523229127
   3:           0x43f3fb - geckodriver::marionette::MarionetteHandler::create_connection::h3eac9ab4802e2cd0
   4:           0x442539 - <geckodriver::marionette::MarionetteHandler as webdriver::server::WebDriverHandler<geckodriver::marionette::GeckoExtensionRoute>>::handle_command::hec53c2ea4656249d
   5:           0x434924 - webdriver::server::start::{{closure}}::h in /home/travis/build/OndraM/steward-example/selenium-tests/vendor/facebook/webdriver/lib/Exception/WebDriverException.php on line 114

Facebook\WebDriver\Exception\UnknownServerException: connection refused
Build info: version: '3.4.0', revision: 'unknown', time: 'unknown'
System info: host: 'testing-docker-dddd96a0-0fd1-490d-b187-a8ff171294f6', ip: '172.17.0.8', os.name: 'Linux', os.arch: 'amd64', os.version: '4.8.12-040812-generic', java.version: '1.8.0_121'
Driver info: driver.version: FirefoxDriver
remote stacktrace: stack backtrace:
   0:           0x4f99ad - backtrace::backtrace::trace::h45ace4059cd74233
   1:           0x4f9e92 - backtrace::capture::Backtrace::new::hb5a725a088a2a2fc
   2:           0x434469 - webdriver::error::WebDriverError::new::h1643cda523229127
   3:           0x43f3fb - geckodriver::marionette::MarionetteHandler::create_connection::h3eac9ab4802e2cd0
   4:           0x442539 - <geckodriver::marionette::MarionetteHandler as webdriver::server::WebDriverHandler<geckodriver::marionette::GeckoExtensionRoute>>::handle_command::hec53c2ea4656249d
   5:           0x434924 - webdriver::server::start::{{closure}}::he5e71944552dea53
   6:           0x405f87 - std::panicking::try::do_call::h061d4025362f1291
   7:           0x5b567a - panic_unwind::__rust_maybe_catch_panic
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-musl-linux/build/src/libpanic_unwind/lib.rs:98
   8:           0x416fd7 - <F as alloc::boxed::FnBox<A>>::call_box::hf93806550e6c682e
   9:           0x5ade94 - alloc::boxed::{{impl}}::call_once<(),()>
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-musl-linux/build/src/liballoc/boxed.rs:624
                         - std::sys_common::thread::start_thread
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-musl-linux/build/src/libstd/sys_common/thread.rs:21
                         - std::sys::imp::thread::{{impl}}::new::thread_start
                        at /buildslave/rust-buildbot/slave/stable-dist-rustc-musl-linux/build/src/libstd/sys/unix/thread.rs:84 in /home/travis/build/OndraM/steward-example/selenium-tests/vendor/facebook/webdriver/lib/Exception/WebDriverException.php on line 114

I've only been able to reproduce the issue when multiple tests are executed by Selenium in parallel. When only one test is running at a time, the problem doesn't occur.

Please let me know if I can provide some more information, I'm glad to help. Also you can maybe find something more in output of the Travis build: https://travis-ci.org/OndraM/steward-example/jobs/224527635

Thanks! (And also thanks for Geckodriver!)

andreastt commented 7 years ago

My guess is that Selenium starts geckodriver on the default port, which is 4444. It should start it on port 0, have the system atomically allocate a free port, and then parse stdout for this line:

% geckodriver --port 0
1492862442487   geckodriver     INFO    Listening on 127.0.0.1:33327

We should get this fixed in Selenium so multiple geckodriver processes can be used at once.
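A minimal sketch of the parsing step described above, in Python. It assumes the log line keeps the `Listening on HOST:PORT` shape shown in the sample output; `parse_listening_port` is a hypothetical helper name, not part of Selenium or geckodriver.

```python
import re

# Matches the address at the end of a line such as
# "1492862442487   geckodriver     INFO    Listening on 127.0.0.1:33327".
# The pattern is an assumption based on that single sample line.
LISTENING_RE = re.compile(r"Listening on (\S+):(\d+)\s*$")


def parse_listening_port(line):
    """Return (host, port) if the line announces the listening address, else None."""
    match = LISTENING_RE.search(line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))
```

A client could start `geckodriver --port 0` with stdout piped and feed each line to this helper until it returns a value. This only works while the log level lets INFO lines through.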

andreastt commented 7 years ago

Which language binding are you using?

OndraM commented 7 years ago

Hi, https://github.com/facebook/php-webdriver. I don't use Geckodriver directly, though, but as a RemoteDriver through Selenium server, so I suppose the JsonWire dialect should be translated to the W3C one by the server.

twalpole commented 7 years ago

@andreastt The issue with that approach is the "Listening on 127.0.0.1:33327" message is only output if log level is 'info' or more detailed. Maybe geckodriver should always output that message to make the setting of port 0 usable.

AutomatedTester commented 7 years ago

@twalpole port 0 is an OS shortcut for "give me a free port", so the logging doesn't really matter. How this is handled is down to the client binding.

closing as we can't change anything here.
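To illustrate the port 0 behaviour with plain sockets rather than geckodriver itself, here is a small Python sketch. `bind_to_free_port` is a hypothetical helper name; the point is only that the process doing the bind can ask the OS which port it received.

```python
import socket


def bind_to_free_port(host="127.0.0.1"):
    """Bind a listening TCP socket to port 0 and return (socket, assigned_port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0: the OS picks an unused port
    sock.listen(1)
    return sock, sock.getsockname()[1]


first, first_port = bind_to_free_port()
second, second_port = bind_to_free_port()
print(first_port, second_port)  # two different port numbers
first.close()
second.close()
```

Only the process that owns the socket can call `getsockname()`, so a separate client still needs some channel (log output or otherwise) to learn the assigned port.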

OndraM commented 7 years ago

@AutomatedTester So this is an issue of Selenium server? Should it be reported here?

AutomatedTester commented 7 years ago

Yes.

Quick example of how you can have multiple geckodrivers running is doing

dburns in ~ λ geckodriver --port 0 &
[1] 11485
dburns in ~ λ 1493937230219 geckodriver INFO    Listening on 127.0.0.1:49190

dburns in ~ λ geckodriver --port 0 &
[2] 11502
1493937236036   geckodriver     INFO    Listening on 127.0.0.1:49191

twalpole commented 7 years ago

@AutomatedTester I understand that port 0 means give me a free port. However, if the actual port used isn't output to the log (which it won't be if the logging level is set to be more restrictive than INFO) it becomes useless, since the client will not be able to know which port it needs to connect on.

jgraham commented 7 years ago

That's technically untrue, which is obviously the best kind of untrue :) For example on linux you can get the PID for the process you started, check /proc/$PID/fd/ for a link of the form socket:[$INODE] and then check /proc/$PID/net/tcp for a (hex) local address of the form 0100007F:$PORT with a matching $INODE. I'm sure it's also possible on Windows (maybe by invoking netstat).

I tend to agree that we should consider printing the port unconditionally, however.
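The /proc lookup described above could be sketched in Python roughly as follows. This is a Linux-only illustration under a few assumptions: the helper names are hypothetical, only IPv4 (`/proc/$PID/net/tcp`, not `tcp6`) is handled, and the address decoding assumes a little-endian host.

```python
import os
import re
import socket
import struct

SOCKET_LINK_RE = re.compile(r"socket:\[(\d+)\]")
TCP_LISTEN = "0A"  # state code for LISTEN in /proc/net/tcp


def socket_inodes(pid):
    """Collect inode numbers of all sockets held open by the process."""
    inodes = set()
    fd_dir = f"/proc/{pid}/fd"
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the descriptor was closed while we were looking
        match = SOCKET_LINK_RE.fullmatch(target)
        if match:
            inodes.add(match.group(1))
    return inodes


def listening_addresses(tcp_table, inodes):
    """Return (ip, port) pairs for LISTEN rows whose inode is in `inodes`."""
    found = []
    for row in tcp_table.splitlines()[1:]:  # first line is the column header
        fields = row.split()
        if len(fields) < 10:
            continue
        local, state, inode = fields[1], fields[3], fields[9]
        if state != TCP_LISTEN or inode not in inodes:
            continue
        ip_hex, port_hex = local.split(":")
        # The IPv4 address is a host-order hex word; "<I" assumes little-endian (e.g. x86).
        ip = socket.inet_ntoa(struct.pack("<I", int(ip_hex, 16)))
        found.append((ip, int(port_hex, 16)))
    return found
```

A caller would pass the contents of `/proc/$PID/net/tcp` together with `socket_inodes(pid)`. The inode match matters because that table lists every socket in the network namespace, not only those of the one process.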

cgoldberg commented 7 years ago

it becomes useless, since the client will not be able to know which port it needs to connect on.

you can use lsof to find all ports that geckodriver is listening on:

$ lsof -i -n -P -l | grep gecko | awk '{ print $9 }' | awk -F':' '{print $2}' | uniq
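The pipeline above returns ports for every geckodriver process at once. A hedged Python sketch of the same extraction, narrowed to one PID, might look like this; `ports_for_pid` is a hypothetical helper, and it assumes the default column layout of `lsof -i -n -P -l` output.

```python
def ports_for_pid(lsof_output, pid):
    """Extract listening TCP ports owned by `pid` from `lsof -i -n -P -l` output."""
    ports = set()
    for row in lsof_output.splitlines():
        fields = row.split()
        # Columns: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME [(STATE)]
        if len(fields) < 10 or fields[1] != str(pid):
            continue
        if fields[9] != "(LISTEN)":
            continue
        port = fields[8].rsplit(":", 1)[-1]
        if port.isdigit():
            ports.add(int(port))
    return sorted(ports)
```

The text passed in would come from running the `lsof` command, for example via `subprocess.run([...], capture_output=True, text=True).stdout`. Filtering on the PID of the process the client itself started avoids guessing which port is the newest.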

twalpole commented 7 years ago

@jgraham @cgoldberg Yes - my statement was "technically" untrue :) One could write platform specific code to determine the port (and then in the @cgoldberg provided case attempt to determine which of those ports is the newest one so clients can bind to the correct one) thereby making things much more complicated for each client, or just print the port unconditionally. I'm glad you agree that printing the port unconditionally may be the better solution :)

andreastt commented 7 years ago

I’m sympathetic to the idea that we should find a better way to communicate the assigned port number back to the process that starts geckodriver. The many problems surrounding this are described eloquently in https://eklitzke.org/binding-on-port-zero.

Since geckodriver is an HTTP server, it’s not an option to use a Unix domain socket as advised in that blog post, but we should consider writing the assigned port to a temporary file somewhere. For example it would be possible to make geckodriver write the port to /tmp/geckodriver./port.

The parent process will be able to get the subprocess’ PID, look up the process’ namespace in the system temporary folder, and get the port number. This would make it possible to atomically assign the geckodriver HTTPD to a known free port, and would do away with any potential race conditions.
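As a purely hypothetical illustration of that proposal (geckodriver does not do this in the thread as written), the hand-off could be sketched in Python as below. The function names and the file layout are assumptions; the caller supplies the path.

```python
import os
import tempfile
import time


def read_port_file(path, timeout=5.0, poll_interval=0.05):
    """Poll `path` until it holds a port number, then return it as an int."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(path) as handle:
                text = handle.read().strip()
            if text.isdigit():
                return int(text)
        except FileNotFoundError:
            pass  # the server has not written the file yet
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no port written to {path} within {timeout}s")
        time.sleep(poll_interval)


def write_port_file(path, port):
    """Server side: write to a temp file, then rename so readers never see a partial file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(str(port))
    os.replace(tmp_path, path)
```

The parent still has to wait for the file to appear, hence the polling loop. The rename step is there so a reader never observes a half-written port number.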
