teampoltergeist / poltergeist

A PhantomJS driver for Capybara
MIT License

Capybara::Poltergeist::TimeoutError #375

Closed dlandberg closed 8 years ago

dlandberg commented 11 years ago

I've got a Capybara::Poltergeist::TimeoutError with a simple line of code:

visit '/'

The debug output:

{"name"=>"set_debug", "args"=>[true]}
{"response"=>true}
{"name"=>"render", "args"=>["test.png", false]}
{"response"=>true}
{"name"=>"visit", "args"=>["http://127.0.0.1:54242/"]}
poltergeist [1376968991944] state default -> loading

Timed out waiting for response to {"name":"visit","args":["http://127.0.0.1:54242/"]}. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the Poltergeist :timeout option to a higher value will help (see the docs for details). If increasing the timeout does not help, this is probably a bug in Poltergeist - please report it to the issue tracker. (Capybara::Poltergeist::TimeoutError)

I'm not sure that it's a poltergeist bug. It seems like the following code in web_socket_server.rb raises the timeout exception:

IO.select([socket], [], [], timeout) or raise Errno::EWOULDBLOCK
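
For readers unfamiliar with this idiom, here is a minimal standalone sketch of how it behaves. It uses a plain pipe instead of Poltergeist's WebSocket, and the `wait_readable` helper name is made up for illustration: `IO.select` returns `nil` when nothing becomes readable within the timeout, which is what triggers the `raise`.

```ruby
# Hypothetical helper wrapping the same select-or-raise idiom.
def wait_readable(io, timeout)
  # IO.select returns nil if `io` is not readable within `timeout` seconds.
  IO.select([io], [], [], timeout) or raise Errno::EWOULDBLOCK
end

reader, writer = IO.pipe

# Nothing has been written yet, so the wait times out and raises.
timed_out = begin
  wait_readable(reader, 0.1)
  false
rescue Errno::EWOULDBLOCK
  true
end

# Once data is available, the same call returns the ready descriptors.
writer.write("pong")
ready = wait_readable(reader, 0.1)
```

In other words, the exception only says that no reply arrived on the socket in time; it does not say why.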

My environment: Mac OS Lion 10.7.5 phantomjs-1.9.1 poltergeist-1.3.0

larslevie commented 10 years ago

I had a solution that resolved/mitigated this issue. In my spec_helper I used an around block to clear session data after each example. But when I moved to Rails 4 it totally horked rspec, which now just hangs right after launch, without running a single example.

  config.before(:suite) do
    DatabaseCleaner.clean_with(:truncation)
  end

  config.around(:each) do |example|
    # Use really fast transaction strategy for all
    # examples except `js: true` capybara specs
    DatabaseCleaner.strategy = example.metadata[:js] ? :truncation : :transaction

    # Start transaction
    DatabaseCleaner.start

    # Run example
    example.run

    # Rollback transaction
    DatabaseCleaner.clean

    # Clear session data
    Capybara.reset_sessions!
  end

pboling commented 10 years ago

@MinutemanZ : You need to reset capybara sessions before you do the database clean.

    # Clear session data
    Capybara.reset_sessions!

    # Rollback transaction
    DatabaseCleaner.clean

I don't know if that's related to your problem, but it is what seems to be the consensus.
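
Applied to the around hook from the earlier comment, that ordering might look like the sketch below. The `begin`/`ensure` wrapper and the `require` lines are additions that appear in neither snippet; they are assumptions about a typical setup, not a confirmed fix.

```ruby
require 'capybara/rspec'
require 'database_cleaner'

RSpec.configure do |config|
  config.around(:each) do |example|
    # Fast transactions everywhere except `js: true` specs, as before.
    DatabaseCleaner.strategy = example.metadata[:js] ? :truncation : :transaction
    DatabaseCleaner.start

    begin
      example.run
    ensure
      # Reset the browser sessions first, so no page is still
      # talking to the app while the database is being cleaned.
      Capybara.reset_sessions!
      DatabaseCleaner.clean
    end
  end
end
```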

afn commented 10 years ago

We had the same issue: intermittent failures with inexplicable timeouts. Like @michaelrkn, we had a feedback widget on our site; in our case, it was UserVoice. Removing it in the test environment fixed our problems as well.

I'll do some more experimentation to try to get to the bottom of this. My suspicion is that PhantomJS is somehow not resetting its state properly between sessions, and after a while, it becomes unresponsive if there are a sufficient number of pending requests to external services (e.g. UserVoice).

pboling commented 10 years ago

@afn That could be the cause of our intermittent failures as well. I hope you find something before you get all the way to the bottom!

m-o-e commented 10 years ago

For anyone still plagued by this: if you're running on Linux, the headless gem is your friend.

You basically just replace the poltergeist stuff in your spec_helper.rb with:

require 'headless'

h = Headless.new
h.start

Capybara.javascript_driver = :webkit

And make sure xvfb is installed (apt-get install xvfb). From there it was smooth sailing for us.

afn commented 10 years ago

After spending most of the day on this issue, I've come to believe that there's no race condition or anything like that; rather, it's simply timing out trying to load some external resource on the page. The error message (e.g. Timed out waiting for response to {"name":"visit","args":["http://127.0.0.1:54242/"]}) is a bit misleading, since it's not the request to http://127.0.0.1:54242/ that's timing out, but rather the request to the externally-hosted JS file (or whatever).

This may be desirable in some cases (when your tests are truly end-to-end and are expected to fail when some third-party service you depend on is unavailable), but otherwise the best approach is probably to use WebMock to mock external APIs (and mask off things that you don't need for your tests, like UserVoice/Usabilla).

yaauie commented 10 years ago

@afn thank you.

It would be nice to compile some patterns to help people deal with this, whether it is disabling widgets or mocking things to improve load-time within phantomjs.

afn commented 10 years ago

@yaauie What I ended up doing was to use WebMock to prevent access to external services, and mock those services that we specifically need.

We used to explicitly allow connections to external resources in spec/spec_helper.rb:

RSpec.configure do |config|
  config.before(:all, type: :feature) do
    WebMock.allow_net_connect!
  end

  config.after(:all, type: :feature) do
    selenium_requests = %r{/((__.+__)|(hub/session.*))$}
    WebMock.disable_net_connect! :allow => selenium_requests
  end
end

Now we don't; we changed this to:

RSpec.configure do |config|
  config.before(:each, type: :feature) do
    WebMock.disable_net_connect!(:allow_localhost => true)
  end
end

There's one feature spec that depended on an external API, so we've used stub_request to mock that API call.
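
A stub along those lines could look roughly like this. The endpoint URL and response body are hypothetical placeholders, since the real API isn't named here, and the stub only affects requests made by the Ruby process.

```ruby
require 'webmock/rspec'

RSpec.configure do |config|
  config.before(:each, type: :feature) do
    WebMock.disable_net_connect!(allow_localhost: true)

    # Hypothetical external endpoint and payload, for illustration only.
    stub_request(:get, "https://api.example.com/v1/rates")
      .to_return(status: 200,
                 body: '{"rate": 1.0}',
                 headers: { "Content-Type" => "application/json" })
  end
end
```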

The downside, of course, is that we no longer have any tests that ensure that our system works end-to-end including all external dependencies. To solve this, we're planning on adding another test suite that runs against our staging environment, rather than in our CI environment, and which is intended to fail if outside resources we depend on are unavailable.

Best, Tony

EDIT: I'm an idiot. WebMock will stop our Ruby code from talking to outside services, but it won't stop PhantomJS from doing the same.

That said, I think the following will work to prevent PhantomJS from talking to the outside world:

Capybara.register_driver :poltergeist do |app|
  Capybara::Poltergeist::Driver.new(app, timeout: 60, phantomjs_options: ['--proxy-type=socks5', '--proxy=0.0.0.0:0'])
end

PhantomJS apparently doesn't use the proxy when connecting to localhost, so it will only fail (because the proxy address is bad) to connect to other hosts.

lsimoneau commented 10 years ago

I'm not sure I buy that all these errors are actual timeouts loading external scripts. Sometimes the tests fail in under 10 seconds, which isn't enough time for an actual timeout to have occurred, right?

afn commented 10 years ago

@lsimoneau You're right -- I take it all back. The errors became less frequent, but we're still seeing these timeouts intermittently, even with absolutely no connections being made to servers other than localhost.

afn commented 10 years ago

I managed to get the timeout again with debug output enabled in PhantomJS:

Capybara.register_driver :poltergeist do |app|
  Capybara::Poltergeist::Driver.new(app, timeout: 60, phantomjs_options: ['--proxy-type=socks5', '--proxy=0.0.0.0:0', '--debug=true'])
end

and now I'm thinking that it's a PhantomJS bug. The page load seems to be getting stuck:

2014-05-14T21:09:10 [DEBUG] WebPage - setupFrame ""
2014-05-14T21:09:10 [DEBUG] WebPage - evaluateJavaScript "(function() { return (function () {
      return typeof __poltergeist;
    })(); })()"
2014-05-14T21:09:10 [DEBUG] WebPage - evaluateJavaScript result QVariant(QString, "undefined")
2014-05-14T21:09:10 [DEBUG] WebPage - updateLoadingProgress: 21
2014-05-14T21:09:10 [DEBUG] CookieJar - Saved "_closing_time_session=a2srWkFzamlHNDV1bWZXVU1aaTFoTDIvbFp1bXFCVDBqbUY4eEZrcGJjYzRWT01iZXRGd0pGQzRFVVNpaDZlUUlJNHVCalRGL0xYOWJCTlZCQ3BSQ2c9PS0tbDk2YXFyQmJLQWE4U0dZOStkZkZUZz09--2bccd000e366d462b6108752e31c9f76e4921a3d; HttpOnly; domain=127.0.0.1; path=/"
2014-05-14T21:09:10 [DEBUG] CookieJar - Saved "browser.timezone=Etc%2FUTC; expires=Thu, 14-May-2015 21:09:10 GMT; domain=127.0.0.1; path=/"
2014-05-14T21:09:11 [DEBUG] Network - Resource request error: 99 ( "Connection to proxy refused" ) URL: "http://widget.uservoice.com/DQpcgONtvcxXS9T8bem0iw.js"
2014-05-14T21:10:10 [DEBUG] WebPage - evaluateJavaScript "(function() { return (function () {
      return typeof __poltergeist;
    })(); })()"

Notice the 60-second gap after the "Connection to proxy refused" error (which, by the way, is expected: we intentionally set the proxy address to 0.0.0.0 to avoid making connections to outside hosts).
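
If you want to look for stalls like this in a long debug log, a small script can do it. This is only a sketch: `find_gaps` and its `threshold` parameter are names invented for this example, and the sample lines are shortened from the output above.

```ruby
# Matches the leading timestamp of a PhantomJS --debug=true log line.
TIMESTAMP = /\A(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2}) \[/

# Returns [gap_in_seconds, earlier_line, later_line] for each pair of
# consecutive timestamped lines at least `threshold` seconds apart.
# Lines without a timestamp (wrapped JavaScript, for instance) are skipped.
def find_gaps(lines, threshold: 10)
  stamped = lines.map { |line|
    match = TIMESTAMP.match(line)
    [Time.utc(*match.captures.map(&:to_i)), line.chomp] if match
  }.compact

  stamped.each_cons(2)
         .map { |(t1, l1), (t2, l2)| [t2 - t1, l1, l2] }
         .select { |gap, _, _| gap >= threshold }
end

log = [
  '2014-05-14T21:09:10 [DEBUG] WebPage - updateLoadingProgress: 21',
  '2014-05-14T21:09:11 [DEBUG] Network - Resource request error: 99 ( "Connection to proxy refused" )',
  '2014-05-14T21:10:10 [DEBUG] WebPage - evaluateJavaScript "(function() { return (function () {',
  '      return typeof __poltergeist;'
]

gaps = find_gaps(log)
gaps.each { |gap, before, _| puts "#{gap.to_i}s of silence after: #{before}" }
```

To scan a real log, pass `File.readlines(path)` instead of the inline array.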

During a normal run, we see this instead:

2014-05-14T21:11:18 [DEBUG] WebPage - setupFrame ""
2014-05-14T21:11:18 [DEBUG] WebPage - evaluateJavaScript "(function() { return (function () {
      return typeof __poltergeist;
    })(); })()"
2014-05-14T21:11:18 [DEBUG] WebPage - evaluateJavaScript result QVariant(QString, "undefined")
2014-05-14T21:11:18 [DEBUG] WebPage - updateLoadingProgress: 21
2014-05-14T21:11:18 [DEBUG] CookieJar - Saved "remember_user_token=BAhbB1sGaQZJIiIkMmEkMDQkNi9ZUWtaTEt1NG9XcFBCcndvc0VQLgY6BkVU--b240cb945c3753a444591b05477cc0833732faaa; HttpOnly; expires=Wed, 28-May-2014 21:11:18 GMT; domain=127.0.0.1; path=/"
2014-05-14T21:11:18 [DEBUG] CookieJar - Saved "email=agent%2B26%40amitree.com; domain=127.0.0.1; path=/"
2014-05-14T21:11:18 [DEBUG] CookieJar - Saved "_closing_time_session=ZzkwQkZOQmlVK0pzMmE0b3JwL1lpWFM5NmdhNlRqcUlXL0dHSXR6dzEvaHowWU9TOTRvdXBVOEY3Z0lvTG5CS2lNdnRpUVlJbEF4USs4RGxSS1ZuMlJQVDJkVEIrVWc2L0R3Q1R4Rk9oSUs2UFpMZTY1K2l3ZHFMRURPWllodUJrbHYrMm93TzRDZ0FDYkdMblNoSFNtZjAvZFl6K0h6b002NVRTcWZvZmYrNWEzQ2tiVkUzL29ZRDUvOExPcHhJUEF6eEV4VzJpNkg2QkwyWXlNUVZFWTdNdXRldmJTZEVKUkt6WDh3cExPdkFIQWpYS3ZhaVJ2eWplQUJuUnF0Um83YkNuK1RyalQwUmxqK0d2N3pMZjdtQWhhQ21kUFFxYmlsbWx1b2drbisrSzlTQTh5VnE4UUFBOXdoM2ZZVlItLTFDblBkTlFuUVRXaTFURTVvaC9nMkE9PQ%3D%3D--15611db3a5bee80bbcc703939b905982101d9db3; HttpOnly; domain=127.0.0.1; path=/"
2014-05-14T21:11:18 [DEBUG] CookieJar - Saved "browser.timezone=Etc%2FUTC; expires=Thu, 14-May-2015 21:11:18 GMT; domain=127.0.0.1; path=/"
2014-05-14T21:11:18 [DEBUG] Network - Resource request error: 99 ( "Connection to proxy refused" ) URL: "http://widget.uservoice.com/DQpcgONtvcxXS9T8bem0iw.js"
2014-05-14T21:11:18 [DEBUG] WebPage - updateLoadingProgress: 100

Time to start digging into phantomjs to figure out how it might be getting stuck. Anyone have any thoughts on this?

pboling commented 10 years ago

@afn Are you on phantomjs 1.9.7 installed via homebrew?

afn commented 10 years ago

@pboling I have 1.9.7 via homebrew on my dev box, but I haven't been able to reproduce the problem there recently. Where I have been able to reproduce it pretty consistently is on CircleCI, running PhantomJS 1.9.7 on Ubuntu.

lsimoneau commented 10 years ago

@pboling same here, haven't managed to reproduce locally, but happens fairly regularly on CircleCI

afn commented 10 years ago

Starting to dive into this from the phantomjs end. Haven't gotten very far yet, but FWIW strace is showing that it's stuck selecting from file descriptors 3 and 9; lsof shows:

phantomjs 9894 ubuntu    0u   CHR   136,1      0t0       4 /dev/pts/1
phantomjs 9894 ubuntu    1w  FIFO     0,8      0t0 3202544 pipe
phantomjs 9894 ubuntu    2u   CHR   136,1      0t0       4 /dev/pts/1
phantomjs 9894 ubuntu    3r  FIFO     0,8      0t0 3202546 pipe
phantomjs 9894 ubuntu    4w  FIFO     0,8      0t0 3202546 pipe
phantomjs 9894 ubuntu    5r  FIFO     0,8      0t0 3202548 pipe
phantomjs 9894 ubuntu    6w  FIFO     0,8      0t0 3202548 pipe
phantomjs 9894 ubuntu    7r  FIFO     0,8      0t0 3223009 pipe
phantomjs 9894 ubuntu    8w  FIFO     0,8      0t0 3223009 pipe
phantomjs 9894 ubuntu    9u  IPv4 3202550      0t0     TCP 127.0.0.1:57318->127.0.0.1:59770 (ESTABLISHED)
phantomjs 9894 ubuntu   10r  FIFO     0,8      0t0 3223331 pipe
phantomjs 9894 ubuntu   11w  FIFO     0,8      0t0 3223331 pipe
phantomjs 9894 ubuntu   14r  FIFO     0,8      0t0 3223022 pipe
phantomjs 9894 ubuntu   15w  FIFO     0,8      0t0 3223022 pipe
phantomjs 9894 ubuntu   21u   REG    0,49     3072 1012555 /home/ubuntu/.qws/share/data/Ofi Labs/PhantomJS/http_127.0.0.1_58526.localstorage

FD 9 is Poltergeist, and FD 3 is a pipe whose other end is FD 4. Anyone know what this pipe is for? I imagine another thread would be writing to this pipe when the page finishes loading?

Here are the threads currently running:

(gdb) info threads
  Id   Target Id         Frame
  5    Thread 0x7f359dc0d700 (LWP 9895) "phantomjs" 0x00007f359f64ad84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
  4    Thread 0x7f359d40c700 (LWP 9896) "QThread" 0x00007f359eb5a763 in select () from /lib/x86_64-linux-gnu/libc.so.6
  3    Thread 0x7f3556e74700 (LWP 9901) "QThread" 0x00007f359f64ad84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
  2    Thread 0x7f3557b9b700 (LWP 10030) "QThread" 0x00007f359eb5a763 in select () from /lib/x86_64-linux-gnu/libc.so.6
* 1    Thread 0x7f35a0330740 (LWP 9894) "phantomjs" 0x00007f359eb5a763 in select () from /lib/x86_64-linux-gnu/libc.so.6

Unfortunately I can't get much deeper because the phantomjs binary I'm using doesn't have debug symbols. I'm recompiling from source now in debug mode. Hopefully the bug doesn't go away in debug mode :-D

afn commented 10 years ago

OK, so since this seems to be a phantomjs bug, rather than a Poltergeist one, I've opened an issue there: https://github.com/ariya/phantomjs/issues/12234.

zpieslak commented 10 years ago

In case anyone is interested, this configuration resolved my issues with timeouts:

Capybara.javascript_driver = :poltergeist
options = { js_errors: false, timeout: 180, phantomjs_logger: StringIO.new, logger: nil, phantomjs_options: ['--load-images=no', '--ignore-ssl-errors=yes'] }
Capybara.register_driver(:poltergeist) do |app|
  Capybara::Poltergeist::Driver.new app, options
end

shahzebkhan commented 10 years ago

I can confirm that @zpieslak 's solution works.

shahzebkhan commented 10 years ago

Never mind. It seemed to work for a few days, but now we are seeing the same error again. Weird.

AlreadyTalk commented 10 years ago

After some testing, I found what my problem is: we can't take a screenshot if the server is on the same machine. To explain: when I try to take a screenshot from my own machine (which is also the server, in development), I get the timeout. But if I take a screenshot of the same page while it's running on my colleague's machine, it works. I just don't know how to resolve this. I'm using Poltergeist, Capybara and PhantomJS.

apeniche commented 10 years ago

I'm using the latest version of Poltergeist, PhantomJS and Capybara and my tests randomly fail with this error (Capybara::Poltergeist::TimeoutError) about 25% of the time, is there any solution to this?

rickmzp commented 10 years ago

Have you tried increasing the timeout? It could be that the pages sometimes load slower than expected.

One thing that's helped me debug these kinds of issues is switching the driver to the default Capybara selenium driver that uses Firefox. Have you tried that to see if you get a different error?

apeniche commented 10 years ago

Yeah, I've increased the timeout up to 180 seconds and I get the same output (sometimes it works, sometimes it doesn't, in the same proportion), when it works it takes less than 5 seconds to load... selenium works just fine, but I want to get it working on poltergeist to speed up my test suite

futhr commented 10 years ago

I started to see this randomly as my feature specs got heavier, so my issue matches @apeniche's.

Failures:

  1) Cities index displays the list
     Failure/Error: ensure_on cities_path
     Capybara::Poltergeist::TimeoutError:
       Timed out waiting for response to {"name":"visit","args":["http://127.0.0.1:59837/cities"]}. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the Poltergeist :timeout option to a higher value will help (see the docs for details). If increasing the timeout does not help, this is probably a bug in Poltergeist - please report it to the issue tracker.
     # ./spec/support/capybara.rb:18:in `ensure_on'
     # ./spec/features/cities_spec.rb:10:in `block (3 levels) in <top (required)>'

UPDATE:

I solved my issues after applying these steps:

a) Add a slightly modified version of the snippet above. b) Ensure the specs can be run in offline mode.

  Capybara.register_driver(:poltergeist) do |app|
    Capybara::Poltergeist::Driver.new app,
      js_errors: false,
      timeout: 180,
      phantomjs_logger: Puma::NullIO.new,
      logger: nil,
      phantomjs_options:
      [
        '--load-images=no',
        '--ignore-ssl-errors=yes'
      ]
  end

jure commented 10 years ago

+1, same issue. Worked fine, now suddenly times out every time.

Update: For me it looks like it's a networking issue with Vagrant and DNS resolving. So not really a bug in poltergeist, sorry for the false alarm.

afn commented 10 years ago

FYI, I posted a workaround in https://github.com/ariya/phantomjs/issues/12234 (which, incidentally, is probably the right place to make some noise, as this does appear to be a phantomjs bug).

Here it is: https://gist.github.com/afn/c04ccfe71d648763b306. Pretty simple, really; it just catches the failure, restarts phantomjs, and retries the spec. We haven't had any build failures caused by this bug since putting this into place.
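
For anyone who wants the general shape without opening the gist, the catch/restart/retry idea can be sketched in plain Ruby as below. This is not the gist's code: `DriverTimeout` is a stand-in for Capybara::Poltergeist::TimeoutError so the example runs without any gems, and `retry_with_restart` is a hypothetical helper name.

```ruby
# Stand-in for Capybara::Poltergeist::TimeoutError (assumption for this sketch).
class DriverTimeout < StandardError; end

# Runs the block; on DriverTimeout, calls `restart` and tries again,
# up to `attempts` tries in total. Re-raises once attempts are exhausted.
def retry_with_restart(attempts:, restart:)
  tries = 0
  begin
    tries += 1
    yield tries
  rescue DriverTimeout
    raise if tries >= attempts
    restart.call
    retry
  end
end

# Simulate a spec that hangs twice and then passes.
restarts = 0
result = retry_with_restart(attempts: 3, restart: -> { restarts += 1 }) do |try|
  raise DriverTimeout, "simulated hang" if try < 3
  "passed on try #{try}"
end
```

In a real suite the `restart` callable would presumably restart the PhantomJS process instead of incrementing a counter.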

ka8725 commented 10 years ago

@zpieslak, your config's solved my issue. Thanks!

mckinnsb commented 10 years ago

Just so everyone knows, for me it was the --ignore-ssl-errors configuration that solved the issue over here. We had a development asset server that had some wonky cert problems, and that was causing the error.

cbrwizard commented 10 years ago

Yo! I faced the same problem and tried everything to solve it, but the bug seemed to appear completely at random. It turned out that when I didn't start the private_pub process in a terminal (with the command rackup private_pub.ru -s thin -E production), all tests passed without problems. It looks like this bug is somehow related to Faye and WebSockets.

bleargh45 commented 10 years ago

@cbrwizard after seeing your note about WebSockets, I went and had a look at what we were doing and why our application hangs consistently with Poltergeist/PhantomJS, and saw similar... if I commented out the chunk of JS we use to setup our EventSource, things didn't stall any more. While I've now got failing tests, I can second the notion that this is somehow related to how those long-held connections are being handled.

rafaeljesus commented 10 years ago
route commented 10 years ago

Guys with WebSocket problems could you please create a simplified repo with this bug? I'll be glad to take a look and fix it.

claptimes5 commented 10 years ago

I used brew and uninstalled and reinstalled phantomjs. The spec started working again.

robwierzbowski commented 10 years ago

Having similar issues here on an app that uses long polling or websockets. My first thought was it could be related to Capybara's automatic waiting and JS constantly evaluating, but IIRC the auto waiting polls, so that's probably not it.

dgehrett commented 10 years ago

I had similar issues as well: random timeouts and randomly failing tests. None of these proposed solutions wound up working for me, but then I had noticed that it seemed the phantomjs (1.9.7) process was gobbling up more and more memory as my test suite ran. I figured it was probably a memory issue causing the random failures and timeouts.

The only thing that worked for me was restarting phantomjs in between test runs:

page.driver.restart if page.driver.respond_to? :restart

g-ilham commented 10 years ago

I'm using these versions (from Gemfile.lock):

mime-types (2.4.2)
capybara (2.2.1)
poltergeist (1.5.1)
websocket-driver (0.3.5)

I added a check to the test environment for some plug-in in my views:

 - unless Rails.env.test?

I also added:

 after { Capybara.reset_sessions! }

All of this helped to solve the problem. I'm inclined to think that the main cause is the JS plugins: exclude any plug-ins that aren't required in the test environment! Also, try splitting up the tests. Poltergeist often doesn't have enough time to load the entire environment, and that is what leads to these errors. I got rid of them by adding a sleep(1).

ZachBeta commented 10 years ago

We were finding the biggest issue was the size of our assets directory. Since our test suite was compiling assets on request, the first request was often slowed down enough that it would timeout on our leaner build machines but run just fine on our heartier development machines.

We have a number of shell scripts that set up our config files, so we added a simple bundle exec rake assets:precompile 2>/dev/null to precompile our assets and redirect the error output to /dev/null to keep our build logs from getting too noisy.

EDIT: Slightly longer blog post writeup: Breaking the Build (Continuity Control Engineering)

aprescott commented 10 years ago

@ZachBeta I think this should also help avoid that problem:

config.before(:all, type: :feature) do
  visit "/assets/application.css"
  visit "/assets/application.js"
end

It's avoided any problems fairly well where I work.

adayag commented 10 years ago

We have been testing our facebook login flow and have been getting consistent failing tests as of last week. There have been no significant changes in our setup since then and running previously green test suites have resulted in the same failing result.

We are using poltergeist 1.5.1


World(Module.new do
  def window_handles
    page.driver.browser.window_handles
  end

  def within_other_window(&block)
    expect(window_handles).to_not be_empty
    sleep 0.5
    page.within_window window_handles.last, &block
  end
end)

email = "charlie_kxmedmp_fergiesky@xxxxx.net"

When /^I log in via facebook.*$/ do
  expect do
    page.find(".facebook-login-button").click
  end.to change { window_handles.length }

  within_other_window do
    fill_in 'email', with: email
    fill_in 'pass', with: 'password'
    click_button 'Log In'
  end
  window_count = window_handles.length
  expect(page).to have_selector "#bar"
end

Timed out waiting for response to {"name":"push_window","args":["f36b411434"]}. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the Poltergeist :timeout option to a higher value will help (see the docs for details). If increasing the timeout does not help, this is probably a bug in Poltergeist - please report it to the issue tracker. (Capybara::Poltergeist::TimeoutError)
./features/step_definitions/facebook_steps.rb:9:in `within_other_window'
./features/step_definitions/facebook_steps.rb:19:in `/^I log in via facebook.*$/'
./features/support/hooks.rb:25:in `call'
./features/support/hooks.rb:25:in `block in <top (required)>'
features/acquisitions.feature:55:in `When I log in via facebook again'

Capybara.javascript_driver = :poltergeist
Capybara.server_port = 51674
Capybara.default_wait_time = 14
Capybara.register_driver :poltergeist do |app|

  Capybara::Poltergeist::Driver.new(app, timeout: 60, js_errors: true,
                                    phantomjs_logger: Capybara::Poltergeist::FilteringLogger,
                                    phantomjs_options: %w[--load-images=no])
end

We have tried multiple workarounds posted on this thread and others but to no avail. It was previously intermittent within an acceptable failure rate but as we haven't had a green build in a week we are unsure what to do. Any advice would be appreciated.

jmccartie commented 9 years ago

@futhr 's solution fixed it for me. Started happening after I switched from Unicorn to Puma

agenteo commented 9 years ago

@ZachBeta tried your solution, looked really promising. After 3 days it's back.

My timeout is set to 180.

To @aprescott and whoever posted a solution: I'd love to hear how long these flaky tests disappeared for.

If you run 10/15 builds a day you should wait at least one or two weeks before calling it gone. This is how we're trying to keep track of it http://teotti.com/a-process-to-identify-and-monitor-inconsistently-failing-automated-tests/#document-the-flaky-test

To others seeing this flaky test perhaps https://github.com/dblock/rspec-rerun or https://github.com/y310/rspec-retry might help.
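
As a rough idea of what the rspec-retry route involves, a configuration along the lines of that gem's README is sketched below. The retry count of 3 and the restriction to `:js` examples are arbitrary choices for illustration, and retrying hides the flakiness without fixing it.

```ruby
require 'rspec/retry'

RSpec.configure do |config|
  # Print a line each time an example is retried, so flaky specs stay visible.
  config.verbose_retry = true

  # Only retry JavaScript-driven examples; 3 attempts is an arbitrary choice.
  config.around(:each, :js) do |example|
    example.run_with_retry retry: 3
  end
end
```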

claptimes5 commented 9 years ago

I've had this issue in the past and was wondering if everyone here has been using database_cleaner? I came across a recent issue where database_cleaner left open transactions in the database. This prevented the deletion or truncation calls from completing. Basically they timed out or hung forever.

https://github.com/DatabaseCleaner/database_cleaner/issues/292 https://github.com/DatabaseCleaner/database_cleaner/issues/273

I made a fork that checks (hacks) whether there are any open transactions when deleting/truncating (https://github.com/claptimes5/database_cleaner/commit/347b74b2ae2554e970ff3e9a5f63c3a6a4282f22). It's tough, since the poltergeist failures are intermittent, but can someone check and see if this helps?

shedd commented 9 years ago

@claptimes5 thanks! interesting theory. we are using DatabaseCleaner and I gave your fork a try. Unfortunately, we still hit the timeout errors. It's possible that this may help, but I was unable to confirm that.

claptimes5 commented 9 years ago

@shedd What database are you using? I had to stop/start my instance to reset the sessions.

shedd commented 9 years ago

Postgres. We reproduced the timeouts on our CI service, so it's a fresh instance/container on each test run.

Petercopter commented 9 years ago

:+1: Using Postgres, database_cleaner, Capybara, Poltergeist, and Rails 4. Timing out. Still debugging. Works on Rails 3.x.

andrewhao commented 9 years ago

On our project, none of the above solutions worked until we disabled rack-timeout in test environments: https://github.com/heroku/rack-timeout/issues/55

showaltb commented 9 years ago

The problem for me is related to Ajax. Capybara makes a call to driver.reset! after every scenario. If an Ajax call is still running, I get the timeout error, with all subsequent tests failing.

If I wait for Ajax (wait for page.evaluate_script('$.active > 0') to become false), everything is good.
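
The waiting part can be written as a small polling helper. The sketch below is generic Ruby with a simulated counter standing in for the pending Ajax requests; `wait_until` and its `timeout` and `interval` parameters are names made up for this example.

```ruby
# Polls the block until it returns a truthy value, or raises once
# `timeout` seconds have elapsed without the condition being met.
def wait_until(timeout: 5, interval: 0.05)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    return true if yield
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      raise "condition not met within #{timeout}s"
    end
    sleep interval
  end
end

# Simulated stand-in for in-flight Ajax requests finishing one by one.
pending = 3
polls = 0
finished = wait_until(timeout: 1, interval: 0.01) do
  polls += 1
  pending -= 1 if pending > 0
  pending.zero?
end
```

In a feature spec the block would presumably be something like `wait_until { !page.evaluate_script('$.active > 0') }`, called at the end of each scenario, which assumes jQuery is loaded on the page.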

shedd commented 9 years ago

On our project, none of the above solutions worked until we disabled rack-timeout in test environments: heroku/rack-timeout#55

Thanks @andrewhao - disabling rack-timeout was actually a big help for us!

pmorton commented 9 years ago

Hi all. In our environment I found that the timeout was caused as a result of a lot of assets being rendered. From OSX I was able to use opensnoop (a dtrace tool) to see what files were being opened. Every time poltergeist would timeout, it was a result of the ruby process trying to compile assets. I achieved test stability by pre-compiling assets for test and setting config.assets.digest = true.