joelgriffith / navalia

A bullet-proof, fast, and reliable headless browser API
https://joelgriffith.github.io/navalia/
GNU General Public License v3.0

Test failing: connect ECONNREFUSED 127.0.0.1:XXXXX on CI #20

Closed mpeyper closed 7 years ago

mpeyper commented 7 years ago

I've been evaluating navalia for integration testing on a new project. I really like the API and found it very easy to get up and running, so great job!

I have a basic test that just makes sure our React app has actually loaded:

const { Chrome } = require('navalia')

const testUrl = process.env.TEST_URL || 'http://localhost:3000'

console.log(`TEST_URL: ${testUrl}`)

describe('Entry App', () => {

  let chrome = {}

  beforeEach(() => {
    chrome = new Chrome()
  })

  afterEach(() => chrome.done())

  it('should load entry page', async () => {
    await chrome.goto(testUrl)
    await chrome.wait(500)
    expect(await chrome.exists('#root')).toBe(true)
  }, 30000)
})

I'm having an issue where the test runs fine on my machine (macOS Sierra 10.12.5, Chrome 59.0.3071.115) but it's not working on our CI server. The CI is running in a docker container (RHEL7, Chrome 59.0.3071.115) and the error it shows is:

connect ECONNREFUSED 127.0.0.1:38502
  at Object.exports._errnoException (util.js:1022:11)
  at exports._exceptionWithHostPort (util.js:1045:20)
  at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1146:14)

Interestingly, the test is also failing on another dev's machine (macOS Sierra 10.12.5, Chrome 60.something - don't have the exact version on hand) with a similar error but instead of 127.0.0.1 it has ::1 (IPv6).

So far the only machine it works on is mine (we have only tried these 3 but will be seeking others tomorrow to test), but I have no idea what I've done to be so special.

Any insights you may have are very much appreciated.

joelgriffith commented 7 years ago

Thanks for the feedback, appreciated! I'll have to dig into the handshake that happens at the websocket layer (which is why I think this is failing). Those machines that don't work either aren't opening up their ports for debuggers to connect or something else is blocking the connection.

Sorry you're having issues, hopefully we can get to the bottom of this!

mpeyper commented 7 years ago

Just to clarify, you suspect it's the debugger port for Chrome that's failing the connection?

joelgriffith commented 7 years ago

Yes, when the program starts up it launches Chrome and passes it a flag with the port to open for debugging. My guess is that Chrome isn't respecting that flag (I don't have hard evidence for that, however).
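One way to test that guess is to probe the port directly while the test is still running. The following is a minimal sketch using Node's built-in `net` module; `canConnect` is a hypothetical helper, not part of Navalia, and the port has to be whatever the launcher reports for that run.

```javascript
const net = require('net')

// Hypothetical helper (not part of Navalia): resolves true if something
// accepts TCP connections on host:port, false on error or timeout.
function canConnect(port, host = '127.0.0.1', timeoutMs = 1000) {
  return new Promise((resolve) => {
    const socket = net.connect({ port, host })
    const finish = (result) => {
      socket.destroy() // always release the socket before reporting
      resolve(result)
    }
    socket.setTimeout(timeoutMs, () => finish(false)) // nothing answered in time
    socket.once('connect', () => finish(true))
    socket.once('error', () => finish(false)) // e.g. ECONNREFUSED
  })
}
```

If this resolves `false` for the port in the error message, nothing is listening there, which would point at Chrome failing to start rather than at a firewall or the websocket handshake.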

A similar issue was filed previously, and the solution there was to use Chrome Canary, which supports a wider set of remote functionality. My version outputs:

Google Chrome is up to date
Version 61.0.3156.0 (Official Build) canary (64-bit)

You might also attempt to run your tests like so:

$ DEBUG=* npm run test  # replace with your test command

This will make all the underlying libraries print debug statements, which might give a better indication of where your problem resides.

mpeyper commented 7 years ago

@joelgriffith

The DEBUG=* has produced a bit of a clue (as well as thousands of lines of babel output)

On the CI, the output is

navalia:chrome :getChromeCDP() > starting chrome +0ms
ChromeLauncher Waiting for browser..... +917ms
ChromeLauncher Waiting for browser....... +508ms
ChromeLauncher Waiting for browser......... +528ms
ChromeLauncher Waiting for browser........... +501ms
ChromeLauncher Waiting for browser............. +501ms
ChromeLauncher Waiting for browser............... +502ms
ChromeLauncher Waiting for browser................. +502ms
ChromeLauncher Waiting for browser................... +500ms
ChromeLauncher Waiting for browser..................... +502ms
ChromeLauncher Waiting for browser....................... +501ms
ChromeLauncher:error connect ECONNREFUSED 127.0.0.1:33944 +1ms
ChromeLauncher:error Logging contents of /home/bamboo/.tmp/lighthouse.ZkfpNHk/chrome-err.log +0ms
ChromeLauncher:error [0714/084153.449293:FATAL:zygote_host_impl_linux.cc(107)] No usable sandbox! Update your kernel or see https://chromium.googlesource.com/chromium/src/+/master/docs/linux_suid_sandbox_development.md for more information on developing with the SUID sandbox. If you want to live dangerously and need an immediate workaround, you can try using --no-sandbox.
ChromeLauncher:error  +0ms
navalia:chrome :done() > finished +2ms

We tried starting Chrome with

chrome = new Chrome({
  headless: true,
  disableGpu: true,
  hideScrollbars: true,
  noSandbox: true
})

but it didn't help.

mpeyper commented 7 years ago

We have also tried on another dev's machine and the tests work there too, so not sure what is wrong with the IPv6 one from the first post.

For the record, DEBUG=* output from my machine is

  navalia:chrome :getChromeCDP() > starting chrome +0ms
  ChromeLauncher Waiting for browser. +285ms
  ChromeLauncher Waiting for browser... +0ms
  ChromeLauncher Waiting for browser..... +511ms
  ChromeLauncher Waiting for browser.....✓ +1ms
  navalia:chrome :getChromeCDP() > chrome launched on port 59671 +36ms
  navalia:chrome :goto() > going to http://localhost:3000 +0ms
  navalia:chrome :goto() > waiting for pageload on http://localhost:3000 +1ms
  navalia:chrome :wait() > waiting 500 ms +22ms
  navalia:chrome :exists() > checking if '#root' exists +506ms
  navalia:chrome :done() > finished +4ms
  navalia:chrome :done() > closing chrome +0ms
  ChromeLauncher Killing all Chrome Instances +0ms

joelgriffith commented 7 years ago

Nice trace, that's helpful! Looks like the issue is in the chrome-launcher package that Navalia uses. This issue even seems quite similar: https://github.com/GoogleChrome/lighthouse/issues/2661

Do you know what version of Chrome runs on those CI boxes? My sense is it's coming from a version incompatibility of some kind.

This issue might be similar too: https://github.com/GoogleChrome/lighthouse/issues/2462

mpeyper commented 7 years ago

Chrome version is 59.0.3071.115 (same as the one in that issue)

mpeyper commented 7 years ago

It's the same version I am using on my MacBook though... just to confuse things.

joelgriffith commented 7 years ago

I'm going to be looking into running functional tests with Travis for this project, so I'll see if I can repro there.

mpeyper commented 7 years ago

@joelgriffith

So I have an update... we did something really hacky (don't ask how) and replaced the flags passed to Chrome to force the --no-sandbox flag, and the test passes on the CI box.

When we used noSandbox: true like in the above comment, it had no effect. We're looking into why now.

joelgriffith commented 7 years ago

Definitely possible that Navalia isn't passing them through properly... I'll take a peek

joelgriffith commented 7 years ago

Shoot, think I found it:

chrome = new Chrome({
  headless: true,
  disableGpu: true,
  hideScrollbars: true,
  noSandbox: true
})

Should be:

chrome = new Chrome({
  flags: {
    headless: true,
    disableGpu: true,
    hideScrollbars: true,
    noSandbox: true
  }
})

Just checked, docs did have it right (whew): https://joelgriffith.github.io/navalia/chrome/constructor/
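Since the misplaced keys were ignored without any warning, a small guard in the test setup could catch this kind of slip earlier. This is only a sketch: `findMisplacedFlags` and `KNOWN_FLAGS` are hypothetical names, not part of Navalia, and the list contains just the flag names that appear in this thread.

```javascript
// Flag names taken from the snippets in this thread; the list is illustrative.
const KNOWN_FLAGS = ['headless', 'disableGpu', 'hideScrollbars', 'noSandbox']

// Hypothetical helper (not part of Navalia): returns the flag names that were
// passed at the top level of the options object instead of under `flags`.
function findMisplacedFlags(options = {}) {
  return KNOWN_FLAGS.filter((name) => name in options)
}

// The options shape that had no effect earlier in this thread.
const misplaced = findMisplacedFlags({ headless: true, noSandbox: true })
if (misplaced.length > 0) {
  console.warn(`Move these under "flags": ${misplaced.join(', ')}`)
}
```

Calling it with the corrected shape, `{ flags: { noSandbox: true } }`, returns an empty array, so nothing is logged.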

mpeyper commented 7 years ago

Soooooo... yeah that was it. Removed our hacks and crazy long timeouts, set the flags correctly and we now have a passing integration test on our build.
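Because Chrome's own error message calls --no-sandbox a way to "live dangerously", it may be worth turning it on only where it is needed. A rough sketch, assuming the CI server sets a `CI` environment variable (many do, but that is an assumption about the setup, and `buildChromeOptions` is a made-up helper name):

```javascript
// Hypothetical helper: builds the options object for `new Chrome(...)`,
// adding noSandbox only when a CI environment variable is present.
function buildChromeOptions(env = process.env) {
  const flags = { headless: true, disableGpu: true, hideScrollbars: true }
  if (env.CI) {
    flags.noSandbox = true // assumption: only the CI container lacks a usable sandbox
  }
  return { flags } // flags are nested under `flags`, as in the corrected snippet
}

// Usage in the test file would then look like:
// chrome = new Chrome(buildChromeOptions())
```

Local runs keep the sandbox, and the container gets the workaround without anyone editing the test file.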

Thanks so much @joelgriffith, you can close this one and file it under "user error".

As a side note, we still have one dev's machine that can't run it, but it seems to be something unique to his setup.

joelgriffith commented 7 years ago

Thanks for hunting it down! I need to capture these in a FAQ of some kind. Happy hacking!