karma-runner / karma

Spectacular Test Runner for JavaScript
http://karma-runner.github.io
MIT License
11.96k stars 1.71k forks source link

Karma tests only run half the time in Docker environment #3359

Open EricZLou opened 5 years ago

EricZLou commented 5 years ago

Are there any known race conditions that happen when Karma is used with Docker? The test runs correctly sometimes. Example: I run karma start and the tests don't run correctly (shown below), but when I run it a second time, it might run correctly (also shown below).

Expected behaviour

The tests run consistently every time the karma server is started with ./node_modules/karma/bin/karma start ./karma.conf.ts. I am in a Docker environment.

Actual behaviour

Sometimes, the tests run to completion correctly. Other times, the tests do not run at all.

When the tests run correctly: image

When the tests don't run correctly: image

Environment Details

johnjbarton commented 5 years ago

There isn't anything that karma-runner can do here. The browser has been loaded and now it is in control of the action. karma is just waiting to hear back; when it doesn't then it kills the process.

You can try setting the ping timeout to a larger value: https://github.com/karma-runner/karma/pull/3355/commits/15ae12f9199635d6586311fd8905bd579dfa1acb Requires v4.3

EricZLou commented 5 years ago

Got it. Thanks a lot for your help once again @johnjbarton

EricZLou commented 5 years ago

Hi @johnjbarton I have upgraded karma to v4.3 and set pingTimeout:10000 in karma.conf.ts:

image

But it still seems to be using 2000ms as the ping timeout instead of 10000ms.

image

Is there anywhere else I would need to set the ping timeout?

EricZLou commented 5 years ago

Also just made an observation that Karma disconnects in the middle of Fetching/Requesting. Could this be the cause of the problem? Is this also related to setting a higher ping timeout?

johnjbarton commented 5 years ago

Please run with --logLevel=debug and verify that the final config shows the ping timeout value you expect.

Also just made an observation that Karma disconnects in the middle of Fetching/Requesting. Could this be the cause of the problem? Is this also related to setting a higher ping timeout?

If one of the files fetched crashes the browser or spins the CPU without yielding to the event-loop, then the browser won't respond to pings. No setting for timeout will fix this: the race in you code needs to be fixed.

Note that CI machines are often much less powerful than desktop machines so tests may pass locally and fail in CI under this race condition.

You can check to see if it happens in the same file each time. Add logging to top-level of files to see progress.

In general any test that fails with timeout should be suspect first: the failing timeout is a symptom not a cause.

EricZLou commented 5 years ago

I'm running the command node .\node_modules\karma\bin\karma start .\core\tests\karma.conf.ts --logLevel=debug with v4.3 and the final config still doesn't seem to be printing out. The output starts with a very large json, then "Loading plugin ... ", then "Processing ... ", etc.

When I Ctrl+f for "timeout", I don't get any relevant results. When should the final config be printed?

Also, Karma disconnects 30 seconds after the Karma server starts, not at any specific file.

johnjbarton commented 5 years ago

The output starts with a very large json,

I guess that is the config. Search for the string Final config.

I think the most likely explanation for your issue is a race in your test or app code. karma is doing all it can do: it's putting your app and test code in the browser with the test framework. Then it's waiting for test results. No one ever shows up. I guess that is because the test or app code is burning CPU, not giving a single event loop turn to socket.io or the test framework.

These cases are hard to solve. I suggest you start by commenting out as much of your test as possible, even all of it, just a single console.log() call. If that still fails, cut out your app code 1/2 at a time. And so on to isolate the race.

EricZLou commented 5 years ago

Ah yes. The output was too large so it was getting cropped. I see the final config now. The pingTimeout is indeed set to what the config has. I notice that the default value for the pingTimeout (when I don't specify it in the config) is 5000. However, the error screenshots above say 2000ms (which is neither 5000 nor the 10000 I specify in the config).

And thanks for your recommendations! I'll try that out.

semanio commented 4 years ago

I am posting a comment because I'm having a similar issue with Karma inside Docker where my junit-reporter wasn't creating an XML file. I have adjusted the pingTimeout in Karma v5.0.8 and I'm still getting (with a larger test suite (6700 tests)) into a state of when the browser (socket) is having to reconnect before it begins executing/running the tests. When I debug with Karma, I see the following (key) log messages print out just before my test runner starts to execute:

Why that is key

When looking in browser.js inside the reconnect function; that second message is near this line: this.setState(EXECUTING). The only other place that state is set in the file is in the onStart function (which does not get called in this situation). Therefore, I tested adding this line: this.emitter.emit('browser_start', this) to the reconnect function right below this.setState(EXECUTING); I'm now getting the expected results in my junit-reporter. Would this be an acceptable fix? If so, I make a Pull Request. I have made this commit on my fork: https://github.com/semanio/karma/commit/f587b69f3362f5ef2d0a0d4e371b9477e5193a33

Extra context

I have timed it between registering the first socket to these messages and there is about a 34/35 seconds delay (within Docker). Per the first message, the reconnect honors the pingTimeout config; however, it seems like the initial connection is not. As I am writing this, I have yet to track down why this is the case.

johnjbarton commented 4 years ago

Unfortunately the reporters are not designed to be resilient to reconnect. They may assume two browsers exist if the browser_start is sent twice. So we can't send browser_start twice.

I guess we need to introduce another state, CONFIGURING_DISCONNECTED to ensure that EXECUTING_DISCONNECTED always means that we did enter EXECUTING before disconnect. In your scenario, your disconnect should be during CONFIGURING_DISCONNECTED and when you get to reconnect you can verify that state and emit browser_start to enter EXECUTING.

Does that make sense?

semanio commented 4 years ago

@johnjbarton - make sense. I am definitely in CONFIGURING_DISCONNECTED state with this situation. I'll see if I can work up that fix.

To be clear; in this case, it's not calling browser_start at all in this situation without my proposed fix. I am actively testing that out though against a small test suite to be sure. Regardless, what you are saying as the proper fix makes sense.

yogeshkrishnani commented 3 years ago

We're facing the same issue as @semanio mentioned. If a browser gets disconnected during run, and reconnects on the same socket, junit-reporter isn't generating an XML file.

At the end, I see the following error:

ERROR: Step ‘Publish JUnit test result report’ failed: No test report files were found. Configuration error?

@semanio : Were you able to resolve the issue through any other ways? Because I see that the changes suggested by you haven't been merged.

@johnjbarton Can we please look into this scenario? I also believe browser_start event is not called once browser disconnects and reconnects, ideally it should.

semanio commented 3 years ago

@yogeshkrishnani I got sidetracked at work and never worked on the alternative fix that @johnjbarton and I discussed in this reply; I'll see if I can circle back to it in the near future.

johnjbarton commented 3 years ago

I wonder now if this issue is actually one we see elsewhere. Some analysis by a colleague came up with this explanation:

A workaround we have tried is to set the karma pingTimeout irrationally large, 90s.

I think the correct fix is to re-work our socket io code based on the latest implementation.

johnjbarton commented 3 years ago

Note that karma 6x is built on socket.io v3 which has reversed ping/pong code: https://github.com/socketio/engine.io-client/commit/81d7171c6bb4053c802e3cc4b29a0e42dcf9c065