Have the global timeout take into account `t.timeout()` durations

spectranaut commented 4 years ago

What you're trying to do

I appreciate that having a global timeout in order to catch unexpected behavior in a test and not consume too much CI resources. However, some tests take a long time intentionally, and I would like to be able to use t.timeout() to explicitly override the global timeouts for just one test. Right now, you can only override the global timeouts if then t.timeout() is shorter than the global timeout. It would be nice if it overrides when t.timeout() is longer as well.

I think I am not the only one, as I found this question referring to the same problem.

Why you can't use AVA for this

There is no way to have a global timeout that is something like 10s and a timeout for a particular test that is longer, like 30s.

And maybe how you think AVA could handle this

I'm not sure how the timeouts are coded, it might be a one-liner :)

novemberborn commented 4 years ago

The global timeout is a "stop tests in case they're stuck" kind of timeout.

I suppose we could interpret that as "stop tests in case they're stuck, but don't worry for the next 30 seconds since the test will fail if it's stuck anyway".

The workers will have to send a message to the main process whenever a per-test timeout starts, with the timeout duration. And then in the main process we'd have to make sure we don't stop tests because they're stuck until that duration has passed.

This should work with multiple t.timeout() calls, of course.

I don't think it's quite a one-liner, but this approach is consistent with what the global timeout is there for, so let's do it.

sramam commented 4 years ago

How is described behavior different from removing the global timeout and setting a default timeout on every test?

Basically,

remove global timeout
by default, t.timeout(10000)
advertise this and let the user modify as necessary.

The net effect is the same IMHO.

novemberborn commented 4 years ago

When tests run concurrently, within and across worker processes, their duration may vary wildly. The global timeout (misnamed as it may be) is only concerned with making sure something is happening in your test suite.

Individual timeouts can be useful in certain cases, but far from all.

sramam commented 4 years ago

Disclaimer: I have now found my peace with the nested timeout and can live with the timeout change.

However, it seems my initial concerns are being experienced/reported by others. Hopefully, the example below provides an insight.

The output below is meant to be slightly hyperbolic, but convey the confusion I sensed with the timeout change on first blush.

$ npx ava dist/sdk/test/*.js

⠴ Server listening at http://127.0.0.1:54369

  ✖ Timed out while running tests

  3 tests were pending in dist/sdk/test/01_core-wait-for-registry.js

  ◌ 01_core-wait-for-registry › inProgress
  ◌ 01_core-wait-for-registry › isErrored
  ◌ 01_core-wait-for-registry › isAlive

  18 tests were pending in dist/sdk/test/01_core.js

  ◌ 01_core › waitFor server-state tests
  ◌ 01_core › getAllProviderMetadata !refresh
  ◌ 01_core › getAllProviderMetadata
  ◌ 01_core › getProviderMetadata aws
  ◌ 01_core › getRelativePath 0.0.2-rc1
  ◌ 01_core › getRelativePath abra
  ◌ 01_core › getRelativePath git-url#0.0.2-rc1
  ◌ 01_core › getRelativePath git-url#hash
  ◌ 01_core › getRelativePath file#hash
  ◌ 01_core › getRelativePath non-existant version
  ◌ 01_core › getRelativePath non-existant hash
  ◌ 01_core › getProviderMetadata aws w/ refresh
  ◌ 01_core › getProviderMetadata bad provider
  ◌ 01_core › getProviderMetadata github@0.0.1
  ◌ 01_core › getProviderMetadata aws@1.23.0 (non-existant version)
  ◌ 01_core › publish invalid type
  ◌ 01_core › shutdown server - work from cache after
  ◌ 01_core › getProviderMetadata github from cache

This output is completely useless in diagnosing any problem that might exist. I know there is a timeout, I just have no idea where to look for the cause.

The only way to debug this:

 npx ava dist/sdk/test/*.js -T 30000

 npx ava dist/sdk/test/*.js -T 120000

and so on, with the hope that it wasn't an intermittent failure.

As a thought experiment, with per test timeouts, the output could look like:

  3 tests were pending in dist/sdk/test/01_core-wait-for-registry.js

  ◌ 01_core-wait-for-registry › inProgress
  ◌ 01_core-wait-for-registry › isErrored
  ◌ 01_core-wait-for-registry › isAlive

  18 tests were pending in dist/sdk/test/01_core.js

  ✖ 01_core › waitFor server-state tests  - Timed out after 10 seconds
  ◌ 01_core › getAllProviderMetadata !refresh
  ◌ 01_core › getAllProviderMetadata
  ◌ 01_core › getProviderMetadata aws
  ◌ 01_core › getRelativePath 0.0.2-rc1
  ◌ 01_core › getRelativePath abra
  ◌ 01_core › getRelativePath git-url#0.0.2-rc1
  ◌ 01_core › getRelativePath git-url#hash
  ◌ 01_core › getRelativePath file#hash
  ◌ 01_core › getRelativePath non-existant version
  ◌ 01_core › getRelativePath non-existant hash
  ◌ 01_core › getProviderMetadata aws w/ refresh
  ◌ 01_core › getProviderMetadata bad provider
  ◌ 01_core › getProviderMetadata github@0.0.1
  ◌ 01_core › getProviderMetadata aws@1.23.0 (non-existant version)
  ◌ 01_core › publish invalid type
  ◌ 01_core › shutdown server - work from cache after
  ◌ 01_core › getProviderMetadata github from cache

In most cases, that simple fault isolation in the middle,

  ✖ 01_core › waitFor server-state tests  - Timed out after 10 seconds

will result in a substantial quality of life improvement.

I appreciate that schedulers managing workers is a serious and complex problem, but this nested timer business makes it harder to grok from the test writer point of view.

Hopefully this helps. I'll go back to holding my peace....

novemberborn commented 4 years ago

@sramam are those tests using serial? Are they async? It's quite possible that the list here does not actually take into account which tests started, just those which haven't completed yet.

sramam commented 4 years ago

yes - you caught that one quick!

They are serial and async.

Each test is async, because it allows multiple suites to run in parallel. Each test suite is serial, since they manipulate state on disk in a coordinated fashion - this allows me to use one test.before() to setup, overall, reducing run time by a very large factor.

novemberborn commented 4 years ago

yes - you caught that one quick!

They are serial and async.

Each test is async, because it allows multiple suites to run in parallel. Each test suite is serial, since they manipulate state on disk in a coordinated fashion - this allows me to use one test.before() to setup, overall, reducing run time by a very large factor.

I've opened https://github.com/avajs/ava/issues/2421 to track. I'm going to mark our conversation here as "off-topic", so it doesn't distract from what this issue is hoping to address.

csvn commented 3 years ago

I just ran into this, and it took longer than I'd like to admit to realize this was the issue. We have two tests that are much slower (~2m), while the other tests are below the default 10s. I wanted to only mark to "slow" tests to use a higher timeout, instead of needing to set t.timeout() for every test.

Would be awesome to have this issue solved! 😄

novemberborn commented 3 years ago

@csvn would you be interested in taking this on?

stefanpl commented 3 years ago

Any news on this? @novemberborn @csvn ?

I'd be willing to draft a PR if nobody is on it yet. But honestly I still don't grok the reason behind the current implementation. Like the docs say, "Timeouts in AVA behave differently than in other test frameworks." – and I fail to understand why that's desirable.

To me:

Having a test fail after 10 seconds although I explicitly set t.timeout(20 * 1000) on the test is highly unexpected.
Having tests (without individual timeouts) possibly running longer than the globalTimeout without failing is highly unexpected.

As a user of the framework, not interested in implementation details, I would expect (similar to @sramam I think):

each test by default has timeout = globalTimeout = 10s
each test's timeout can be overridden with t.timeout
if any test exceeds its individual timeout, it fails

@novemberborn could you maybe try to shed some more light on the design decisions behind the timeout mechanisms? Sorry for being a bit slow here … I have read now that the globalTimeout option is to "stop tests in case they're stuck", or "handle stalled tests". But why does that not work with individual timeouts?

Individual timeouts can be useful in certain cases, but far from all.

Could you please elaborate on this one?

novemberborn commented 3 years ago

@stefanpl the solution outlined in https://github.com/avajs/ava/issues/2384#issuecomment-579678760 addresses your concerns. PR still welcome.

I don't think hard timeouts are necessary for most use cases. Test performance is easily influenced by the load on the executing machine.

rysolv-bot commented 3 years ago

apowerful1 has contributed $40.00 to this issue on Rysolv.

The total bounty is now $40.00. Solve this issue on Rysolv to earn this bounty.

OhYash commented 3 years ago

Hey there, I came in after seeing the issue on Rysolv.

Seeing nobody else has announced of them picking this up, guess I'll give it a try.

As per my understanding after some local testing: Presently, t.timeout(ms) allows us to set up a smaller per-test timeout within the global timeout limit and not beyond it, we need it to allow longer than the global timeout.
I'm going through the code as I write this. And #2384 comment makes sense looking at api.js and test.js.

novemberborn commented 3 years ago

@OhYash great! Let me know if I can help.

OhYash commented 3 years ago

Hey @novemberborn, gave it some more time today. I could use some help actually.

So, we have this restartTimer debounce() function that has the single delay to it, which emits state change upon the timeout and adds pending workers to timedOutWorkerFiles.
Also, that it Maps the files and has workers running then.
What I'm facing problem with is that the workers are mapped to individual files. There will be multiple tests in each file, and now I can't seem to connect the workers to the Test/Runnable objects so as to get the individual timeout from.

novemberborn commented 3 years ago

I think you'd have to add a callback or event emitter to Test.

Runner creates them, so can set it up:

Runner itself is an emitter, so you can forward the events to the main process / thread:

(I don't think this would be considered a state change, but also it's already overloaded so I guess it's fine.)

You can now connect it up to the API code, where you have access to the timeouts:

https://github.com/avajs/ava/blob/b8f5685ed5b828de884d0d72c988efe837058692/lib/api.js#L234

OhYash commented 3 years ago

Hey @novemberborn, sorry for being so silent. Wasn't my intention to ghost away. Just got carried away in some stuff.

So regarding this, I've managed to get the individual timeout into api.js, but I'm struggling to update the wait value in debounce().
Having to resort to creating a fresh debounce() function when a t.timeout() is called doesn't seem correct.

Any way I can update the debounce wait?

OhYash commented 3 years ago

Besides that I think I also see an issue, kinda unrelated. Tests can run asynchronously, but timers are synchronous.

For example: Forget about the `t.timeout()`; In this file one async test takes 6 secs to complete and the other takes 15. Now our test runner starts the first timer and runs both the tests so the second one has a head start before it's timer even starts.	T	Event
0	Timer starts, both tests start
6	First test finished, timer stops. Second timer starts (10s)
15	Second test finished, second timer was at 9s at the moment. Both tests have passed successfully.

gist: https://gist.github.com/OhYash/7f34ac6701a699d5e09537b4e6a054e9

novemberborn commented 3 years ago

@OhYash could you share a draft PR?

OhYash commented 3 years ago

@novemberborn, Linked https://github.com/avajs/ava/pull/2758

avajs / ava