Open andrewrk opened 4 months ago
I think this would require us to have async for something like cancel/await, or to spawn a manager thread. With the status quo, how do you plan to time out the function in a single-threaded environment?
@Rexicon226 I believe the build system already spawns subprocesses for testing in parallel - a spawned test
process could communicate the timeout requirement back to the parent build (/ test server) process, which could then enforce it. (EDIT: actually stated in OP that the build system is responsible for this)
Alternatively, the timing requirement could also be checked at any point during the test (e.g. when calling std.testing functions) and trigger an error (or even a direct process exit) if the limit has elapsed.
That said, even if we didn't cancel the test directly and instead only failed it after letting it complete (EDIT: turns out this is also stated in the OP), I think this system could still be used empirically both to catch regressions and to establish baseline timings of tests to optimize upon. It would therefore still help toward the stated goal of keeping unit test times below an acceptable threshold.
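The cooperative-check idea above can be sketched briefly. This is illustrative Python, not the real test runner: the `Deadline` type and `checkpoint` method are made-up names standing in for whatever hook the testing helpers would actually call.

```python
import time

# Hypothetical sketch of the cooperative approach: record a deadline when a
# test starts, and have each std.testing-style helper check it on entry.
class Deadline:
    def __init__(self, limit_seconds):
        self.expires_at = time.monotonic() + limit_seconds

    def checkpoint(self):
        # Called at each testing-helper entry point; fails once elapsed.
        if time.monotonic() > self.expires_at:
            raise TimeoutError("test exceeded its time limit")

d = Deadline(0.05)
d.checkpoint()          # well within the limit: no error
time.sleep(0.1)
try:
    d.checkpoint()      # past the deadline: raises TimeoutError
    timed_out = False
except TimeoutError:
    timed_out = True
print(timed_out)        # True
```

The obvious limitation, as noted above, is that a test which hangs without ever calling a testing helper is never interrupted; only the out-of-process approach catches that case.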
> a spawned test process could communicate the timeout requirement back to the parent build (/ test server) process,
This is what I was referring to. This test process runs on a single thread (usually), and has no way of interrupting the unit tests while they are running (or hanging). I suggested that a potential solution could be cancel/await inside of the test runner, but I'm curious what other people have in mind.

We could set a time limit for the entire test runner process, but that is a bad idea because we would lose information about which test it was that timed out.
> a spawned test process could communicate the timeout requirement back to the parent build (/ test server) process,
> This is what I was referring to. This test process runs on a single thread (usually), and has no way of interrupting the unit tests while they are running (or hanging).
AFAIU that would be the responsibility of the parent zig build process - it can simply kill the respective child zig test process. (As stated, zig test spawned directly may instead complete the test before reporting it as a failure.)
Do you see an issue with this? I imagine it already does the same thing on receiving a SIGKILL or terminal interrupt, so I don't think we would be introducing a new mechanism here.
The issue that I see is that we would not know which test was the one hanging (timing out).
> The issue that I see is that we would not know which test was the one hanging (timing out).
From how I interpret lib/compiler/test_runner.zig (looking at it for the first time), every single test execution is requested by the parent process.
So the build process already knows which test it asked a particular child process to run, and that will be the hanging test if it times out.
We already have to add a new message type for communicating the expected time limit,
plus I assume we can freely modify the test-server communication protocol, since it's an implementation detail internal to the Zig toolchain.
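The parent-side enforcement described here can be sketched in a few lines. This is a minimal Python illustration, not the real zig build / test-server code: the sleeping child stands in for a hanging test worker, and the test name is made up. The key point is that because the parent requested this specific test, it knows exactly which test to blame when the deadline passes.

```python
import subprocess
import sys

def run_test(name, limit_seconds):
    # Stand-in for a child test process that hangs (sleeps "forever").
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    try:
        child.wait(timeout=limit_seconds)
        return (name, "passed")
    except subprocess.TimeoutExpired:
        child.kill()   # the parent simply kills the misbehaving child
        child.wait()   # reap it
        return (name, "timeout")

print(run_test("decls.test.hangs", 0.2))
```

No new mechanism is needed beyond what killing a child on SIGKILL/interrupt already involves; the only protocol addition is communicating the limit (and, per the diagnostics idea, the elapsed time) between parent and child.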
That's an interesting idea: we could track the "latest" test being run, and then we'll know which one timed out. In the future I would like to have this ability in the test runner itself, zig test, but I can't think of a way to do this without async. With the test server, I agree that the build runner just killing the test process is the best idea.

To add, I believe we'll also need a message type for the amount of time each test spent, to support this --test-timeout-diagnostics flag. My only concern there is that users might mistake this for benchmarking, but that is a small concern.
A real world case I personally have is in the RISC-V backend, it's not uncommon for tests to go into an infinite loop because of incorrect codegen, and I have to manually set a timeout in my wrapper script.
I propose two enhancements that, together, are aimed at helping developers keep total unit test times below an acceptable threshold.
The first enhancement is to have each unit test have a time limit. This would default to 1 second, and any test could change its own time limit with API such as std.testing.setCurrentTimeLimit(2.0). When running zig test or zig build there would be a flag to scale this number. For example, --test-timeout-scale=2.0 would give twice as much time to each unit test. Idea here is that you would set this value on hosts significantly slower or faster than normal.

When the time limit of a unit test is exceeded, the build system will kill the misbehaving test and report it as a timeout. zig test would run forever and then report a timeout if it took longer than expected.

The second enhancement is a flag for discovering the relative durations of unit tests. Something like --test-timeout-diagnostics which would display a sorted list of unit tests by how long they took to run. When this flag is run at the build system level, it would aggregate the data across all sets of unit tests.
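The shape of that diagnostics output might look like the sketch below. This is illustrative Python, not the proposed implementation: the test names and durations are invented, and the report format is one plausible rendering of "a sorted list of unit tests by how long they took to run" with the scale flag applied to the default 1-second limit.

```python
# Fake per-test durations standing in for data the runner would report back.
durations = {
    "std.fmt.test.parseFloat": 0.42,
    "std.mem.test.indexOf": 0.03,
    "std.json.test.roundtrip": 1.10,
}
scale = 2.0                    # e.g. --test-timeout-scale=2.0 on a slow host
effective_limit = 1.0 * scale  # default 1 s per-test limit, scaled

print(f"per-test limit: {effective_limit:.1f}s")
# Slowest first, so the tests worth optimizing are at the top.
for name, secs in sorted(durations.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{secs:5.2f}s  {name}")
```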
This is just a fun side benefit, but --test-timeout-diagnostics is the suggestion that would be given to folks asking "how do I disable the cache when running tests?", which would encourage people who are, let's be honest, probably doing naughty things in their unit tests to at least think about how long those tests take to run.