Closed detly closed 1 year ago
I've narrowed it down a bit and found that while some tests (in calloop-subproc
) complete in 200s, others take more than an hour (I still haven't found how long):
200 test_chain_single_failure
200 test_chain_single_success
? test_chain_multiple_success_with_failure_and_cleanup
? test_chain_multiple_success_fail
? test_chain_multiple_success_with_failure_and_cleanup_failure
? test_chain_multiple_success_with_cleanup_failure
? test_chain_multiple_success
202 test_listen_one_line
199 test_listen_three_lines
So, here is a chance the current PR I'm working on will greatly improve this https://github.com/xd009642/tarpaulin/pull/962 if you want to try it out. It changes that part of the code a lot and right now I'm just fighting with one intermittently failing test on nightly. If you're interested in some of the why then this comment explains a lot of the details https://github.com/xd009642/tarpaulin/issues/953#issuecomment-1059982528
Once I've got this PR done and merged I'll look at this issue and see what the difference is (or maybe before, just once I've got the flaky test sorted). And hopefully come back to this issue with some updates
Okay, I managed to substantially reduce the repro, it seems like having multiple async process spawns, even if they're sequential, just totally stops under tarpaulin. I get this output from the test repo linked above:
tarp-test ⚬ cargo tarpaulin --ignore-tests --timeout 120 --forward -- --nocapture
Mar 08 17:35:49.125 INFO cargo_tarpaulin::config: Creating config
Mar 08 17:35:49.151 INFO cargo_tarpaulin: Running Tarpaulin
Mar 08 17:35:49.151 INFO cargo_tarpaulin: Building project
Mar 08 17:35:49.151 INFO cargo_tarpaulin::cargo: Cleaning project
Compiling [etc etc]
Finished test [unoptimized + debuginfo] target(s) in 27.73s
Mar 08 17:36:16.969 INFO cargo_tarpaulin::process_handling::linux: Launching test
Mar 08 17:36:16.969 INFO cargo_tarpaulin::process_handling: running /home/jason/Code/tarp-test/target/debug/deps/test_one-036ad9d4be9c1c80
running 1 test
Spawning first process
Awaiting first process
test test_one has been running for over 60 seconds
Mar 08 17:38:21.793 ERROR cargo_tarpaulin: Failed to get test coverage! Error: Failed to run tests: Error: Timed out waiting for test response
Error: "Failed to get test coverage! Error: Failed to run tests: Error: Timed out waiting for test response"
Oops, crossed signals there. I'll give that branch a try.
Didn't seem to change things I'm afraid. I'm using 9543cb27292c24b1290ce5c61f419663eede2b52 and get:
tarp-test ⚬ ../tarpaulin/target/release/cargo-tarpaulin tarpaulin --ignore-tests --forward --timeout 120 -- --nocapture
Mar 08 18:17:08.440 INFO cargo_tarpaulin::config: Creating config
Mar 08 18:17:08.547 INFO cargo_tarpaulin: Running Tarpaulin
Mar 08 18:17:08.547 INFO cargo_tarpaulin: Building project
Mar 08 18:17:08.547 INFO cargo_tarpaulin::cargo: Cleaning project
Compiling libc v0.2.119
Compiling futures-core v0.3.21
Compiling log v0.4.14
Compiling cfg-if v1.0.0
Compiling memchr v2.4.1
Compiling autocfg v1.1.0
Compiling pin-project-lite v0.2.8
Compiling futures-task v0.3.21
Compiling slab v0.4.5
Compiling signal-hook v0.3.13
Compiling cache-padded v1.2.0
Compiling waker-fn v1.1.0
Compiling futures-util v0.3.21
Compiling parking v2.0.0
Compiling fastrand v1.7.0
Compiling futures-io v0.3.21
Compiling pin-utils v0.1.0
Compiling once_cell v1.10.0
Compiling bitflags v1.3.2
Compiling event-listener v2.5.2
Compiling concurrent-queue v1.2.2
Compiling memoffset v0.6.5
Compiling futures-lite v1.12.0
Compiling socket2 v0.4.4
Compiling signal-hook-registry v1.4.0
Compiling polling v2.2.0
Compiling nix v0.22.3
Compiling async-io v1.6.0
Compiling async-process v1.3.0
Compiling calloop v0.9.3
Compiling tarp-test v0.1.0 (/home/jason/Code/tarp-test)
Finished test [unoptimized + debuginfo] target(s) in 25.74s
Mar 08 18:17:34.565 INFO cargo_tarpaulin::process_handling::linux: Launching test
Mar 08 18:17:34.565 INFO cargo_tarpaulin::process_handling: running /home/jason/Code/tarp-test/target/debug/deps/test_one-036ad9d4be9c1c80
running 1 test
Spawning first process
Awaiting first process
test test_one has been running for over 60 seconds
Mar 08 18:19:37.765 ERROR cargo_tarpaulin: Failed to get test coverage! Error: Failed to run tests: Error: Timed out waiting for test response
Error: "Failed to get test coverage! Error: Failed to run tests: Error: Timed out waiting for test response"
So this should now be a lot better with --engine llvm
for llvm coverage instrumentation which was released during 2022. If not feel free to reopen and I'll look again
Just confirming, all I had to do was add that flag and it works perfectly in my example project.
I've written a crate calloop-subproc that uses the Calloop event loop and async-process to manage subprocesses in an event loop. It's about 700 lines of code with 9 integration tests. In order to run the tests, the test binary has to build another test binary that can serve as the "thing to run as a subprocess" part of the tests.
Without tarpaulin, the tests take about 5s including the extra binary build. Under tarpaulin, they never finish. They don't seem to hang, however — I can gradually let more and more tests finish by increasing the timeout from 1m to 2m, to 10m, to an hour. So far, however, I have not found a long enough duration for the whole set of (9) tests to finish.
I have tried to create a minimal example, but without much luck. It really seems to be a result of the following factors used together:
...and possibly other factors?
Nonetheless, my tarp-test project sort-of shows the problem, where a usually-20s test runs in 230s under tarpaulin (including the one-off build time). It certainly doesn't show the same magnitude of slowdown that the original project shows though.
I initially thought it was due to signals for subprocess management (kill, sigchld etc.) being blocked, but the tests themselves seem to actually work, so I doubt it's that after all.
I'll keep trying to see if I can identify a particular factor that's having such a disproportionate impact.