testsuite failed when nextest runs

rust-lang / libtest-next

T-testing-devex experiments for new testing tools for Rust developers

https://docs.rs/pytest-rs

Apache License 2.0

56 stars 4 forks source link

testsuite failed when nextest runs #38

Closed zjp-CN closed 1 year ago

zjp-CN commented 1 year ago

https://github.com/nextest-rs/nextest/issues/874

It seems we need to isolate test build directories, or use test groups to ensure mutual exclusion according to the linked issue.

I think reusing the same directory to test in parallel causes the conflict here?

https://github.com/epage/pytest-rs/blob/eb275ecb33239294aadf15dbb5c66f38ac0a6b67/crates/libtest2-mimic/tests/testsuite/util.rs#L60-L70

epage commented 1 year ago

FYI I consider this a bug within cargo nextest. Tests today are allowed to share state and one of my design goals is to make that easier, not prevent it. cargo nextest should not be running every test within an isolated process.

zjp-CN commented 1 year ago

Agreed. I'll leave a note here for reference.

nextest model is not appropriate for libtest2-mimic::testsuite:

testsuite is sequentially run by cargo test. For cargo test --tests -- mixed_bag panic in libtest2-mimic, the tests are generated in projectroot/target/tmp/libtest2_mimic_target/t0 and projectroot/target/tmp/libtest2_mimic_target/t1 where t0 = mixed_bag and t1 = panic (or vice versa). And importantly cargo test will reuse the artifacts which are already cached
however cargo nextest run -- mixed_bag panic will not cache the artifacts: it instead always generates each single test under ....../t0. If mixed_bag has N tests, nextest will generate the same artifact N times in parallel for mixed_bag.

The result is data race and slower execution via nextest, which is bad.

sunshowers commented 1 year ago

Hi Ed! Note that nextest really needs each test to be run in an isolated process to provide its features (e.g. not having to run test binaries sequentially, which is key to why nextest is so much faster than cargo test). There is a proposal to make tests run within the same process, and it should be doable with a lot of work, but it will be opt-in: nextest will not change its default from test-per-process. Importantly, many projects today have adopted nextest and issues are relatively rare, especially now that nextest supports test groups.

It would be unfortunate if libtest2-mimic were to go in the opposite direction, but I totally understand why.

sunshowers commented 1 year ago

I also wanted to mention that many projects now depend on nextest's execution model to run individual tests in an isolated manner. For example:

wgpu mucks around with graphics-related state that can't be isolated except at process boundaries. So its test suite inherently requires that individual tests are run in separate processes.
Similarly, there's a library that embeds bash somewhere and they use nextest as well.

As such I consider nextest's process isolation to be a feature, not a bug. In my experience the cases where you need per-test process isolation vastly outnumber the cases where isolation gets in the way.

epage commented 1 year ago

@sunshowers this issue is only about the tests for this repo (i think) and not for any specific project when depended on. However, I am not going to design for cargo-nextest today. Personally, cargo-nextest's core design to me comes across as a hack and to fix problems with that hack, we need to put more hacks on top. Don't get me wrong, I think this has been very valuable for getting people features now, for experimentation, for raising people's sights, and for supporting features that will likely never be in cargo.

You might say the approach I'm taking is to find how to stablize cargo-nextest features. I'm starting with libtest for a couple of reasons but one is we need to start with stablizing the json output before we rework how cargo interacts with libtest, including parallel test running. My overall model is to treat libtest binaries like rustc, communicate back through json and use jobserver for managing test threads across binaries (there is a cargo issue for us to continue this point but I'm on mobile).

As for tests that need to be serialized, my local branch for benchmark support allows any test, not just benches, to opt-in to being exclusive. I also expect to have my top-level harness to expose that as #[test(exculsive)] and to also have an attribute for process isolation.

In this future state, I see cargo-nextest being good for features like "re-run last failure" and usability and configuration heavy features.

My RustNL talk last week was the first time I talked about this more broadly and this is still in the early days. For example, I need to map out in more detail what was discussed in the libs meeting on Saturday about this.