rust-lang / libtest-next

T-testing-devex experiments for new testing tools for Rust developers
https://docs.rs/pytest-rs
Apache License 2.0

Can we centralize all test running to `cargo test`? #43

Open epage opened 1 year ago

epage commented 1 year ago

@Charles Lew said:

The features laid out are great, however I'd want to point out that the current top-level approach might not be the best in the long run: currently, unit testing is implemented with a compiler test harness + libtest approach. This works, but only under the assumption that the host can run the target executable, which is not always the case, especially in cross-platform scenarios and for the wasm target (see wasm-bindgen-test).

The better approach would be to compile tests to a dylib equivalent and always use a test runner to execute it; the runner can use whatever capability is available to run the tests, including Wine to run Windows tests from Unix, wasm-bindgen-test-runner to run wasm tests from any platform, QEMU to emulate embedded and niche platforms, etc. `cargo test` would just invoke the proper test runner to execute the tests.
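For the execution half of this idea, Cargo already supports a per-target `runner` in its configuration, which `cargo test` honors when the host cannot execute the target binary directly. The entries below are illustrative; real invocations typically need additional flags:

```toml
# .cargo/config.toml — illustrative use of Cargo's existing
# per-target `runner` setting, which wraps test-binary execution.
[target.x86_64-pc-windows-gnu]
runner = "wine"            # run Windows test binaries under Wine

[target.wasm32-unknown-unknown]
runner = "wasm-bindgen-test-runner"

[target.thumbv7m-none-eabi]
runner = "qemu-system-arm" # flags omitted; a real setup needs machine/kernel args
```

The proposal above goes further than this: instead of wrapping an executable, the test artifact itself would be a dylib that any runner can load.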

Under this vision, `#[test]`, `#[bench]`, etc. would just be minimal protocols that the test runners can understand (maybe with a shared crate to allow reuse between test runners).
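As a rough illustration of what such a minimal shared protocol could look like, here is a sketch of a crate that both the compiled test dylib and external test runners might depend on. All names (`TestDesc`, `collect_tests`) are hypothetical, not a real or proposed API:

```rust
/// Hypothetical minimal test protocol shared between the test dylib
/// and test runners. A real version would need a stable ABI story.
pub struct TestDesc {
    pub name: &'static str,
    pub ignored: bool,
    pub run: fn() -> Result<(), String>,
}

/// The test dylib would export one well-known entry point like this,
/// which a runner (Wine, QEMU, wasm-bindgen-test-runner, ...) looks up
/// and calls to enumerate the tests.
pub fn collect_tests() -> Vec<TestDesc> {
    vec![TestDesc {
        name: "math::adds",
        ignored: false,
        run: || {
            if 2 + 2 == 4 {
                Ok(())
            } else {
                Err("math is broken".into())
            }
        },
    }]
}
```

A runner would then iterate the returned descriptors, apply filters, and report results in whatever format it supports.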

epage commented 1 year ago

I saw this discussed somewhere else and wish I could find it at the moment.

The challenge with this approach is that it increases the API surface area (all types used in the interface have to be stabilized). This might not seem so bad for libtest today, but it would be prohibitive for the goal of a pytest-like custom test harness.

epage commented 1 year ago

I suspect the best route forward would be for this to be developed as two separate custom test harnesses (one for the host-side binary, one for the dylib binary). This would unblock people while allowing experimentation with what such an API could look like.

My best guess is we'd need something like my original plan for libtest (define extension points that allow a pytest-like harness). For this specific case, the dylib API would not expose test functions but "test generation" functions. This would allow much more complex operations, like those a pytest-like API would need, to exist within the test generator.
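To make the "test generation" idea concrete, here is a hypothetical sketch: instead of exporting fixed test functions, the dylib exports a generator that expands parameterized cases at collection time, pytest-style. All names here are illustrative assumptions, not a proposed API:

```rust
/// Hypothetical generated test case: the name is computed at
/// collection time, and the body is an arbitrary closure rather
/// than a `#[test]`-attributed function.
pub struct GeneratedTest {
    pub name: String,
    pub run: Box<dyn Fn() -> Result<(), String>>,
}

/// A "test generation" entry point. A pytest-like harness could read
/// fixtures, data files, or attributes here; this example just
/// parameterizes one check over several inputs.
pub fn generate_tests() -> Vec<GeneratedTest> {
    [(2u32, 4u32), (3, 9), (4, 16)]
        .into_iter()
        .map(|(input, expected)| GeneratedTest {
            name: format!("square_{input}"),
            run: Box::new(move || {
                if input * input == expected {
                    Ok(())
                } else {
                    Err(format!("{input}^2 != {expected}"))
                }
            }),
        })
        .collect()
}
```

The harness, not the compiler, decides how many tests exist, which is exactly the flexibility a fixed `#[test]`-function API cannot offer.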

A benefit of this is that we could do process isolation for test generators. This would help with catching certain classes of bugs and with capturing stdout/stderr. A simple test harness could do this perfectly on a per-test basis, while a more complex one would still get at least some benefit.
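A minimal sketch of that isolation pattern, assuming the harness can re-invoke a binary with a per-test filter (the `--exact` flag and helper name are made up for illustration): each test runs in its own child process, so a crash or stray output in one test cannot take down the harness, and its output is captured.

```rust
use std::process::Command;

/// Hypothetical helper: run a single named test in a child process
/// and capture its stdout. `harness` is the path to the test binary;
/// the `--exact <name>` filter is an illustrative convention.
fn run_isolated(harness: &str, test_name: &str) -> (bool, String) {
    let output = Command::new(harness)
        .args(["--exact", test_name])
        .output()
        .expect("failed to spawn test process");
    let captured = String::from_utf8_lossy(&output.stdout).into_owned();
    (output.status.success(), captured)
}
```

Since the generator itself runs arbitrary user code, the same wrapping could be applied at the generation step, not just per test.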

I suspect the first step toward vetting test generators, so that we can eventually stabilize a dylib API around them, would be to keep with the plan as-is.