Tracking issue for eRFC 2318, Custom test frameworks

Centril commented 6 years ago

This is a tracking issue for the eRFC "Custom test frameworks" (rust-lang/rfcs#2318).

Documentation:

Steps:

[ ] Implement the RFC (cc @rust-lang/dev-tools @rust-lang/cargo -- can anyone write up mentoring instructions?)
[ ] Adjust documentation (see instructions on forge)
[ ] Stabilization PR (see instructions on forge)

Unresolved questions:

Notes:

Destabilization of #[bench]: https://github.com/rust-lang/rust/issues/63798#issuecomment-526593504

Implementation history:

Initial implementation: #53410

e-oz commented 5 years ago

@e-oz it sounds like your issue is the result of a library you consume taking advantage of unstable rust features, which is not allowed outside of nightly. This is a long-standing policy of the compiler to have nightly-only features to allow for experimentation with APIs before they are stabilized.

If #[bench] was working on stable or beta rust then it was a bug, and it got fixed. If this happened because you moved from nightly to beta then it is expected behavior. You can solve your problem by going back to nightly, though in general I don't recommend depending on unstable nightly features as they are subject to sudden change.

No, I'm not retarded and I understand difference between nightly and beta. I never was using nightly.

e-oz commented 5 years ago

@e-oz please keep discussion on-topic and civil. Nothing being discussed here involves breaking changes as far as I can see. If there are breaking changes, we appreciate them being pointed out as we do not wish to break backwards compatibility.

I quoted message I got from compiler.

killercup commented 5 years ago

"use of unstable library feature 'test': bench is a part of custom test frameworks which are unstable"

And it's not even inside my code (library I use have this error), so I can't change it.

I quoted message I got from compiler.

As far as I can tell, you've not given enough details, not even the name of the library.

@e-oz, please try to report issues in a civil way. We are not here to guess what errors you have nor are we likely to spend our time on any of it when you comment in that kind of tone.

e-oz commented 5 years ago

@killercup you don't need to guess anything, I quoted text of the error for you. One more time:

use of unstable library feature 'test': bench is a part of custom test frameworks which are unstable

e-oz commented 5 years ago

Also, everything I wrote is pretty "civil". 0 insults so far.

If you expect me to be "happy" about not being able to compile project without rolling back to 1.37 - do not expect it, I will not write my messages in "positive" mood. It's not just "hobby" project, it's my job, so yes I'm very worried that some day I will not be able to update Rust version just to be able to compile this binary. And it's too huge to rewrite, so no, I can't stop using that library or replace it, so name of the library doesn't matter.

I'm glad you have no such issues. Good for you. I'm not asking you to fix something or to implement something - just to don't break what is working.

Manishearth commented 5 years ago

@e-oz there is not nearly enough information for us to figure out what the regression is. As far as we can tell, this should not happen on code that used to compile. It's totally possible that there's a bug causing a regression, but without further details we cannot help. You have been asked for details once already and you had an unhelpful response.

Your tone is very hostile, and you are being curt and unhelpful. This will lead to moderation action if you continue down this tack.

Manishearth commented 5 years ago

Also, this is a tracking issue, please file a separate issue for this and link it there. Please provide instructions for reproducing the regression.

e-oz commented 5 years ago

@Manishearth

As far as we can tell, this should not happen on code that used to compile

But it does.

but without further details we cannot help

And I should guess what details you need?

and you had an unhelpful response

If compiler's message is unhelpful - it's not my fault.

Your tone is very hostile

This bug created real issue for me and you just prefer to ignore it with response "it should work", so what tone you expect? I should be happy about this response? You've marked all my messages as offtopic, just to show how much it "matters" for you, so what you expect from me? Begging you? I don't care about your "moderation action", I clearly see you don't want to fix breaking change you brought so I have no reason to talk with you.

What I learned is I can't trust your promises and code can just stop compiling one day.

Please provide instructions for reproducing the regression.

It took few seconds to check what is written in compiler's message:

https://play.rust-lang.org/?version=beta&mode=debug&edition=2018&gist=b006497caba8892a0350a517d49c7a2d

It compiles in stable and doesn't compile in beta mode.

Manishearth commented 5 years ago

But it does.

Like I said, it's totally possible that there's a bug, but we can't figure that out without further information.

You've marked all my messages as offtopic

Yes, I told you to file a separate issue. Further comments here will be deleted.

You have provided enough information, but please do it in an issue form.

(I'd file it myself but I'm on a train right now)

CAD97 commented 5 years ago

I created an issue for the #[bench] regression at #63798. Please direct any further discussion there.

In the future, if code stops compiling between stable and beta, create a minimized example and open an issue, rather than seeking out the linked tracking issue and just posting the error without context. It's almost always a bug if something stops compiling between stable and beta. Nobody will complain about an issue making the fact known. But we can't diagnose what broke without a reproduction.

e-oz commented 5 years ago

@CAD97

rather than seeking out the linked tracking issue

Look at the compiler's message please, notice the URL:

Compiling playground v0.0.1 (/playground)
error[E0658]: use of unstable library feature 'test': `bench` is a part of custom test frameworks which are unstable
 --> src/main.rs:5:3
  |
5 | #[bench]
  |   ^^^^^
  |
  = note: for more information, see https://github.com/rust-lang/rust/issues/50297

and just posting the error without context

I was able to reproduce an error just using that message. It took few seconds literally. Yes, I had no context also, it's not my code.

But we can't diagnose what broke without a reproduction.

Come on, nobody tried, response was "it should work, get lost".

Manishearth commented 5 years ago

The response was "we need more information".

If you keep commenting on this issue on this topic this will lead to a temporary ban from GitHub. Do with this information what you wish.

e-oz commented 5 years ago

I don't give a fuck about your threats :) I was able to reproduce without additional information, your response was "it should work".

gz commented 5 years ago

Is there a plan or way to integrate the #[should_panic] attribute with custom test runners and #[test_case] or is this out of scope for this eRFC?

djrenren commented 5 years ago

@gz for sure! In fact, #[should_panic] is already, currently implemented using custom test runners and #[test_case]!

Here's how it works:

First off, because we want to react to a panic, we need to compile with panics as unwinds (as opposed to aborts). This is the default behavior of the compiler, but you can specify otherwise in Cargo.toml. I believe the compiler forces this mode for test compilations (which it shouldn't and we should fix that).

Now, we need a test runner that knows how to react to panics. Enter libtest::test_main_static. This is the test runner that rustc uses by default. It receives each test case as a TestDescAndFn. After a bit of setup, it invokes TestDescAndFn::testfn using std::panic::catch_unwind [src], which will allow the test runner to respond to the fact that a test failed.

All that's left is informing that this test was in fact meant to panic. This is encoded within TestDesc which is available at TestDescAndFn::desc. So all we need to do is set desc.should_fail = true on tests that should fail.

Enter the #[test] macro, which takes as input a simple test function like so:

#[test]
fn my_test() {}

and transforms it into a TestDescAndFn that's annotated with #[test_case].

If #[test] sees that the underlying function is annotated with #[should_panic] then it will output a TestDescAndFn where should_fail is true.

Notably, every step of this approach (aside from #![test_runner] and #[test_case]) relies on stable APIs. This means, were CTF to be stabilized you could recreate the exact behavior.
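To make the mechanism above concrete, here is a minimal sketch on stable Rust. It is not libtest's code: `TestCase` and `run_test` are hypothetical stand-ins for `TestDescAndFn` and the real runner, and the `should_panic` field is only an illustration of the flag described above. It assumes the default panic=unwind setting.

```rust
use std::panic;

/// Hypothetical stand-in for a test descriptor (not libtest's real type).
struct TestCase {
    name: &'static str,
    should_panic: bool,
    func: fn(),
}

/// Runs one test and reports whether it passed, honoring `should_panic`.
fn run_test(test: &TestCase) -> bool {
    // catch_unwind returns Err if the function panicked (needs panic=unwind).
    let panicked = panic::catch_unwind(test.func).is_err();
    // A test passes when "did it panic?" matches "was it meant to panic?".
    panicked == test.should_panic
}

fn passes() {}

fn explodes() {
    panic!("boom");
}

fn main() {
    let tests = [
        TestCase { name: "passes", should_panic: false, func: passes },
        TestCase { name: "explodes", should_panic: true, func: explodes },
    ];
    for t in &tests {
        let status = if run_test(t) { "ok" } else { "FAILED" };
        println!("test {} ... {}", t.name, status);
    }
}
```

The panic message from `explodes` still goes to stderr; a real runner would presumably also capture or suppress that output.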

idubrov commented 5 years ago

@gnzlbg, it seems you worked on libtest/extracting libtest? Can you maybe give a quick summary of the state of this?

Just for the reference / context (reasons why libtest extraction was reverted): https://github.com/rust-lang/rust/pull/59766#issuecomment-480586326

Also, @djrenren should your approach be put forward as a formal RFC? I don't really know anything about the process and what's happening with this custom testing framework RFC, but your proposal seems practical and something that could actually be worked towards?

Manishearth commented 5 years ago

To be clear: removing libtest is something we're still interested in, just that the way it was carried out the first time had a bunch of unintended consequences and broke stuff.

It might be good to try again, but start with moving compiletest out of tree instead of libtest (and making it depend on out-of-tree libtest). If that's fixed then the next steps are much easier and less likely to break everything :smile:

gnzlbg commented 5 years ago

@alexcrichton simplified the build process of rust significantly recently, so retrying the split might now work - not sure. Moving compiletest out of tree is worth doing anyways, and would help. Having to deal with two slightly different compiletests is a bit of a pain.

djrenren commented 5 years ago

@gnzlbg I just took a stab at it and it seems like an extracted compiletest builds well off a newly extracted version of libtest. I have no idea how to hook that back into the rust build process and clippy and such, but it at least extracts cleanly!

Here's a fully extracted compiletest. Though we should really update the external libtest. https://github.com/djrenren/compiletest

djrenren commented 5 years ago

I've created a PR for externalizing compiletest: https://github.com/rust-lang/rust/pull/63929

gnzlbg commented 5 years ago

Thanks, part of the extraction is merging that somehow with compiletest-rs in crates.io. I'll comment on the PR.

ligurio commented 5 years ago

Standardizing the output

We should probably provide a crate with useful output formatters and stuff so that if test harnesses desire, they can use the same output formatting as a regular test. This also provides a centralized location to standardize things like json output and whatnot.

Please consider standard test output formats like Test Anything Protocol (spec), SubUnit (spec) and xUnit (spec). These formats have open specifications, widely used (especially xUnit and TAP) and already supported in many test report systems and continuous integration systems. There is a comparison table for all three formats: https://github.com/ligurio/testres/wiki/Everything-you-need-to-know-about-software-testing-report-formats#comparison-table
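As a rough illustration of how small such a formatter could be, here is a sketch that renders results as a basic TAP stream (a plan line, then one `ok` / `not ok` line per test). `TestResult` and `to_tap` are hypothetical names, not part of libtest or any existing crate, and only the core of the format is covered (no diagnostics, directives, or version line).

```rust
/// Hypothetical result record; not a libtest type.
struct TestResult {
    name: String,
    passed: bool,
}

/// Renders results as TAP: a plan line followed by one line per test.
fn to_tap(results: &[TestResult]) -> String {
    // The plan line announces how many test lines follow.
    let mut out = format!("1..{}\n", results.len());
    for (i, r) in results.iter().enumerate() {
        let status = if r.passed { "ok" } else { "not ok" };
        // TAP test numbers start at 1.
        out.push_str(&format!("{} {} - {}\n", status, i + 1, r.name));
    }
    out
}

fn main() {
    let results = vec![
        TestResult { name: "parses_empty_input".to_string(), passed: true },
        TestResult { name: "rejects_bad_header".to_string(), passed: false },
    ];
    print!("{}", to_tap(&results));
}
```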

gilescope commented 4 years ago

Hi, in 2017 I asked if we could have TeamCity service messages output for running cargo tests. I just thought I'd check back and see if this is now possible? Currently there's some nasty regex in TeamCity's Rust plugin, which I think could be rewritten to use the JSON output instead, but I wondered if it was now possible to plug in custom test reporters as well as custom test runners (not everyone has the ability to install arbitrary TeamCity plugins; you'd need to be an admin for that). I would like testing Rust to be easy in TeamCity, as lots of people use it in the non-open-source world (because it works very well).

UPDATE: cargo test -- -Zunstable-options --format json --report-time streams the information needed. That's enough to be able to build a wrapper around cargo test.

dvtkrlbs commented 3 years ago

Is there any progress on this?

danakj commented 2 years ago

Hello, I just wanted to give the feedback (on request of @Manishearth) that I was hoping to use the custom test frameworks for hooking Rust unit tests into the C++ Gtest framework, however I am not able to. The problem is simply that:

rustc --test still has the compiler produce its own main() function.
The set of #[test] functions is only known to the compiler, and it inserts them into the generated main() function.

To use the custom test framework in our project, we would need to use our own existing C++ main() functions. What we need from the compiler is the list of test functions, and their attributes, so that we could then register them all with Gtest before invoking the test framework. Likely this would have to look like writing a macro where the inputs are the functions? And the default implementation of that macro writes a main() function that invokes them all in the rusty way. For us, it would generate some "register_everything()" at the root of the crate, which we could call from main().

Instead, I am falling back on introducing static initializers which do that registration, which is unfortunate. If you're interested I am doing this work in https://bugs.chromium.org/p/chromium/issues/detail?id=1293979.
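For readers trying to picture the "register_everything()" shape described above, here is a hand-written sketch on stable Rust. Everything in it is an assumption for illustration: `TestEntry`, `REGISTRY`, `rust_test_count`, and `rust_run_test` are hypothetical names, and in the proposed design the body of `register_everything` would be generated by the compiler or a macro rather than typed out. A real build would also need to export the `extern "C"` functions with unmangled names so C++ can link against them.

```rust
use std::panic;
use std::sync::Mutex;

/// Hypothetical metadata a foreign framework (e.g. Gtest) might want per test.
#[derive(Clone, Copy)]
pub struct TestEntry {
    pub suite: &'static str,
    pub name: &'static str,
    pub file: &'static str,
    pub line: u32,
    pub func: fn(),
}

/// Global registry filled by `register_everything`.
static REGISTRY: Mutex<Vec<TestEntry>> = Mutex::new(Vec::new());

fn addition_works() {
    assert_eq!(2 + 2, 4);
}

/// Written by hand here; the idea above is that this would be generated.
pub fn register_everything() {
    let mut registry = REGISTRY.lock().unwrap();
    registry.clear(); // keep repeated calls idempotent
    registry.push(TestEntry {
        suite: "math",
        name: "addition_works",
        file: file!(),
        line: line!(),
        func: addition_works,
    });
}

/// Number of registered tests, for the C++ side to iterate over.
pub extern "C" fn rust_test_count() -> usize {
    REGISTRY.lock().unwrap().len()
}

/// Runs test `index`: 0 = passed, 1 = panicked, -1 = no such test.
pub extern "C" fn rust_run_test(index: usize) -> i32 {
    // Copy the entry out so the lock is not held while the test runs.
    let entry = REGISTRY.lock().unwrap().get(index).copied();
    match entry {
        // Catch the panic so it never unwinds across the C boundary.
        Some(e) => match panic::catch_unwind(e.func) {
            Ok(()) => 0,
            Err(_) => 1,
        },
        None => -1,
    }
}

fn main() {
    register_everything();
    let entries = REGISTRY.lock().unwrap().clone();
    for (i, e) in entries.iter().enumerate() {
        let status = rust_run_test(i);
        println!("{}.{} ({}:{}) -> {}", e.suite, e.name, e.file, e.line, status);
    }
}
```

The sketch sidesteps the hard part being discussed: something still has to know the full list of test functions in order to write `register_everything`.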

djrenren commented 2 years ago

Ah interesting, you want to basically build tests as a library. Ultimately, the magic of #[test] in Rust, as you've realized, is breaking lexical scope and aggregating definitions from throughout the crate. At the time I built the custom test framework experimental implementation, there was a lot of resistance to supporting this primitive directly. Basically the concerns were that we shouldn’t incentivize people to break the visibility rules of the language. Furthermore, incremental compilation was new and we weren’t sure if such a primitive, if popular in libraries, would slow down builds across the ecosystem. And so we tried to build this specialized system on top of that primitive so that it was basically only useful for tests. Several years on now, I think it’s worth re-assessing this stance. In my opinion, ctor is spookier than an #[aggregate] built-in. Idk who to ask about that these days.

Manishearth commented 2 years ago

I'll note that the original version of the test frameworks RFC was designed to be like a whole-crate proc macro, with the main RFC work being that Cargo integration would be provided but the rest would be cobbled together from other crates (likely with a published crate implementing the basic Rust testrunner with hooks).

Perhaps it's worth trying that instead? Overall I feel like a lot of the use cases for this feature will need this level of flexibility.

https://github.com/Manishearth/rfcs/blob/53ba19d267244275c98a871ce894411db530bca7/text/0000-erfc-post-build-contexts.md#procedural-macro-for-a-new-post-build-context

djrenren commented 2 years ago

Yeah the whole crate macro is definitely a viable option, but I’m not sure they enable much that couldn’t be accomplished by combination of aggregation and proc macros. But both seem reasonable to me.

Manishearth commented 2 years ago

@danakj I guess y'all could require having a gtest_init!() macro at the top that adds a #![no_main] and some initializer code? Unsure if that would work. This is roughly how cargo-fuzz works though it doesn't use custom test frameworks.

danakj commented 2 years ago

Perhaps, but it would need to be at the crate root, and need as input a list of all the test functions, file!(), line!(), and their attributes (in order to read the test suite+name). Or some custom structure derived from an attribute macro on each test function.

Just skipping main isn't enough, and maybe I am not being imaginative enough at what the gtest_init!() macro is able to do.

djrenren commented 2 years ago

Nah I think your intuition is right. Unless you could somehow invoke the generated main

Manishearth commented 2 years ago

oh, good point, yeah

phil-opp commented 2 years ago

Maybe something like this could work:

#![feature(custom_test_frameworks)]
#![feature(test)]

#![test_runner(crate::test_runner)]
#![reexport_test_harness_main = "test_main"]
#![cfg_attr(test, no_main)]

extern crate test;

// Normally defined in your C++ code
#[cfg(test)]
#[no_mangle]
fn main() {
    run_rust_tests();
}

// The C++ code will call this function
#[cfg(test)]
#[no_mangle]
pub extern "C" fn run_rust_tests() {
    test_main();
}

// Invoked by the generated `test_main` function 
#[cfg(test)]
fn test_runner(tests: &[&test::TestDescAndFn]) {
    println!("Running {} tests", tests.len());
    for test in tests {
        print!("{}...", test.desc.name);
        match test.testfn {
            test::TestFn::StaticTestFn(f) => f(),
            test::TestFn::StaticBenchFn(_) => todo!(),
            test::TestFn::DynTestFn(_) => todo!(),
            test::TestFn::DynBenchFn(_) => todo!(),
        }
        println!("[ok]");
    }
}

#[cfg(test)]
mod tests {
    #[test]
    fn it_works() {
        let result = 2 + 2;
        assert_eq!(result, 4);
    }
}

Instead of running the tests, you could also only collect the test metadata that you need in a static and report it back to your C++ code. For example, to report the number of available tests:

static mut TEST_NAMES: Vec<String> = Vec::new();

#[no_mangle]
pub extern "C" fn run_rust_tests() -> usize {
    test_main();
    unsafe { TEST_NAMES.len() }
}

fn test_runner(tests: &[&test::TestDescAndFn]) {
    unsafe { TEST_NAMES = tests.into_iter().map(|t| t.desc.name.to_string()).collect() };
}

melynx commented 2 years ago

Ah interesting, you want to basically build tests as a library. Ultimately, the magic of #[test] in Rust, as you've realized, is breaking lexical scope and aggregating definitions from throughout the crate. At the time I built the custom test framework experimental implementation, there was a lot of resistance to supporting this primitive directly. Basically the concerns were that we shouldn’t incentivize people to break the visibility rules of the language. Furthermore, incremental compilation was new and we weren’t sure if such a primitive, if popular in libraries, would slow down builds across the ecosystem. And so we tried to build this specialized system on top of that primitive so that it was basically only useful for tests. Several years on now, I think it’s worth re-assessing this stance. In my opinion, ctor is spookier than an #[aggregate] built-in. Idk who to ask about that these days.

I second this. Allowing functions to be aggregated provides for maximum flexibility. I'm working on some no_std shared objects which have a different architecture from the runner. I wanted to just have custom_test_frameworks expose the runner test harness "main" as an exported symbol but it seems that building test assumes that it has to be compiled as an executable.

joshtriplett commented 2 years ago

This experiment has been open since 2018. It doesn't look like there's been recent development activity on it.

I'm going to close this for now. If someone wants to restart this effort, one start would be a lang MCP for a general mechanism for collecting an array of items of the same type from across the program, based on attributes. That mechanism would be generally useful for various things, including tests, tracing, and other annotations.

@davidbarsky expressed some interest in working on that MCP.

shepmaster commented 2 years ago

MCP for a general mechanism for

What about replacing stdout / stderr like the built in test framework? Using something like inventory, I've gotten good enough support for collecting an array of items for my own nascent test framework. What I don't have is a good way to prevent parallel tests from combining output.
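One workaround that is possible without compiler support, sketched below, is to hand each test its own sink and only print the buffer once the test has finished. This does not capture `println!` inside the test, which is the actual gap being described; as far as I know the built-in harness relies on a non-public mechanism for that. `TestFn` and `run_buffered` are hypothetical names for this sketch.

```rust
use std::io::Write;
use std::thread;

/// Hypothetical test signature: tests write to the sink they are given
/// instead of calling `println!` directly.
type TestFn = fn(&mut dyn Write);

fn noisy_a(out: &mut dyn Write) {
    for i in 0..3 {
        writeln!(out, "a: step {}", i).unwrap();
    }
}

fn noisy_b(out: &mut dyn Write) {
    for i in 0..3 {
        writeln!(out, "b: step {}", i).unwrap();
    }
}

/// Runs each test on its own thread with a private buffer and returns the
/// captured output per test, in the order the tests were given.
fn run_buffered(tests: &[(&'static str, TestFn)]) -> Vec<(&'static str, String)> {
    let handles: Vec<_> = tests
        .iter()
        .map(|&(name, test)| {
            thread::spawn(move || {
                // Each thread owns its buffer, so output cannot interleave.
                let mut buf: Vec<u8> = Vec::new();
                test(&mut buf);
                (name, String::from_utf8_lossy(&buf).into_owned())
            })
        })
        .collect();
    // Joining in spawn order keeps the report order deterministic.
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let tests = [
        ("noisy_a", noisy_a as TestFn),
        ("noisy_b", noisy_b as TestFn),
    ];
    for (name, output) in run_buffered(&tests) {
        println!("---- {} output ----", name);
        print!("{}", output);
    }
}
```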

Manishearth commented 2 years ago

Honestly at this point I would strongly argue for my original proposal, which was a more proc-macro-esque API that would require minimal changes to the compiler and farm out all the hard stuff to libraries. The main thing that cargo test gets that you can't on your own is not endpoint-collection (you can do that with a macro), it's cargo being able to figure out the correct rustc invocation.

I feel like the lack of interest in dealing with the large amount of design work to get the eRFC as currently written over the hump is a pretty major signal. Plus we have documented use cases in this thread where it's insufficient.

I don't have time to go through this again, but if someone is interested I'm happy to help out. I imagine we would still need an MCP but the actual changes would be largely on the cargo side.

Ezrashaw commented 1 year ago

@joshtriplett @Manishearth I'm interested in restarting this proposal, do you have any advice?

Manishearth commented 1 year ago

Not much more than go through the RFC and the current code (and any PRs linked to this issue) and see what's missing. I don't remember the status quo unfortunately. I think there's nothing.

If you would like to repropose my original proposal that I mentioned above, I think it can be found in the RFC PR's commit history.

gilescope commented 1 year ago

I think the biggest change in rust testing has been the arrival of nextest on the scene.

asomers commented 1 year ago

I think the biggest change in rust testing has been the arrival of nextest on the scene.

cargo-nextest really doesn't address this issue. It's just a different runner that can execute the same test. But it doesn't allow you to alter the design of the tests themselves, for example by creating test functions at runtime, or deciding at runtime that a particular test function should be skipped.
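To illustrate the two capabilities mentioned (tests created at runtime, and a skip decision made at runtime), here is a small sketch of what a custom harness could model. `Outcome`, `DynTest`, and `generate_tests` are hypothetical names, not an existing API; the perfect-square check is just a placeholder test body.

```rust
/// Hypothetical outcome type with a runtime "skipped" state.
#[derive(Debug, PartialEq)]
enum Outcome {
    Passed,
    Skipped(String),
    Failed(String),
}

/// A test whose name and body are built at runtime.
struct DynTest {
    name: String,
    run: Box<dyn Fn() -> Outcome>,
}

/// Creates one test per input value, i.e. test functions created at runtime.
fn generate_tests(inputs: &[i64]) -> Vec<DynTest> {
    inputs
        .iter()
        .map(|&n| DynTest {
            name: format!("perfect_square_{}", n),
            run: Box::new(move || {
                // The skip decision is made when the test runs, not at compile time.
                if n < 0 {
                    return Outcome::Skipped("negative input".to_string());
                }
                let root = (n as f64).sqrt().round() as i64;
                if root * root == n {
                    Outcome::Passed
                } else {
                    Outcome::Failed(format!("{} is not a perfect square", n))
                }
            }),
        })
        .collect()
}

fn main() {
    for test in generate_tests(&[4, -1, 5]) {
        println!("test {} ... {:?}", test.name, (test.run)());
    }
}
```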

Ezrashaw commented 1 year ago

Yes, although I think @gilescope is talking about changes to the Rust testing ecosystem generally.

amab8901 commented 1 year ago

if this issue is closed, then maybe this (https://doc.rust-lang.org/core/prelude/v1/macro.test_case.html) should no longer be labeled as "nightly" and "experimental"?

mzabaluev commented 1 year ago

if this issue is closed, then maybe this (https://doc.rust-lang.org/core/prelude/v1/macro.test_case.html) should no longer be labeled as "nightly" and "experimental"?

Some crates already exploit the test_case attribute in stable Rust, so the effects need to be documented at least.

ctz commented 5 months ago

I was looking at using datatest -- it seems to address my needs to produce test cases from json. However, it requires custom_test_frameworks which is nightly-only.

And this being the tracking RFC for that feature, what does it mean for it to be closed? Is this effectively nightly-only, forever? Will it be removed? Is it destined for the same purgatory as #[bench]?

If this is permanently abandoned, it would be good if this issue, and the unstable book recorded that in a clear way.

jdonszelmann commented 4 months ago

I just wanted to link this tracking issue here, as it's somewhat related: https://github.com/rust-lang/rust/issues/125119

HernandoR commented 2 days ago

I followed the unstable book entry for test::Bencher to here. After browsing dozens of issues, may I conclude that crate test is unstable, and hence that discussion should be opened here? Also, what exactly is unstable, and what needs to be discussed and decided in order to stabilize the test lib?

weihanglo commented 2 days ago

See https://github.com/rust-lang/testing-devex-team/issues/2
