Detect and error on duplicate test cases

ericcornelissen commented 2 years ago

A, at least in theory, fairly straightforward quality of life feature to detect a simple mistakes: having duplicate test cases for a parametrized test.

A trivial example of what I mean (derived from the documentation):

use rstest::rstest;

#[rstest]
#[case(0, 0)]
#[case(1, 1)]
#[case(2, 1)]
#[case(3, 2)]
#[case(3, 2)] // <-- Error: duplicates test case
#[case(4, 3)]
fn fibonacci_test(#[case] input: u32, #[case] expected: u32) {
    assert_eq!(expected, fibonacci(input))
}

I haven't used all aspects of rstest yet, so I'm not sure if and how this would apply to other ways (than the #[case()] attribute) of using rstest.

I had a quick look at the source code and I believe this could be achieved by checking the args variable of the Parse implementation for TestCase for duplicates and returning an Err if there are any.

la10736 commented 2 years ago

Thanks a lot to reporting this... Yes ... sure it could be useful but I can see some trouble on implementing this check: what about if the value is an expression that return the same value? we cannot know the function value a compile time (i.e. could be a random value) or check something like 4 + 1 == 3 + 2.

So.... I don't see a good way to implement this check without any false positive. I way could be introduce an annotation to enable duplicate cases but introduce a new syntax seams too much.

ericcornelissen commented 2 years ago

I'm still trying to wrap my head around how rstest works, but are you saying that at no point during the runtime of rstest does it know the (evaluated) values of expressions? If so, duplicate detection would indeed be rather difficult to implement...

Assuming rstest does know the values: On the surface, even when the cases evaluate to the same value I think I'd like to know about it (and eliminate the duplication). Whether the value is hardcoded or obtained by evaluating an expression seems irrelevant to me - if it's the same test case it's the same test case.

This would admittedly not quite work for randomized expressions and would lead to flaky behavior(/false positives)[^1]. I agree having a way to disable/enable duplicate detection would be necessary. I also agree introducing new syntax is a bit much - I'm wondering if passing argument to #[rstest] could be an option instead (and wouldn't be consider "syntax").

[^1]: I'd personally be inclined to say that would be a problem with the testing approach rather than rstest, but I can definitely see why one would still want to support the use case. So I understand the concern.

la10736 commented 2 years ago

Yes, it could be possible to check it a run time but...

Not every type implement Eq or PartialEq
Every test run in a separate thread and implement a check need a shared R/W structure to implement it: this make independent tests not really isolated
Random input is just a trivial example but I can imagine function that return values based o some state : not a good test design but a valid use case

So my taste is implement it with:

Lexicographic expression matching and not runtime value matching in order to raise an explicit and a clear compile time error
Add an annotation #[allow_duplicate] to disable this check

ericcornelissen commented 2 years ago

Every test run in a separate thread and implement a check need a shared R/W structure to implement it: this make independent tests not really isolated

Hadn't considered this, I agree this severely limits the desirability of checking duplicates at runtime.

So my taste is implement it with:

Lexicographic expression matching and not runtime value matching in order to raise an explicit and a clear compile time error

Add an annotation #[allow_duplicate] to disable this check

Lexicographic expression matching sounds reasonable to me and I agree a #[allow_duplicate] annotation makes sense (the opposite, something like #[check_duplicates] makes less sense I think).

Let me add one more alternative to the discussion: what about some optional mode/tool for detecting duplicates that is entirely separate from using rstest normally. For example, this could be a "debug" mode of operation for detecting duplicates where (simplified:) tests are ran in one thread and duplicate values will be reported.

The reason I'm suggesting this is because I believe looking at values rather than lexicographic expressions will result in fewer false negatives (see details below). Admittedly this would come at the cost of convenience as it would require some extra command/flag/etc. to find duplicates...

Lexicographic expressions

negative positive

true 1 + 1 != 1 + 2 1 + 2 == 1 + 2

false 2 + 1 != 1 + 2 rand() == rand()*

*: it would always falsy report this. Even though two evaluations of rand() might return the same value, they should probably be considered unique in practice.
Values

negative positive

true 1 + 1 != 1 + 2 1 + 2 == 1 + 2

false rand() == rand()*** rand() == rand()**

**: it might falsy report this if rand() happens to return the same value twice. The likelihood of this would depend on the output distribution of rand(). ***: it might fail report if rand() has, say, 2 possible return values and happens to return both of them. While in practice this might be rare because rand() is much more likely to return one of the possible return values.

	negative	positive
true	`1 + 1 != 1 + 2`	`1 + 2 == 1 + 2`
false	`2 + 1 != 1 + 2`	`rand() == rand()`*

	negative	positive
true	`1 + 1 != 1 + 2`	`1 + 2 == 1 + 2`
false	`rand() == rand()`***	`rand() == rand()`**

Not every type implement Eq or PartialEq

Just for the record, I'd imagine such types should never be considered equal for the purposes of duplicate detection.

la10736 / rstest

Detect and error on duplicate test cases #161