Closed marcphilipp closed 2 years ago
I think that sounds reasonable in terms of a new feature.
The hardest part will be coming up with good names for the annotations. 😉
Maybe, instead of introducing a bunch of new annotations, a simple API to combine sets of arguments from a @MethodSource
method would be enough.
Something like
static Collection<Arguments> cartesianProduct() {
return Arguments.builder(Set.of("foo", "bar"))
.combineWith(Set.of(23, 42))
.build();
}
or
static Collection<Arguments> cartesianProduct() {
return Arguments.cartesianProduct(Set.of("foo", "bar"), Set.of(23, 42));
}
True, that may prove more elegant, and it definitely avoids an explosion of annotations (which we should aim to avoid).
So I agree: let's go with a programmatic approach via @MethodSource
first and see what people think of that!
Though I would have Arguments.cartesianProduct()
return a Set
instead of a Collection
, since it must be a set by definition -- right? 🤔
Well, a Set
is a Collection
... 😉
While, mathematically speaking, it must be a set, I think we should be more lenient than that. I think we should make it work with arbitrary Collections
and just return a Collection
or Iterable
. So, for instance, if one passes a List
including duplicates (according to equals()
), we would include the duplicates in the resulting Collection
. Does that make sense?
Ummmm.... no, not really. 😇
My point was that the framework should actually require/enforce set semantics and avoid duplicated invocations of equivalent products.
Does that make sense? 😉
I think that's debatable because of potentially broken/slow equals
implementations... 🤔
Hmmmmmmmmmmmmm
I can't see your method signature for Arguments.cartesianProduct()
.
Are you proposing the following?
Collection<Arguments> Arguments.cartesianProduct(Collection<Object>... collections);
... b/c I was thinking of this:
Set<Arguments> Arguments.cartesianProduct(Set<Object>... sets);
I think that's debatable because of potentially broken/slow
equals
implementations.
I'm of course not referring to built-in Java types, but user-defined ones. E.g. I've often encountered JPA entities where equals()
only compared their id
field. Or, the best ever equals()
implementation I've stumbled upon which unconditionally returned false
. Of course, that's nonsense and people shouldn't do that. On the other hand, we allow duplicate invocations based on the Stream
/Iterable
instances returned from @MethodSource
referenced methods so I'm not convinced we should be this strict in this case.
I can't see your method signature for
Arguments.cartesianProduct()
.
Almost! I was thinking of this:
Collection<Arguments> Arguments.cartesianProduct(Collection<?>... collections);
The builder may be an alternative that let's one configure whether or not duplicates should be removed.
I knew what you meant regarding equals/hashCode. I was just confirming the overall signature. 😉
Oops... yeah... when I typed <Object>
I of course meant <?>
.
On the other hand, we allow duplicate invocations based on the
Stream
/Iterable
instances returned from@MethodSource
referenced methods
That's true, but... those duplicates would be returned by user code; whereas, Arguments.cartesianProduct()
would be framework code that can be more exact.
In any case, I see your point regarding broken equals/hashCode implementations in user code. Hence, my follow-up comment below.
The builder may be an alternative that let's one configure whether or not duplicates should be removed.
I think that would almost be a requirement to support that even if only as an opt-in feature in the builder.
the best ever
equals()
implementation I've stumbled upon which unconditionally returnedfalse
.
That's quite a jewel of software engineering, but... I've seen something that tops it (on another level): a method in the service layer annotated with @Ignore
from JUnit 4. I'll just leave it at that. 😇
A couple of random (hopefully useful) notes from a JUnit user:
ArgumentsSource
to use annotations if they really wanted it--just throwing out a couple of opinions! 😄)void testAddTwoNumbers(int first, int second)
. I find that this is a common use case for reducers and other pieces of code that need to combine two values (or more). This ended up being a limiting factor for @ValueSource
in a lot of our testing.I believe that Section 3.15 was actually intended to be section 3.14.2
Good catch!
Fixed in 5343323db975f1f1469ea6e7601205f94dc0ab81.
For the programmatic approach, I would suggest that the method should return a lazy Stream<Arguments>
instead of an already materialized collection.
I had a similar requirement some time ago. There was a huge number of input combinations generated by a cartesian product, but only a few of them were actually valid combinations for the test scenarios. An example would be that for the german version of a webshop, only certain payment methods and only a shipping address in germany would be valid.
Having this method returning a stream makes it easier to reuse it by filtering elements, without keeping all elements in memory first. I can provide an implementation if needed.
I seem to recall hearing/reading something about performance issues when attempting to do something like building up a Cartesian product using Java streams.
So if we truly implement it solely based on streams instead of building a collection upfront, we would then likely need to keep performance in mind in order to support large data sets.
I can provide an implementation if needed.
@jhorstmann, perhaps just sharing your ideas/intentions would be a good start.
I seem to recall hearing/reading something about performance issues when attempting to do something like building up a Cartesian product using Java streams.
@sbrannen Guava's Sets.cartesianProduct
is relatively lazy by my understanding, so I imagine it wouldn't be impossible to make a (sequential) streams version which is at least as lazy and performant. But I admit that I've not personally read anything about the performance of stream-based Cartesian sets, so I may be speaking nonsense... :)
Sure. I'm not saying it's not possible. I'm just saying that I assume it will be considerably more complicated (in terms of the actual implementation) if we accept multiple streams as input and stream the Cartesian product on the fly back to the Jupiter TestEngine for processing. 😉
Based on internal team discussions, we are currently leaning toward one of the following programmatic APIs:
List<Arguments> Arguments.cartesianProduct(Set<?>... sets);
Or:
Stream<Arguments> Arguments.cartesianProduct(Set<?>... sets);
In other words, we aim to accept sets as input and generate the Cartesian product as an ordered collection (e.g., a List
) or Stream
of Argument
tuples.
Having this method returning a stream makes it easier to reuse it by filtering elements, without keeping all elements in memory first.
That sounds like a reasonable argument to me in favor of streaming results: I had not considered that a user might want to filter the Cartesian product tuples before returning them to the Jupiter TestEngine.
@jbduncan, regarding possible performance issues, I think this blog is what I was recalling (or similar at least).
Quoting that blog:
this code computes the entire stream of cartesian product values before processing the first value.
I haven't verified the claim of course, but it at least sounds like the author put a lot of consideration into the presented solution.
p.s. FWIW, most solutions on Stack Overflow and similar channels suggest the use of flatMap()
but don't mention any issues with performance.
@sbrannen Cool, thank you for pointing me in the direction of that blog. I gave it a scan yesterday, and it looks promising to me!
Sure. I'm not saying it's not possible. I'm just saying that I assume it will be considerably more complicated (in terms of the actual implementation) if we accept multiple streams as input and stream the Cartesian product on the fly back to the Jupiter TestEngine for processing. :wink:
@sbrannen Yeah, okay. Sorry, I wasn't terribly clear earlier: I was sub-consciously hoping that we could easily derive an implementation like Guava's Sets.cartesianProduct
for this issue without needing to use the full-blown spliterator-based solution given in that blog, as I vaguely recalled that Sets.cartesianProduct
's javadoc gave a citation for the algorithm its implementation was based on. But looking back at said javadoc, I realise that I was mistaken!
Thus, using the blog you linked earlier sounds like the best way forward to me.
P.S. FYI, when I was scanning through the blog, I actually recognised the JDK bug report that the author opened regarding flatMap()
, for I saw a similar one being discussed on the OpenJDK mailing lists not too long ago. It turns out it's actually a duplicate of this bug, which is fixed already but only for Java 10+.
If the JUnit 5 team does use that blog as inspiration for resolving this issue, might it be worth opening another issue to test out the simpler flatMap()
-based solution in the future if or when JUnit 5+ depends on Java 10+?
For reference, my implementation is available at https://gist.github.com/jhorstmann/a7aba9947bc4926a75f6de8f69560c6e and actually looks very similar to the blog post, except it uses an iterator which is then wrapped into a spliterator. I can add any license if needed.
The cartesianProduct
method returning Stream<Arguments>
could already be implemented outside of junit. I'm thinking if we could make it even easier to use. Especially with a larger number of dimensions, mapping those to method arguments by index could be error prone. Especially if you'd want different subsets of dimensions for different test methods. Another approach could be to allow the different Source
annotations on method parameters
void test(@CsvFileSource(...) @CsvToPerson Person person,
@EnumSource(...) PaymentMethod paymentMethod,
@MethodSource(...) Address shippingAddress,
@MethodSource(...) Address billingAddress)
If I understand it correctly, the above annotations could also be used as meta-annotations, making the example even nicer. Different test methods could so easily use different sets of dimensions. Filtering of valid combinations could be done inside the method by using Assumptions
.
Interesting idea! However, it wouldn't support multiple parameters per source at all. For starters I think we should define the cartesianProduct
method in Arguments
.
@jhorstmann Would you like to submit a PR with your implementation?
Is there some advances on that matter? I am just looking for such a feature, actually.
Is there some advances on that matter? I am just looking for such a feature, actually.
Nothing official, but if I recall correctly, @sormuras created a proof of concept.
@sormuras ?
Aye. The POC wants to be integrated into JUnit Pioneer. See https://github.com/junit-pioneer/junit-pioneer/issues/68 for details -- which also links back to the current POC within our JUnit 5 Jupiter Extension Samples project.
Would be cool if s/o took over to make the integration/implementation happen.
I think a simple cartesianProduct
method would be easy to implement and provide a way to implement such use cases along with @MethodSource
without requiring any additional dependencies.
For example, I needed sth. like that here: https://github.com/marcphilipp/nexus-publish-plugin/blob/master/src/test/kotlin/de/marcphilipp/gradle/nexus/NexusPublishPluginTests.kt#L55-L61
I have a professional case (sorry I can't share it here) where I have many tests, including a bunch to be parametrized. These parameters build on the same few sets, but they are often different combinations (different size and order) of those sets. Rather than using a different function for each of them, I would rather provide the few annotations each of them needs, with the right sets in the right order. It would be way better to read than a double method every time.
@matthieu-vergne Could you please explain in more detail what you mean with "a double method every time"?
The test method + the argument provider method.
Based on internal team discussions, we are currently leaning toward one of the following programmatic APIs:
List<Arguments> Arguments.cartesianProduct(Set<?>... sets);
Or:
Stream<Arguments> Arguments.cartesianProduct(Set<?>... sets);
In other words, we aim to accept sets as input and generate the Cartesian product as an ordered collection (e.g., a
List
) orStream
ofArgument
tuples.
@sbrannen
I like this approach, but would prefer cartesianProduct(Collection<?> ... collections) as suggested by @marcphilipp previously. The reason being that when writing tests in Groovy, we'd have to do:
[false, true] as Set
etc. for each parameter. With Collection we won't have to add the as Set every time.
I'd be okay with a more loose definition based on Collection
if it doesn't make the implementation more complex.
If you can go this way, an Iterable
may be enough.
Is this new feature experimental?
I didn't see it in this list:
https://junit.org/junit5/docs/current/user-guide/#api-evolution-experimental-apis
No, that's just a sample project that shows how to accomplish this in an extension. It's not an officially supported feature (yet) since it's not as extensible as we'd like it to be.
Citing myself here:
Aye. The POC wants to be integrated into JUnit Pioneer. See junit-pioneer/junit-pioneer#68 for details -- which also links back to the current POC within our JUnit 5 Jupiter Extension Samples project.
Would be cool if s/o took over to make the integration/implementation happen.
It's still up-for-grabs. ;-)
@marcphilipp Thanks for the clarification.
Citing myself here:
Aye. The POC wants to be integrated into JUnit Pioneer. See junit-pioneer/junit-pioneer#68 for details -- which also links back to the current POC within our JUnit 5 Jupiter Extension Samples project. Would be cool if s/o took over to make the integration/implementation happen.
It's still up-for-grabs. ;-)
It seems it was not yet advertised: @Michael1993 brought the feature to life with junit-pioneer/junit-pioneer#321, and I'm proposing some additions in junit-pioneer/junit-pioneer#409 and junit-pioneer/junit-pioneer#416
This issue has been automatically marked as stale because it has not had recent activity. Given the limited bandwidth of the team, it will be automatically closed if no further activity occurs. Thank you for your contribution.
This issue has been automatically closed due to inactivity. If you have a good use case for this feature, please feel free to reopen the issue.
are there plans to integrate this into JUnit 5 itself?
There are no current plans but we may reconsider if the Pioneer extension gets widely used.
using cartesian test and saved so much complexity
See this StackOverflow post
@marcphilipp the extension by Pioneer is very nice. I would like to use it with our IDEA/Gradle integration test (test a number of scenarios with a number of Gradle versions), but it does not play well with JUnit5's ValueSources
and cannot provide a set of {Gradle versions} x {CSV values}
@nskvortsov when using the Pioneer extension, you're supposed to use the @CartesianTest
flavor of sources like @Values
, and not the JUnit ones.
For example, from their docs:
@CartesianTest
void myCartesianTestMethod(
@Values(ints = { 1, 2 }) int x,
@Values(ints = { 3, 4 }) int y) {
// passing test code
}
To the best of my knowledge, there is no support for CSV sources yet.
In the past, I raised junit-pioneer/junit-pioneer#416 (targeting the "old" version of the extension) but haven't had the capacity to bring the topic forward. You could chime in with some details about your use case.
This issue originates from a dicussion in https://github.com/junit-team/junit5/pull/1422#issuecomment-390582297.
While often it suffices to specify each set of arguments for the invocation of a parameterized test explicitly, there are cases where it's simply the combination of lists of potential values for each parameter, i.e. the cartesian product of these collections. While it's certainly already possible to compute these combinations manually, e.g. in a method referenced by
@MethodSource
, it would be more convenient if there were a declarative way to specify that.Recently, @sormuras has written a sample
CartesianProductProvider
extension which does that by implementingTestTemplateInvocationContextProvider
. However, by implementing this functionality as aArgumentsProvider
users would not have to learn a new concept and we could reuse the existing infrastructure.One possibility would be to use fields and methods, like the
Theories
runner from JUnit 4.@junit-team/junit-lambda Thoughts?
Deliverables
ArgumentsProvider
and related annotations