Closed martyn0ff closed 10 months ago
Why do you have .map that creates random velues? That is not reproducible, and I guess it is the reason for misbehavior. Try removing .map
@vlsi This was just a simplified illustration of a problem. I have some random data generator that randomly reads lines from several text files and constructs some kind of an object using these lines as an input. Say I want to generate a criteria for SQL query, but I only want values in a criteria that I know don't exist in database. What I would do is cache all existing values first then, run my homegrown generator with jqwik have it exclude such randomly generated values that don't exist in a cache and have them injected in a test.
As @vlsi wrote, it’s not clear why you‘re trying to inject self-made random values into an already random process. Since this is not reproducible it does not go along with what jqwik is doing. If you really need Random instances for creating your arbitrary values, use Arbitraries.randoms().map(..)
and go from there.
But I assume that, whatever you want to achieve with your approach, would be approached better with a different overall strategy.
@vlsi This was just a simplified illustration of a problem. I have some random data generator that randomly reads lines from several text files and constructs some kind of an object using these lines as an input. Say I want to generate a criteria for SQL query, but I only want values in a criteria that I know don't exist in database. What I would do is cache all existing values first then, run my homegrown generator with jqwik have it exclude such randomly generated values that don't exist in a cache and have them injected in a test.
I’m not fully understanding your goal, but what I think I get is that you want to generated random values that depend on other, previously generated random values. This calls for one of three approaches: flat mapping, flat combining or state-based testing (in order of complexity). To decide which one is best suited for your problem requires a deeper dive into your domain.
@jlink Sorry, the test I gave initially maybe wasn't the best illustration of wht I'm doing. I've tried to be a little more concrete:
public class SampleTest {
Set<String> dontProvide = Set.of("a", "b");
Supplier<String> gen = () -> {
Random random = new Random();
int anInt = random.nextInt(6);
if (anInt == 1) return "a";
if (anInt == 2) return "b";
if (anInt == 3) return "c";
if (anInt == 4) return "d";
if (anInt == 5) return "e";
return "f";
};
@Provide
Arbitrary<String> foo() {
return Arbitraries
.integers()
.between(1, 100)
.map(anInt -> gen.get())
.filter(s -> !dontProvide.contains(s));
}
@Property
void foo(@ForAll("foo") String foo) {
assertThat(dontProvide).doesNotContain(foo);
}
}
Now I'm even more confused, because this test does not fail. How come? Aren't those tests similar?
I'm not going crazy after all. Check this out:
public class SampleTest {
@Provide
Arbitrary<Integer> failing_even() {
Random random = new Random();
return Arbitraries.integers()
.between(1, 1001) // range >= 1000
.map(anInt -> random.nextInt())
.filter(randomInt -> randomInt % 2 == 0);
}
@Provide
Arbitrary<Integer> passing_even() {
Random random = new Random();
return Arbitraries.integers()
.between(1, 1000) // range < 1000
.map(anInt -> random.nextInt())
.filter(randomInt -> randomInt % 2 == 0);
}
@Property
void failing_even(@ForAll("failing_even") int subject) {
assertThat(subject).isEven();
}
@Property
void passing_even(@ForAll("passing_even") int subject) {
assertThat(subject).isEven();
}
}
This is what I've noticed: as soon as a range of generated integers exceeds 1000
, filter misbehaves. This is the default number of tries jqwik does. If we bump a number of tries for failing_even()
such that it is greater than the range, then it works again as expected. Is this expected behavior?
The reason that > 1000 tries makes a difference is that jqwik will go for an exhaustive test (trying all possible values instead of randomized values) if it can calculate that there are at most as many values as tries for a property. You can specify the generation mode explicitly:
@Property(generation = GenerationMode.RANDOMIZED)
void even_now_also_fails (@ForAll("passing_even") int subject) {
assertThat(subject).isEven();
}
That explains the difference in behaviour but not why the randomized version fails. The reason for that is a bit more subtle and related to an implementation hack of jqwik: generator instances are cached depending on the values they depend upon. Without this hack, some value generation would be much much slower. In normal circumstances this caching works reasonably well and most edge cases have been covered meanwhile. Adding external randomness into the generation mechanism sadly breaks some of the assumptions behind caching. You could call that a bug, but it's a bug that cannot be easily fixed IMO.
To make a long story short: Always use jqwik's randomness for value generation. How that would look in your case I cannot say without more information. To build on your first example which randomly selects a value from a set of strings you'd use
@Provide
Arbitrary<String> foo() {
var valuesToChooseFrom = Set.of("a", "b", "c", "d", "e");
return Arbitraries.oneOf(valuesToChooseFrom)
.filter(s -> !dontProvide.contains(s));
}
valuesToChooseFrom
could as well be read from a file or be computed from the contents of a database.
@jlink Thanks for great tips and explanations. Letting jqwik manage randomness makes a ton of sense and I was able to find a satisfactory solution. My generator used a Random
instance that it shared among its various methods. First I've attempted to kick off generation with randomValue(random -> new MyGenerator(random))
followed by mapping resulting generator instance into values that I wanted. Ultimately it led to the same problem of filter(...)
doing its job but test method ending up with incorrect values. I've redesigned my generator such that each of its methods takes an instance of Random
instead and now randomValue(...)
cooperates.
I think I never considered the case of randomValue(random -> ...)
when implementing generator caching. Or I just forgot.
BTW, generator caching is not the culprit in this case. Most probably it's the lazy evaluation mechanism of filtering / shrinkables. So I might be able to fix it for special cases but currently I'd say it's not worth the effort since randomisation from the outside comes with so many other problems.
Testing Problem
I'm confused with creating an
Arbitrary
. Below is the simplified case of what I'm trying to achieve:What ends up happening is that
subject
integer and an integer logged infilter()
call are different. I don't get what I'm doing wrong and can't figure it out. Appreciate any help!