Closed esfomeado closed 1 year ago
@esfomeado Thanks for raising that issue.
I see why you‘d expect that repeatChars(0)
to not create any duplicate chars at all. Currently it just sets the probability of explicitly adding duplicate chars to 0. The effect you see is due to edge case resolution.
I’ll have to think about if changing the behaviour or documentation of repeatChars
is the right thing to do. Maybe an additional uniqueChars()
would be even clearer.
To solve your problem short-term, you can use [withoutEdgeCases()](https://jqwik.net/docs/1.7.4/javadoc/net/jqwik/api/Arbitrary.html#withoutEdgeCases()) on the strings arbitrary:
Arbitraries.strings().alpha()
.ofMinLength(5).ofMaxLength(25)
.repeatChars(0.0)
.withoutEdgeCases();
This will prevent „AAAAA“ to be generated on purpose. Duplicate chars will still occur by chance though.
If you definitely want no duplicates, you’d currently have to take a detour through something like:
Arbitraries.chars().alpha().list()
.ofMinSize(5).ofMaxSize(25)
.uniqueElements()
.map(chars -> chars.stream().map(String::valueOf).collect(Collectors.joining()));
I decided to introduce an experimental new API method: StringArbitrary.uniqueChars()
The open question is if StringArbitrary.repeatChars(0.0)
should log a warning or just call uniqueChars
implicitly, or both?
@esfomeado What do you think?
Thanks for the quick response.
@jlink Personally I would expect to have the same behaviour was uniqueChars
when set to 0 so I would say to just call uniqueChars
implicitly.
The StringArbitrary.uniqueChars()
functionality and the fix for repeatChars(0.0)
is available in 1.8.0-SNAPSHOT
.
In addition, there's now an annotation @UniqueChars
that can be applied to any String
parameter in a property.
New features available in 1.8.0-SNAPSHOT
Testing Problem
I had an issue where jqwik is generating duplicated values frequently which, for my example should almost never occur.
I have noticed that these duplicated values are always the same character e,g,
AAAAAA
so I decided to set therepeatedChars
to 0 but this type of values are still generated. I have tried the other values but seems that there is no impact on the ammount of time a repeated char is used.