jqwik-team / jqwik

Property-Based Testing on the JUnit Platform
http://jqwik.net
Eclipse Public License 2.0
578 stars 64 forks source link

chore: add github-actions-random-matrix to increase test coverage in CI #460

Closed vlsi closed 1 year ago

vlsi commented 1 year ago

Fixes https://github.com/jlink/jqwik/issues/350


I hereby agree to the terms of the jqwik Contributor Agreement.

vlsi commented 1 year ago

Looks like the jobs are properly generated. The remaining bit is to check if locale/timezone properties are passed to Test tasks properly.

vlsi commented 1 year ago

@jlink , it looks like the matrix uncovered some test issues:

https://github.com/jlink/jqwik/actions/runs/4135921237/jobs/7149048014#step:4:225

WDYT?

FAILURE   0,0sec, net.jqwik.time.api.dates.date.ConstraintTests$Constraints > monthRangeBetweenMarchAndJuly
    java.lang.AssertionError: 
    Expecting actual:
      FEBRUARY
    to be greater than or equal to:
      MARCH

        at net.jqwik.time.api.dates.date.ConstraintTests$Constraints.monthRangeBetweenMarchAndJuly(ConstraintTests.java:48)
FAILURE   0,0sec, net.jqwik.time.api.dates.date.ConstraintTests$Constraints > dayOfMonthRangeBetween15And20Integer
    java.lang.AssertionError: 
    Expecting actual:
      14
    to be greater than or equal to:
      15

        at net.jqwik.time.api.dates.date.ConstraintTests$Constraints.dayOfMonthRangeBetween15And20Integer(ConstraintTests.java:54)
AILURE   0,0sec, net.jqwik.time.api.dates.date.ConstraintTests$Constraints > dayOfWeekRangeOnlyMonday
    org.opentest4j.AssertionFailedError: 
    expected: MONDAY
     but was: SUNDAY
        at net.jqwik.time.api.dates.date.ConstraintTests$Constraints.dayOfWeekRangeOnlyMonday(ConstraintTests.java:60)
FAILURE   0,0sec, net.jqwik.time.api.dates.date.ConstraintTests$Constraints > dayOfWeekRangeOnlySunday
    org.opentest4j.AssertionFailedError: 
    expected: SUNDAY
     but was: SATURDAY
        at net.jqwik.time.api.dates.date.ConstraintTests$Constraints.dayOfWeekRangeOnlySunday(ConstraintTests.java:65)
FAILURE   0,0sec, net.jqwik.time.api.dates.date.ConstraintTests$Constraints > yearRangeBetween500And700
    java.lang.AssertionError: 
    Expecting actual:
      499
    to be greater than or equal to:
      500

        at net.jqwik.time.api.dates.date.ConstraintTests$Constraints.yearRangeBetween500And700(ConstraintTests.java:42)
FAILURE   0,0sec, net.jqwik.time.api.dates.date.ConstraintTests$Constraints > dayOfWeekRangeBetweenTuesdayAndFriday
    java.lang.AssertionError: 
    Expecting actual:
      MONDAY
    to be greater than or equal to:
      TUESDAY

        at net.jqwik.time.api.dates.date.ConstraintTests$Constraints.dayOfWeekRangeBetweenTuesdayAndFriday(ConstraintTests.java:71)
jlink commented 1 year ago

@jlink , it looks like the matrix uncovered some test issues:

https://github.com/jlink/jqwik/actions/runs/4135921237/jobs/7149048014#step:4:225

WDYT?

FAILURE   0,0sec, net.jqwik.time.api.dates.date.ConstraintTests$Constraints > monthRangeBetweenMarchAndJuly
    java.lang.AssertionError: 
    Expecting actual:
      FEBRUARY
    to be greater than or equal to:
      MARCH

        at net.jqwik.time.api.dates.date.ConstraintTests$Constraints.monthRangeBetweenMarchAndJuly(ConstraintTests.java:48)

Which leads to the question: How can I replicate a matrix-dependent bug locally? Or get enough information during the CI run to analyse the problem?

vlsi commented 1 year ago

How can I replicate a matrix-dependent bug locally?

See "Steps to reproduce" step: https://github.com/jlink/jqwik/actions/runs/4136772168/jobs/7151093679#step:4:18

Or get enough information during the CI run to analyse the problem?

There might be other variables (e.g. jqwik random seed), so it might be a good idea to improve jqwik failure messages to include seed or even a full command to reproduce the failure. However, that is somewhat out of the scope of matrix randomization.

vlsi commented 1 year ago

Looks like sameHash reproduces javac crash: https://github.com/jlink/jqwik/actions/runs/4142264704/jobs/7162736133#step:5:365

D:\a\jqwik\jqwik\documentation\src\test\java\net\jqwik\docs\DataDrivenPropertyExamples.java:13: error: incompatible types: inference variable T1#1 has incompatible bounds
        return Table.of(
                       ^
    equality constraints: Integer,T2#2,T2#1,T2#2,T2#2,T2#2,T1#2,T1#2,T1#2,T1#2
    lower bounds: Integer,String
  where T1#1,T2#1,T2#2,T1#2 are type-variables:
    T1#1 extends Object declared in method <T1#1,T2#1>of(Tuple2<T1#1,T2#1>...)
    T2#1 extends Object declared in method <T1#1,T2#1>of(Tuple2<T1#1,T2#1>...)
    T2#2 extends Object declared in method <T1#2,T2#2>of(T1#2,T2#2)
    T1#2 extends Object declared in method <T1#2,T2#2>of(T1#2,T2#2)
D:\a\jqwik\jqwik\documentation\src\test\java\net\jqwik\docs\state\mystore\MyStore.java:38: error: no suitable method found for collect(Collector<Object,CAP#1,Collection<Object>>)
        return tuples.stream().map(Tuple1::get1).collect(Collectors.toCollection(LinkedHashSet::new));
                                                ^
    method Stream.<R#1>collect(Supplier<R#1>,BiConsumer<R#1,? super K>,BiConsumer<R#1,R#1>) is not applicable
      (cannot infer type-variable(s) R#1
        (actual and formal argument lists differ in length))
    method Stream.<R#2,A>collect(Collector<? super K,A,R#2>) is not applicable
      (inference variable C has incompatible bounds
        equality constraints: Object,R#2,K
        lower bounds: Object,Collection<T#2>)
  where R#1,T#1,K,R#2,A,C,T#2 are type-variables:
    R#1 extends Object declared in method <R#1>collect(Supplier<R#1>,BiConsumer<R#1,? super T#1>,BiConsumer<R#1,R#1>)
    T#1 extends Object declared in interface Stream
    K extends Object declared in class MyStore
    R#2 extends Object declared in method <R#2,A>collect(Collector<? super T#1,A,R#2>)
    A extends Object declared in method <R#2,A>collect(Collector<? super T#1,A,R#2>)
    C extends Collection<T#2> declared in method <T#2,C>toCollection(Supplier<C>)
    T#2 extends Object declared in method <T#2,C>toCollection(Supplier<C>)
  where CAP#1 is a fresh type-variable:
    CAP#1 extends Object from capture of ?

WDYT?

Do you think you would prefer trying workaround in jqwik code? I guess adding explicit type parameters might help.

Do you think you would prefer ignoring the bug and pass -XX:hashCode=2 to the test tasks only?

jlink commented 1 year ago

How can I replicate a matrix-dependent bug locally?

See "Steps to reproduce" step: https://github.com/jlink/jqwik/actions/runs/4136772168/jobs/7151093679#step:4:18

This one can be reduced to using time zone "America/New_York". The approach clearly needs shrinking ;-)

vlsi commented 1 year ago

It might be extra fun to generate a series of CI jobs that execute only in case the original ones fail, and which effectively do a binary search over the set of extra added properties 🤯

An extra bonus would go for generating the shrinker sequence depending on the state of the previous jobs.

Of course, the jobs can't probably spawn in a for-loop, however, there might be a predefined sequence of "generate next parameters ShrinkStage1" (via matrix-shrink.js) "CI jobs of ShrinkStage1" "generate next parameters ShrinkStage2" (via matrix-shrink.js) "CI jobs of ShrinkStage2" "generate next parameters ShrinkStage3" (via matrix-shrink.js) "CI jobs of ShrinkStage3"

In that case, 3-4 stages should be enough to shrink up to 16 flags.

On the other hand, it is quite clear that failure dates.date.ConstraintTests$Constraints is probably caused by a timezone setting :)

jlink commented 1 year ago

Do you think you would prefer ignoring the bug and pass -XX:hashCode=2 to the test tasks only?

probably the easiest fix. I wonder if it's a java 18 problem only. If so, we could exclude 18 from the matrix - it's used so rarely. SDKMAN has no java 18 in its list anymore.

vlsi commented 1 year ago

bably the easiest fix. I wonder if it's a java 18 problem only

It might be https://bugs.openjdk.org/browse/JDK-8288590 If that is the case, then the bug is reproducible with OpenJDK 17+

jlink commented 1 year ago

If that is the case, then the bug is reproducible with OpenJDK 17+

It seems to depend on the exact version of 17. 17.0.4-amzn shows the problem 17.0.6-amzn does not.

jlink commented 1 year ago

@vlsi When do you think this PR could be ready for merging? I'd like to transfer the repo to jqwik-team org. It wouldn't break the PR, but you'd have to change upstream repo.

vlsi commented 1 year ago

It looks like there's hashcode javac issue left, and I think we'd need to reduce the number of jobs before merge :) I bumped the number of generated jobs just to see if they more-or-less work.

jlink commented 1 year ago

It looks like there's hashcode javac issue left,

With 17.0.4-amzn just having "-XX:+UnlockExperimentalVMOptions" suffices for reproducing the problem. And I haven't found a "hashCode=x" setting that gets rid of it.

vlsi commented 1 year ago

With 17.0.4-amzn just having "-XX:+UnlockExperimentalVMOptions" suffices for reproducing the problem.

That is interesting. I have not tried -XX:+UnlockExperimentalVMOptions alone.

vlsi commented 1 year ago

It seems to depend on the exact version of 17. 17.0.4-amzn shows the problem 17.0.6-amzn does not.

Then I suggest:

If you absolutely want testing with Java 18 (or 19, 20), I would recommend going for toolchains, so you can run Gradle with Java 17, and use 19 (or even 20-ea) for testing.

jlink commented 1 year ago

If you absolutely want testing with Java 18 (or 19, 20)

Well, the latest Java release should be tested.

vlsi commented 1 year ago

Well, the latest Java release should be tested.

None of the vendors treat 18 as supported. What do you mean by "latest Java"?

vlsi commented 1 year ago

@jlink , one more failure:

https://github.com/jlink/jqwik/actions/runs/4164699124/jobs/7206696255

Picked up _JAVA_OPTIONS: -Duser.country=RU -Duser.language=ru
net.jqwik.engine.properties.arbitraries.DefaultStringArbitraryTests$Coverage > randomStringsShouldContainZeroChar failure marker
FAILURE   0,0sec, net.jqwik.engine.properties.arbitraries.DefaultStringArbitraryTests$Coverage > randomStringsShouldContainZeroChar
    org.opentest4j.AssertionFailedError: Count of 10 for true does not fulfill condition for label "contains 0"
        at app//net.jqwik.engine.hooks.statistics.StatisticsCollectorImpl$CoverageCheckerImpl.fail(StatisticsCollectorImpl.java:287)
        at app//net.jqwik.engine.hooks.statistics.StatisticsCollectorImpl$CoverageCheckerImpl.failCondition(StatisticsCollectorImpl.java:283)
        at app//net.jqwik.engine.hooks.statistics.StatisticsCollectorImpl$CoverageCheckerImpl.count(StatisticsCollectorImpl.java:232)
        at app//net.jqwik.engine.properties.arbitraries.DefaultStringArbitraryTests$Coverage.lambda$randomStringsShouldContainZeroChar$1(DefaultStringArbitraryTests.java:286)
vlsi commented 1 year ago

It seems to depend on the exact version of 17. 17.0.4-amzn shows the problem 17.0.6-amzn does not.

It looks like currently Gradle caches the outcome of compileJava task, and it does not really include _JAVA_OPTIONS into the cache key. That is why you might get confusing results like "a mere adding UnlockExperimentalVMOptions makes it fail".

Let me see if I can add _JAVA_OPTIONS into the default input of javaexec tasks.

jlink commented 1 year ago

@jlink , one more failure:

https://github.com/jlink/jqwik/actions/runs/4164699124/jobs/7206696255

Picked up _JAVA_OPTIONS: -Duser.country=RU -Duser.language=ru
net.jqwik.engine.properties.arbitraries.DefaultStringArbitraryTests$Coverage > randomStringsShouldContainZeroChar failure marker
FAILURE   0,0sec, net.jqwik.engine.properties.arbitraries.DefaultStringArbitraryTests$Coverage > randomStringsShouldContainZeroChar
    org.opentest4j.AssertionFailedError: Count of 10 for true does not fulfill condition for label "contains 0"
        at app//net.jqwik.engine.hooks.statistics.StatisticsCollectorImpl$CoverageCheckerImpl.fail(StatisticsCollectorImpl.java:287)
        at app//net.jqwik.engine.hooks.statistics.StatisticsCollectorImpl$CoverageCheckerImpl.failCondition(StatisticsCollectorImpl.java:283)
        at app//net.jqwik.engine.hooks.statistics.StatisticsCollectorImpl$CoverageCheckerImpl.count(StatisticsCollectorImpl.java:232)
        at app//net.jqwik.engine.properties.arbitraries.DefaultStringArbitraryTests$Coverage.lambda$randomStringsShouldContainZeroChar$1(DefaultStringArbitraryTests.java:286)

Probably a flaky test.

jlink commented 1 year ago

What do you mean by "latest Java"?

The latest release. Currently it's 19. Will be 20 in March and so on...

vlsi commented 1 year ago

The latest release. Currently it's 19. Will be 20 in March and so on...

It might be that Gradle that supports 20 a bit after 20 release, so it would make sense to go for toolchains for test purposes to decouple JVM for Gradle from JVM for tests.

vlsi commented 1 year ago

"same hashcode" defeats Java 8 compiler as well, so I see the following options:

a) Adjust code to prevent failures. For instance, add explicit type arguments. It looks like there are ~10 places or so. a) Pass "same hashcode" (e.g. -XX:hashCode=2) property to test tasks only. Frankly speaking, I do not want to figure out the way to do it with Groovy, so I would like avoid touching build.gradle files.

vlsi commented 1 year ago

Currently, I completely removed -XX:hashCode=2 from the test matrix, so it won't detect cases when the implementation silently relies on System.hashCode uniqueness.

vlsi commented 1 year ago

I think this is mergeable

jlink commented 1 year ago

a) Adjust code to prevent failures. For instance, add explicit type arguments. It looks like there are ~10 places or so.

I tried at some of the places and it didn't make the problem disappear. I don't understand why this should be necessary in the first place. The compiler should be able to figure it out. But I haven't dived too deeply into the topic.

jlink commented 1 year ago

@vlsi Thanks. Let's see what bugs it's going to detect.

vlsi commented 1 year ago

Kotlin compiler 1.8.0 is slow when executed with XX:hashCode=2, so I'm inclined "same hashcode" should be passed to tests only https://github.com/pgjdbc/pgjdbc/pull/2821#issuecomment-1436013284 🤷

vlsi commented 1 year ago

Here's the PR for pgjdbc: https://github.com/pgjdbc/pgjdbc/pull/2822

jlink commented 1 year ago

@vlsi I assume something in the matric creation does not work as expected: https://github.com/jqwik-team/jqwik/actions/runs/4281449836/jobs/7454493009

vlsi commented 1 year ago

Microsoft does not release Java 19, and they do not support Java 19, so you can't test with Microsoft and Java 19 at the same time.

Here are the exclusions:

https://github.com/jqwik-team/jqwik/blob/42116ed7448fd03e8225123e104c46066534bbfb/.github/workflows/matrix.js#L95-L97

I suggest adding 19 to the list.