cucumber / cucumber-jvm

Cucumber for the JVM
https://cucumber.io
MIT License
2.7k stars 2.02k forks source link

IncrementingUuidGeneratorTest fails intermittently #2851

Closed mpkorstanje closed 7 months ago

mpkorstanje commented 7 months ago

🤔 What's the problem you've observed?

Build 7974811136 failed with:

[INFO] Running io.cucumber.core.eventbus.IncrementingUuidGeneratorTest
[ERROR] Tests run: 5, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.047 s <<< FAILURE! -- in io.cucumber.core.eventbus.IncrementingUuidGeneratorTest
[ERROR] io.cucumber.core.eventbus.IncrementingUuidGeneratorTest.different_classloaders_generators -- Time elapsed: 0.016 s <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <10> but was: <9>
    at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
    at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
    at org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161)
    at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:632)
    at io.cucumber.core.eventbus.IncrementingUuidGeneratorTest.checkUuidProperties(IncrementingUuidGeneratorTest.java:144)
    at io.cucumber.core.eventbus.IncrementingUuidGeneratorTest.different_classloaders_generators(IncrementingUuidGeneratorTest.java:99)
    at java.base/java.lang.reflect.Method.invoke(Method.java:580)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)

📚 Any additional context?

The IncrementingUuidGeneratorTest was added https://github.com/cucumber/cucumber-jvm/pull/2703.

https://github.com/cucumber/cucumber-jvm/blob/edf09f1fbabf0a9bb2cb0d3abab59eb52875585f/cucumber-core/src/test/java/io/cucumber/core/eventbus/IncrementingUuidGeneratorTest.java#L92-L99

https://github.com/cucumber/cucumber-jvm/blob/edf09f1fbabf0a9bb2cb0d3abab59eb52875585f/cucumber-core/src/test/java/io/cucumber/core/eventbus/IncrementingUuidGeneratorTest.java#L163-L172

So I'm guessing something is either racing, or not being marked volatile correctly here:

https://github.com/cucumber/cucumber-jvm/blob/edf09f1fbabf0a9bb2cb0d3abab59eb52875585f/cucumber-core/src/main/java/io/cucumber/core/eventbus/IncrementingUuidGenerator.java#L49-L56

mpkorstanje commented 7 months ago

@jkronegg do you fancy a look?

jkronegg commented 7 months ago

The IncrementingUuidGenerator is almost collision-free in different classloaders / JVMs because of the CLASSLOADER_ID property being initialized with a random number. I think we had the back luck that two classloaders have generated the same random number (or at least the 12 lower bits of the random number). I'll think to a replacement (maybe combining random number and sequence number).

jkronegg commented 7 months ago

I confirm this is caused by the CLASSLOADER_ID property. As one part of the UUID is comes from the classloader and is generated as a random number, UUID collisions may occurs (this behavior is known since the design time).

I reproduced the issue by generating UUIDs with multiple classloaders, in order to determine the collision probability depending on the number of classloaders. When there are 10 classloaders (as in the different_classloaders_generators test case), the probability of collision is 0.6% (basically, the build will fails every ~200 runs). Of course the probability increases with the number of classloaders, so that there is a 99% UUID collision probability when there are about 200 classloaders.

Note that the cumulated histogram is a bit misleading: it gives the collision rate for the CLASSLOADER_ID property, but of course the collision rate on UUID is much lower due to the other UUID parts (epoch time and counters).

classloader_collision

I can make the test-case more flexible by checking that the UUID collision rate is lower than a threshold (lets say 1% when there are 10 classloaders), rather to be collision-free as in the current implementation. This would better correspond to the almost collision-free in different classloaders / JVMs statement from the IncrementingUuidGenerator javadoc. And also the documentation must be updated to give some figures about collision rate.

Something like this:

/**
 * Check that collision rate is lower than a given threshold when using multiple classloaders.
 * Note: this test takes about 20 seconds.
 */
@Test
void collision_rate_lower_than_two_percents_with_ten_classloaders() throws NoSuchFieldException, IllegalAccessException {
    // When I compute the UUID collision with multiple classloaders
    Set<Long> classloaderIds = new HashSet<>();
    List<Integer> stats = new ArrayList<>();
    while (stats.size()<100) {
        if (!classloaderIds.add(getStaticFieldValue(getUuidGeneratorFromOtherClassloader(), "CLASSLOADER_ID"))) {
            stats.add(classloaderIds.size()+1);
            classloaderIds.clear();
        }
    }

    // Then the collision probability for 10 classloaders is less than 2%
    double collisionProbabilityWhenUsingTenClassloaders = stats.stream()
            .filter(x -> x<10).count() * 100 / (double) stats.size();
    assertTrue(collisionProbabilityWhenUsingTenClassloaders <= 2);
}

@mpkorstanje what do you think ? (also if you have an idea of a better classloader "quasi-unique" identifier on 12 bits, I'll buy it)

jkronegg commented 7 months ago

I made some tests to generate a better CLASSLOADER_ID (epoch-time, classloader hash code, combinations of methods), but the current one (random) gives always the lowest CLASSLOADER_ID collision rate.

One option is to provide a static method setClassloaderId(int) which can be used to define a custom CLASSLOADER_ID. If the developer uses multiple classloaders, he may control the classloader creation and could define an unique identifier for the classloader (e.g. using a counter). This way, the 12 bits of the CLASSLOADER_ID gives 4096 possibilities, so it's only at the 4097th classloader that he would have collision on the CLASSLOADER_ID.

I implemented this option, and it works well (the test-case in my previous comment gives 0% collision after 100 trials, but of course must run much longer because it takes 100 x 4097 classloaders to complete the test).

I'll provide a PR which corrects the failing test-case and improve the IncrementingUuidGenerator behavior when using multiple classloaders.

mpkorstanje commented 7 months ago

Cheers! I'll get back to this. My to-do list just blew up.

jkronegg commented 7 months ago

Ensuring that CLASSLOADER_ID has a collision rate lower than 2% for 10 classloaders is not sufficient:

[ERROR] io.cucumber.core.eventbus.IncrementingUuidGeneratorTest.classloaderid_collision_rate_lower_than_two_percents_with_ten_classloaders -- Time elapsed: 1.[644](https://github.com/cucumber/cucumber-jvm/actions/runs/8005337466/job/21864531892?pr=2853#step:5:645) s <<< FAILURE!
  org.opentest4j.AssertionFailedError: expected: <true> but was: <false>
      at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
      at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
      at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
      at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
      at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)
      at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:183)
      at io.cucumber.core.eventbus.IncrementingUuidGeneratorTest.classloaderid_collision_rate_lower_than_two_percents_with_ten_classloaders(IncrementingUuidGeneratorTest.java:204)
      at java.base/java.lang.reflect.Method.invoke(Method.java:580)
      at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)
      at java.base/java.util.ArrayList.forEach(ArrayList.java:1596)

My guess is that, since it's random numbers, you have a "statistical guarantee" to have 1-2% but occasionally (bad luck), you could have a lot of collisions. So the two solutions I see are: increasing the threshold (let's say 5%) or increase the sample size (e.g. 200, but this will increase the test duration).

mpkorstanje commented 7 months ago

How about a retry (max 100) until passed?