hector-client / hector

a high level client for cassandra
http://prettyprint.me/2010/02/23/hector-a-java-cassandra-client/
MIT License

Make LeastActiveBalancingPolicy.ShufflingCompare comparisons stable #646

Closed jancona closed 10 years ago

jancona commented 10 years ago

This avoids a problem where mutable state in ConcurrentHClientPool causes

IllegalArgumentException: Comparison method violates its general contract!

failures when running in Java 7.

The default sort algorithm changed in Java 7 (see this blog post for details and links). The new TimSort algorithm is pickier about Comparable and Comparator implementations honoring their contract. Because ConcurrentHClientPool.getNumActive can return different values while the sort is in progress, it appears that this can trigger the error.
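To make the failure mode concrete, here is a minimal sketch of a comparator that reads a live counter. FakePool and LIVE are hypothetical stand-ins, not Hector classes, and a single-threaded setNumActive call stands in for concurrent activity. It only shows that the same pair can order differently between two calls, which is the kind of inconsistency TimSort may detect; it does not try to reproduce the exception itself.

```java
import java.util.Comparator;
import java.util.concurrent.atomic.AtomicInteger;

public class UnstableComparatorSketch {

    /** Hypothetical stand-in for a pool whose active count changes concurrently. */
    public static final class FakePool {
        private final AtomicInteger numActive;

        public FakePool(int initial) {
            this.numActive = new AtomicInteger(initial);
        }

        public int getNumActive() {
            return numActive.get();
        }

        public void setNumActive(int n) {
            numActive.set(n);
        }
    }

    /** Reads the live counters on every call, so its answers can change over time. */
    public static final Comparator<FakePool> LIVE = new Comparator<FakePool>() {
        @Override
        public int compare(FakePool a, FakePool b) {
            return Integer.compare(a.getNumActive(), b.getNumActive());
        }
    };

    public static void main(String[] args) {
        FakePool a = new FakePool(1);
        FakePool b = new FakePool(5);

        int before = LIVE.compare(a, b); // negative: a is less loaded than b
        a.setNumActive(9);               // simulates another thread borrowing connections
        int after = LIVE.compare(a, b);  // positive: the same pair now orders the other way

        System.out.println("before=" + before + ", after=" + after);
    }
}
```

If the count changes between two comparisons inside a single sort, the sort is effectively working with a comparator that contradicts itself.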

We ran into this issue immediately after deploying one of our apps to production on Java 7. We have worked around it by setting the system property java.util.Arrays.useLegacyMergeSort=true. We did not see the issue in our pre-production environments. Looking at the Java 7 source, the new algorithm is only used when the list to be sorted has 32 or more elements. Our production cluster has 36 hosts in the list, while our pre-production clusters are much smaller.
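For reference, the property would normally be passed on the JVM command line as -Djava.util.Arrays.useLegacyMergeSort=true. My understanding is that the JDK reads it only once, so setting it at launch is safer than calling System.setProperty late in startup. The check below is only a sketch (LegacySortFlagCheck and its method are made-up names) of how an app could log at startup whether the workaround is in effect.

```java
public class LegacySortFlagCheck {

    /** Name of the JDK system property used for the workaround. */
    public static final String LEGACY_SORT_PROPERTY = "java.util.Arrays.useLegacyMergeSort";

    /** Returns true if the property is currently set to "true". */
    public static boolean isLegacyMergeSortRequested() {
        return Boolean.getBoolean(LEGACY_SORT_PROPERTY);
    }

    public static void main(String[] args) {
        // Print the flag so it shows up in the startup log.
        System.out.println(LEGACY_SORT_PROPERTY + "=" + isLegacyMergeSortRequested());
    }
}
```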

I've included a test which shows the error when run against the old ShufflingCompare implementation in Java 7. It passes with this new implementation, or with the old one under Java 6.

My solution was to cache the compared values for the duration of the sort. If someone has a better idea, I'm all ears.
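Roughly, the caching idea looks like the sketch below. This is not the code in the commit: Pool and sortByLeastActive are hypothetical names, and the snapshot map is just one way to freeze the values. Each counter is read exactly once before sorting, so the comparator only ever sees fixed numbers.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

public class SnapshotSortSketch {

    /** Hypothetical stand-in for a connection pool whose load changes concurrently. */
    public static final class Pool {
        private final String name;
        private final AtomicInteger active;

        public Pool(String name, int initialActive) {
            this.name = name;
            this.active = new AtomicInteger(initialActive);
        }

        public String getName() { return name; }
        public int getNumActive() { return active.get(); }
        public void setNumActive(int n) { active.set(n); }
    }

    /** Returns a copy of the list sorted by active count, reading each count once. */
    public static List<Pool> sortByLeastActive(List<Pool> pools) {
        // Snapshot the mutable value once per pool before sorting starts.
        final Map<Pool, Integer> snapshot = new IdentityHashMap<Pool, Integer>();
        for (Pool p : pools) {
            snapshot.put(p, p.getNumActive());
        }
        List<Pool> sorted = new ArrayList<Pool>(pools);
        Collections.sort(sorted, new Comparator<Pool>() {
            @Override
            public int compare(Pool a, Pool b) {
                // Compare the frozen values, never the live counters.
                return snapshot.get(a).compareTo(snapshot.get(b));
            }
        });
        return sorted;
    }

    public static void main(String[] args) {
        List<Pool> pools = new ArrayList<Pool>();
        pools.add(new Pool("a", 3));
        pools.add(new Pool("b", 1));
        pools.add(new Pool("c", 2));
        for (Pool p : sortByLeastActive(pools)) {
            System.out.println(p.getName() + " " + p.getNumActive());
        }
    }
}
```

The trade-off is that the ordering reflects the load at snapshot time rather than at the end of the sort, which seems acceptable for a balancing policy that re-sorts on every request.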

I'd like to see this backported to the 1.0 branch if possible. The commit should apply cleanly.