Hi there,

a problem with the computation of our hash keys generated very frequent collisions, and thus bigger buckets. We then hit an exception when using UnifiedSet.retainAll().
I had a look at the code and found the cause: if i > 3, then ChainedBucket.remove(i) calls removeLongChain(i - 3), but the switch statement in removeLongChain throws an AssertionError if i - 3 > 3, because its default case does not descend into the next bucket. Removing an object from the third chained bucket is thus not possible. It probably did not surface until now, because such heavy collisions are unlikely to happen.
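To make the index arithmetic concrete, here is a small standalone sketch (my own illustration, with a hypothetical class name, not GS Collections code) of how the element index walks down the chain: each ChainedBucket holds three elements in slots zero/one/two and uses slot three to link to the next bucket, so each chain hop subtracts 3 from the index.

```java
// Hypothetical illustration of the bucket-index arithmetic in UnifiedSet's
// chained buckets; ChainDepthDemo is my own name, not part of the library.
public class ChainDepthDemo
{
    public static void main(String[] args)
    {
        int i = 8;      // index of the 9th colliding element, as in the reproducer
        int hops = 0;   // how many times we follow a bucket's "three" link

        // Mirrors remove(i) -> removeLongChain(i - 3): each hop peels off 3.
        while (i > 3)
        {
            i -= 3;
            hops++;
        }

        // Two hops land us in the third bucket with local index 2; the unpatched
        // switch only handles a single hop, so this case hits the AssertionError.
        System.out.println("hops=" + hops + " localIndex=" + i);  // prints "hops=2 localIndex=2"
    }
}
```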
Here is a "minimal change" fix:
    private void removeLongChain(ChainedBucket oldBucket, int i)
    {
        do
        {
            ChainedBucket bucket = (ChainedBucket) oldBucket.three;
            switch (i)
            {
                case 0:
                    bucket.zero = bucket.removeLast(0);
                    return;
                case 1:
                    bucket.one = bucket.removeLast(1);
                    return;
                case 2:
                    bucket.two = bucket.removeLast(2);
                    return;
                case 3:
                    if (bucket.three instanceof ChainedBucket)
                    {
                        i -= 3;
                        oldBucket = bucket;
                        continue;
                    }
                    bucket.three = null;
                    return;
                default:
                    // new code starts here (similar to case 3)
                    if (bucket.three instanceof ChainedBucket)
                    {
                        i -= 3;
                        oldBucket = bucket;
                        continue;
                    }
                    // new code ends here
                    throw new AssertionError();
            }
        }
        while (true);
    }
Here is a small example that reproduces the problem (using some of the strings with colliding hashes from issue #26):
    package de.derivo;

    import com.gs.collections.impl.set.mutable.UnifiedSet;

    import java.util.Set;

    public class Main
    {
        static final String[] COLLISIONS = new String[] {
            "\u9103\ufffe",
            "\u9104\uffdf",
            "\u9105\uffc0",
            "\u9106\uffa1",
            "\u9107\uff82",
            "\u9108\uff63",
            "\u9109\uff44",
            "\u910a\uff25",
            "\u910b\uff06",
            "\u910c\ufee7"
        };

        static final int COUNT = 9;

        public static void main(String[] args)
        {
            Set<String> stringSet = new UnifiedSet<>();
            Set<String> stringSet2 = new UnifiedSet<>();
            for (int i = 0; i < COUNT; i++)
            {
                stringSet.add(COLLISIONS[i]);
            }
            for (int i = 0; i < COUNT - 2; i++)
            {
                stringSet2.add(COLLISIONS[i]);
            }
            stringSet.retainAll(stringSet2);
        }
    }
Thanks for the great collections, I hope this helps!
Vincent.