Closed dweiss closed 4 years ago
Comment by Bruno Roustant (@bruno-roustant) (migrated from JIRA)
Sample code https://github.com/carrotsearch/hppc/pull/10
Comment by Bruno Roustant (@bruno-roustant) (migrated from JIRA)
Another option is to have this (small) slot increment generated randomly per instance. That way it does not impact hash performance but is more random than a fixed increment.
The goal of KTypeHashSet.keyMixer is to prevent collision avalanches as described in https://issues.carrot2.org/browse/HPPC-80 and https://issues.carrot2.org/browse/HPPC-103
It comes at the cost of an out-of-the-box performance penalty. However the Scatter version does not have this mixer nor penalty. But it means doubling the number of generated classes to have both standard and scatter versions. With a potential addition of Worm hashing, it again doubles the number of classes for Worm with a mixed and scatter version.
I (re)open the discussion to see if the keyMixer version approach could be simply replaced by a different way to iterate the keys. It would reduce significantly the size of the hppc jar. A first approach was taken in https://github.com/vsonnier/hppc/issues/18 but I tried a variant with a simple extension of toArray() to iterate keys with a small prime slot increment (e.g. 29 or 7) masked on the array length which provides nice perf on the CollisionAvalanche benchmark (Scatter version faster than standard one). It could be applied to the toArray(), iterator(), forEach(), addAll() methods.
Are there some other cases that I missed where the keyMixer makes batch addition safe, that the protected iteration order cannot handle?
Issue: HPPC-186 (migrated from JIRA), created by Bruno Roustant (@bruno-roustant), resolved Aug 21 2020 Linked issues:
146