Open plevart opened 3 years ago
Hi, @plevart
Glad to receive your benchmark for FastByteComparison
, which is very detailed! Yes, the extra allocation in the current implementation is seemingly creating an overhead when sorting the byte array. I think the asReadOnlyByteBuffer()
is the alternative solution to this problem. Would you like to submit a Pull Request for this issue?
Hi, While making changes to allow compilation / running with JDK9+, I noticed there is a utility class
FastByteComparisons
with static methods to compare twobyte[]
arrays or segments of arrays. It attempts to useUnsafe
to speed-up comparison by retrieving and comparinglong
(i.e. 8 byte) values at a time. IfUnsafe
is not available it falls back to a slower pure-java version which retrieves and compares individualbyte
values. This is all nice for comparingbyte[]
arrays when arrays are already the values kept by application. But the tikv java client also operates with protobufByteString
instances and needs to compare such instances among themselves. In code this usually looks like this:The drawback of this approach is that
ByteString.toByteArray()
creates a copy of theByteSrting
data into a newly allocatedbyte[]
array each time this method is called. This means that fasterFastByteComparisons.compareTo
is more than offset with the overhead of allocation/GC churn and memory copying. There are at least two alternative ways to compareByteString
instances lexicographically:The 1st alternative uses the
ByteString
underlyingbyte[]
array wrapped with a read-onlyHeapByteBuffer
which implementsComparable<ByteBuffer>
. This performs semantically the same comparison without copying and allocation of newbyte[]
arrays. JDK JIT is smart enough to see that the resultingHeapByteBuffers
do not escape the in-lined method and eliminates even the allocation of theHeapByteBuffer
instances, resulting in allocation-free comparisons.The 2nd alternative uses the
ByteString
's internalComparator
implementation which also performs allocation-free comparison with same semantics.I created a JMH benchmark to compare the 3 methods here:
https://gist.github.com/plevart/0e7c4f7b7df6d1de8f54df735ca8bc2d
The results are as follows (when running with JDK8):
https://jmh.morethan.io/?source=https://gist.githubusercontent.com/plevart/08c006ff4ecb22afa0982c4a5cd6f7ab/raw/06093d0a6d7fc578c5d8c90a15bef024127535ba/jmh-result-jdk8.json
And as follows (when running with JDK11):
https://jmh.morethan.io/?source=https://gist.githubusercontent.com/plevart/fb1c21a69a01ab7de3a1e5fb1f1d5f40/raw/1d763b5eaa4209618e802d5d7fe6cf0856bd2083/jmh-result-jdk11.json
As can be seen, the
unsignedLexicographicalComparator
variant does not produce garbage, but is also very inefficient when comparing ByteString(s) with large common prefix as it uses anIterator
-based iteration over individual bytes. But theasReadOnlyByteBuffer
variant looks very promissing. It does not produce garbage and is independent of the size of the ByteString suffix that differs. When comparing ByteStrings with long common prefix, theFastByteComparisons
withtoByteArray
still has an edge, but only when running on JDK 8. When running on JDK 9+ (tested with JDK 11), the ByteBuffer.compareTo() was optimized and uses special CPU support so theasReadOnlyByteBuffer
beats theFastByteComparisons
/toByteArray
in every situation by a factor of up to 30x.So WDYT?