Performance optimizations that bring the overall time of ./gradlew perfTests from 1.5 seconds down to 700 milliseconds with 2 million records and 10 files.
Summary:
Since StreamIntersectMerger (to be renamed IdIndexStreamMerger or similar) is only used for longs, removed generics and the use of Comparator in favor of direct comparisons
Shared a byte buffer in DualBufferBinaryRecordReader
Removed use of ArrayList in favor of a plain array in StreamIntersectMerger
Cached the result of the modulo (%) operator in StreamIntersectMerger instead of recomputing it in the inner loop
Removed use of Optional in StreamIntersectMerger
Performed our own bitwise operations instead of using ByteBuffer in the serializers
Added an offset parameter to the deserializer so we can operate more efficiently on byte arrays containing tuples
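
For reviewers, here is a minimal sketch of what the last two items could look like. The class name LongCodec and both method names are hypothetical and not taken from this change. It assumes fixed-width 8-byte big-endian longs (the same layout ByteBuffer uses by default). Reading at an offset avoids copying each tuple field into its own array first.

```java
// Hypothetical helper, for illustration only; not the serializer in this PR.
final class LongCodec {
    private LongCodec() {}

    /** Writes value as 8 big-endian bytes starting at buf[offset]. */
    static void writeLong(byte[] buf, int offset, long value) {
        for (int i = 7; i >= 0; i--) {
            buf[offset + i] = (byte) value; // keep the lowest 8 bits
            value >>>= 8;                   // move the next byte into place
        }
    }

    /** Reads 8 big-endian bytes starting at buf[offset] as a long. */
    static long readLong(byte[] buf, int offset) {
        long result = 0;
        for (int i = 0; i < 8; i++) {
            // Mask with 0xFFL so a negative byte is not sign-extended.
            result = (result << 8) | (buf[offset + i] & 0xFFL);
        }
        return result;
    }
}
```

With a helper like this, a tuple of two longs packed into one byte array can be read with readLong(buf, 0) and readLong(buf, 8), with no intermediate ByteBuffer or array copy.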
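
The StreamIntersectMerger items can be illustrated with a rough sketch. This is not the class from this change: the name LongIntersect is hypothetical, and it assumes the merger intersects sorted, duplicate-free id streams, modeled here as long[] arrays. It compares primitive longs directly (no generics or Comparator), uses plain arrays instead of ArrayList, signals exhaustion with a length check instead of Optional, and computes the wrap-around index with % once per outer step rather than inside the inner scan.

```java
import java.util.Arrays;

// Hypothetical sketch, for illustration only; not the merger in this PR.
final class LongIntersect {
    private LongIntersect() {}

    /** Returns the values present in every sorted, duplicate-free input array. */
    static long[] intersect(long[][] sorted) {
        final int n = sorted.length;
        if (n == 0 || sorted[0].length == 0) return new long[0];
        int[] pos = new int[n];                  // read cursor per stream
        long[] out = new long[sorted[0].length]; // result cannot exceed stream 0
        int count = 0;
        long candidate = sorted[0][0];
        int agreeing = 0;                        // streams currently at candidate
        int i = 0;
        while (true) {
            final int next = (i + 1) % n;        // computed once, outside the scan
            final long[] s = sorted[i];
            int p = pos[i];
            while (p < s.length && s[p] < candidate) p++; // primitive comparison
            if (p == s.length) break;            // stream exhausted: no more matches
            if (s[p] == candidate) {
                if (++agreeing == n) {           // every stream holds candidate
                    out[count++] = candidate;
                    if (++p == s.length) break;
                    candidate = s[p];
                    agreeing = 0;
                    pos[i] = p;
                    continue;                    // revisit this stream first
                }
            } else {                             // s[p] > candidate: raise the bar
                candidate = s[p];
                agreeing = 1;
            }
            pos[i] = p;
            i = next;
        }
        return Arrays.copyOf(out, count);
    }
}
```

For example, intersecting {1, 3, 5, 7, 9}, {3, 4, 5, 9, 10} and {0, 3, 9} yields {3, 9}.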