Overview
Currently BlockCacheIO caches byte arrays in equally sized blocks of blockSize/minIOSeekSize each. This gives fast HashMap inserts and lookups, since the keys are just Long values (unique, with no hash conflicts), but it has the following costs:
Multiple Array[Byte] must be joined if the required data is greater than minIOSeekSize.
Multiple ConcurrentHashMap lookups are needed if the required data is greater than minIOSeekSize.
Synchronised access is needed. java.nio.channels.ScatteringByteChannel already provides an API that slices the data into smaller Array[Byte]s of minIOSeekSize each (so no in-memory slicing/copying is required), but this API still needs synchronised (blocking) access because ScatteringByteChannel mutates the position in the FileChannel.
Synchronised access can be removed by not using ScatteringByteChannel and slicing the data manually in memory, but this creates many more on-heap allocations, which costs GC time.
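
The costs listed above can be illustrated with a minimal sketch. FixedBlockCache is a hypothetical stand-in written for this issue, not the actual BlockCacheIO, and its constructor parameters and method names are assumptions. It shows the position-relative scattering read that has to be guarded by a lock, and the one-lookup-per-block join that is needed when a request spans more than one block.

```scala
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.util.concurrent.ConcurrentHashMap

// Hypothetical sketch (not the real BlockCacheIO): fixed-size blocks keyed by
// their block-aligned file offset.
final class FixedBlockCache(channel: FileChannel, blockSize: Int) {
  private val blocks = new ConcurrentHashMap[Long, Array[Byte]]()

  // Scattering read: the channel fills one buffer per block, so no manual
  // slicing is needed. The read starts at the channel's current position,
  // so setting the position and reading must happen under a lock.
  private def load(firstBlock: Long, count: Int): Unit = {
    val buffers = Array.fill(count)(ByteBuffer.allocate(blockSize))
    channel.synchronized {
      channel.position(firstBlock)
      var n = 0L
      // Stop at end of file (-1) or once the last buffer is full.
      while (n >= 0 && buffers.last.hasRemaining()) n = channel.read(buffers)
    }
    buffers.zipWithIndex.foreach { case (buffer, i) =>
      if (buffer.position() > 0) {
        val bytes = java.util.Arrays.copyOf(buffer.array(), buffer.position())
        blocks.put(firstBlock + i.toLong * blockSize, bytes)
      }
    }
  }

  def get(offset: Long, size: Int): Array[Byte] = {
    val first = offset / blockSize * blockSize
    val count = ((offset + size - first + blockSize - 1) / blockSize).toInt
    val keys = (0 until count).map(i => first + i.toLong * blockSize)
    // For simplicity the whole range is reloaded if any block is missing.
    if (!keys.forall(k => blocks.containsKey(k))) load(first, count)
    // One map lookup per block, then a copy to join the pieces.
    val joined = keys.flatMap(k => Option(blocks.get(k)).getOrElse(Array.emptyByteArray)).toArray
    val from = (offset - first).toInt
    joined.slice(from, from + size)
  }
}
```

With a block size of 4, a call such as get(3, 4) touches the blocks at offsets 0 and 4, so it performs two lookups and one join before slicing out the requested bytes.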
Task
Another implementation, BlockCacheSkipListIO, based on ConcurrentSkipList is required to allow caching arbitrarily sized data. It will not need synchronised access and will make caching simpler. Compared to ConcurrentHashMap, inserts and lookups will be slower, but in some cases this might not be noticeable because the multiple lookups needed to join multiple byte arrays (in the ConcurrentHashMap case) might not be required.
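
One way the skip-list variant could look is sketched below. It assumes java.util.concurrent.ConcurrentSkipListMap as the backing structure; the class name SkipListBlockCache and its methods are hypothetical and only illustrate the idea. Entries are keyed by their start offset and may have any length, so a single floorEntry lookup can serve a request that would span several fixed-size blocks.

```scala
import java.util.concurrent.ConcurrentSkipListMap

// Hypothetical sketch of a BlockCacheSkipListIO-style cache: each entry is
// keyed by its start offset and may hold any number of bytes.
final class SkipListBlockCache {
  private val entries = new ConcurrentSkipListMap[Long, Array[Byte]]()

  def put(offset: Long, bytes: Array[Byte]): Unit = {
    entries.put(offset, bytes)
    ()
  }

  // Finds the entry that starts at or before `offset` and returns the
  // requested range only if that single entry fully covers it.
  def get(offset: Long, size: Int): Option[Array[Byte]] =
    Option(entries.floorEntry(offset)).flatMap { entry =>
      val from = offset - entry.getKey()
      val bytes = entry.getValue()
      if (from + size <= bytes.length) Some(bytes.slice(from.toInt, from.toInt + size))
      else None
    }
}
```

This sketch does not merge overlapping or adjacent entries; a request that is only partly covered is treated as a miss, which a real implementation would probably want to handle differently.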
Configuration
The caching strategy should be configurable, so that either one of the two can be selected.
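
As a rough illustration of what such a setting might look like, the sketch below models the choice as a sealed type. BlockCacheStrategy, FixedSizeBlocks, VariableSizeBlocks and backingMap are all hypothetical names, not existing API.

```scala
import java.util.concurrent.{ConcurrentHashMap, ConcurrentMap, ConcurrentSkipListMap}

// Hypothetical configuration type; the names are illustrative only.
sealed trait BlockCacheStrategy

object BlockCacheStrategy {
  // Fixed-size blocks backed by a hash map (the current behaviour).
  final case class FixedSizeBlocks(blockSize: Int) extends BlockCacheStrategy
  // Arbitrarily sized entries backed by a skip list (the proposed behaviour).
  case object VariableSizeBlocks extends BlockCacheStrategy

  // Picks the backing map for the configured strategy.
  def backingMap(strategy: BlockCacheStrategy): ConcurrentMap[Long, Array[Byte]] =
    strategy match {
      case FixedSizeBlocks(_) => new ConcurrentHashMap[Long, Array[Byte]]()
      case VariableSizeBlocks => new ConcurrentSkipListMap[Long, Array[Byte]]()
    }
}
```

Modelling the option as a sealed type lets the compiler check that every strategy is handled wherever the choice is matched on.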