yahoo / HaloDB

A fast, log structured key-value store.
https://yahoodevelopers.tumblr.com/post/178250134648/introducing-halodb-a-fast-embedded-key-value
Apache License 2.0

Allocation-free reads #37

Open scottcarey opened 5 years ago

scottcarey commented 5 years ago

Every read allocates a byte[] on the Java heap, even if the caller's only use for that byte[] is to deserialize it into something else.

It would be useful to be able to read the data directly without the intermediate byte[].

Perhaps with a signature similar to:

<A> A get(byte[] key, Function<DataInput, A> reader);
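A minimal sketch of how such a method might behave, assuming values sit in off-heap (direct) buffers. `ReaderStoreSketch`, its `put` method, the `String` key, and the `ByteBufferInputStream` adapter are all hypothetical stand-ins for illustration, not HaloDB's actual API or internals. It avoids the value-sized byte[] copy, although it still allocates a few small wrapper objects per read.

```java
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a store; not HaloDB's actual API or internals.
class ReaderStoreSketch {
    // Values live in direct (off-heap) buffers. Keys are Strings only to keep the sketch short.
    private final Map<String, ByteBuffer> values = new HashMap<>();

    void put(String key, byte[] value) {
        ByteBuffer buf = ByteBuffer.allocateDirect(value.length);
        buf.put(value);
        buf.flip();
        values.put(key, buf);
    }

    // Same shape as the signature proposed above, with a String key for brevity.
    <A> A get(String key, Function<DataInput, A> reader) {
        ByteBuffer stored = values.get(key);
        if (stored == null) {
            return null;
        }
        // duplicate() shares the bytes but has its own position, so no value copy is made.
        ByteBuffer view = stored.duplicate();
        return reader.apply(new DataInputStream(new ByteBufferInputStream(view)));
    }

    // Minimal InputStream adapter that reads straight from a ByteBuffer.
    private static final class ByteBufferInputStream extends InputStream {
        private final ByteBuffer buf;

        ByteBufferInputStream(ByteBuffer buf) {
            this.buf = buf;
        }

        @Override
        public int read() {
            return buf.hasRemaining() ? (buf.get() & 0xFF) : -1;
        }

        @Override
        public int read(byte[] dst, int off, int len) {
            if (len == 0) {
                return 0;
            }
            if (!buf.hasRemaining()) {
                return -1;
            }
            int n = Math.min(len, buf.remaining());
            buf.get(dst, off, n);
            return n;
        }
    }

    // Usage: decode an int directly from the stored bytes.
    static int demo() {
        ReaderStoreSketch store = new ReaderStoreSketch();
        store.put("answer", ByteBuffer.allocate(4).putInt(42).array());
        Integer result = store.get("answer", in -> {
            try {
                return in.readInt();
            } catch (IOException e) {
                // DataInput methods throw IOException, which Function cannot propagate.
                throw new UncheckedIOException(e);
            }
        });
        return result;
    }
}
```

One design wrinkle this exposes: `DataInput` methods throw the checked `IOException`, so a plain `Function<DataInput, A>` forces callers to wrap it. A dedicated functional interface that declares `throws IOException` might be more convenient.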

Access to a (native) ByteBuffer would also be useful, but the buffer would become invalid once the file it points into is garbage collected. A different data structure could 'find' the memory again if it moved due to compaction, and otherwise continue to use the old value. That might look like:

ValueHandle getValueHandle(byte[] key);

interface ValueHandle {
  boolean updated(); // if the value was updated after the handle was created
  <A> A read(Function<DataInput, A> reader); // read whatever the current value is
}
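To make the intended semantics concrete, here is a rough sketch of the handle over an in-memory map. Everything except the `ValueHandle` interface itself is an assumption: `HandleStoreSketch`, the per-key version counter, the `String` keys, and the `intBytes`/`readInt` helpers are hypothetical, and values are held as heap byte[] purely to keep the example short. The handle looks the key up again on every call, which is one simple way to 'find' a value that has moved.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

interface ValueHandle {
    boolean updated(); // if the value was updated after the handle was created
    <A> A read(Function<DataInput, A> reader); // read whatever the current value is
}

// Hypothetical in-memory store used only to illustrate the handle's behavior.
class HandleStoreSketch {
    // Immutable (bytes, version) pair; the version increases on every overwrite.
    private static final class Entry {
        final byte[] bytes;
        final long version;

        Entry(byte[] bytes, long version) {
            this.bytes = bytes;
            this.version = version;
        }
    }

    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    void put(String key, byte[] value) {
        entries.merge(key, new Entry(value, 0L),
                (old, fresh) -> new Entry(fresh.bytes, old.version + 1));
    }

    ValueHandle getValueHandle(String key) {
        Entry atCreation = entries.get(key);
        if (atCreation == null) {
            return null;
        }
        final long seenVersion = atCreation.version;
        return new ValueHandle() {
            @Override
            public boolean updated() {
                // A deleted or overwritten entry counts as updated.
                Entry now = entries.get(key);
                return now == null || now.version != seenVersion;
            }

            @Override
            public <A> A read(Function<DataInput, A> reader) {
                // Re-resolve through the index so the current location is always used.
                Entry now = entries.get(key);
                if (now == null) {
                    return null;
                }
                return reader.apply(new DataInputStream(new ByteArrayInputStream(now.bytes)));
            }
        };
    }

    // Helper: big-endian encoding, matching DataInput.readInt.
    static byte[] intBytes(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Helper: reader that rethrows the checked IOException as unchecked.
    static Integer readInt(DataInput in) {
        try {
            return in.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch a relocation of the bytes without an overwrite would not bump the version, so `updated()` would stay false while `read` still sees the value at its new location.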

The purpose of these would be to improve performance by decreasing allocations, and to allow lazy deserialization of larger data types. For example, one might want to read only part of a value initially, and lazily load the remainder.
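As an illustration of the lazy case, the sketch below decodes a small fixed header eagerly and defers the larger body until it is first requested. The record layout (`int id`, `int bodyLength`, UTF-8 body), the `LazyRecordSketch` class, and its `encode` helper are all hypothetical. It also assumes the buffer stays valid for the lifetime of the record, which is exactly what something like the ValueHandle proposed above would have to guarantee.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical record view: header is read up front, body is decoded on demand.
class LazyRecordSketch {
    private static final int HEADER_BYTES = 8; // int id + int bodyLength

    private final ByteBuffer value; // stand-in for the store's (possibly off-heap) memory
    private final int id;
    private String body; // null until first requested

    LazyRecordSketch(ByteBuffer value) {
        this.value = value.duplicate();
        // Absolute read: only the first four bytes are touched here.
        this.id = this.value.getInt(0);
    }

    int id() {
        return id;
    }

    boolean isBodyLoaded() {
        return body != null;
    }

    String body() {
        if (body == null) {
            int length = value.getInt(4);
            ByteBuffer slice = value.duplicate();
            slice.position(HEADER_BYTES);
            slice.limit(HEADER_BYTES + length);
            body = StandardCharsets.UTF_8.decode(slice).toString();
        }
        return body;
    }

    // Helper that builds a value in the hypothetical layout, in a direct buffer.
    static ByteBuffer encode(int id, String body) {
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocateDirect(HEADER_BYTES + bytes.length);
        buf.putInt(id);
        buf.putInt(bytes.length);
        buf.put(bytes);
        buf.flip();
        return buf;
    }
}
```

A caller that only filters on `id()` never pays for decoding the body, which is where the saving would come from for large values.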