couchbase / moss

moss - a simple, fast, ordered, persistable, key-val storage library for golang
Other
959 stars 58 forks source link

partial compaction of index/kvs only. #41

Open steveyen opened 7 years ago

steveyen commented 7 years ago

In the segment.kvs array, there's 16^H^H 8 bits that are currently unused / RESERVED for the future, out of the 128 bits for each entry... https://github.com/couchbase/moss/blob/master/segment.go#L149

Maybe we can use those 16^H^H 8 bits somehow. One thought is as moss is appending dirty segments to the end of the file, perhaps it might opportunistically sometimes decide to compact some of the kvs arrays of the segment stack into a big kvs array. That is, leave all the old, already persisted buf's as-is/untouched, but the latest "compacted kvs" would use (some of?) the 16 bits to refer to the old buf that each entry points to.

The thinking is instead of performing binary search on the multiple kvs arrays down through the segment stack, there'd be only a single kvs array to binary search.

(Update: 16 -> 8 bits (thanks marty!); also if 8 bits ain't enough, then it's possible that keys and vals at runtime might not be near the max-keyLen and max-valLen limits, so that might be another place to get bits; and another place might be the offset into the buf)

steveyen commented 7 years ago

A related thought is if that if the max key-len in a segment is less than 24-bits, then perhaps those bits can also be used for other purposes, like the above idea to reference a buf.

mschoch commented 7 years ago

Before you go spending bits we don't have, my understanding is that we only have 8 bits left over.

64 - data offset 4 - opteration type 24 - key len 28 - val len 8 - reserved

Also those 8 reserved bits are not contiguous, 4 are near the operation and 4 are near the value len.

steveyen commented 7 years ago

hehe, indeed! that's why i shouldn't be in accounting :-)

hisundar commented 7 years ago

The kvs/index arrays can only be merged when there are no key space overlaps right?

steveyen commented 7 years ago

The kvs/index arrays can only be merged when there are no key space overlaps right?

Hi, unless I'm too pre-coffee about this, I'm thinking they can be merged, even when there are overlaps. That is, a kvs entry currently has info like this...

op, bufOffset, keyLen, valLen

And the proposal is to add one more thing, like...

op, bufIdx, bufOffset, keyLen, valLen

The bufIdx would tell us which buf to use.

That way, at read time, there's less binary searching... that is, currently, moss will binary search through multiple kvs arrays as it moves down through the segment stack. Instead, it would just binary search through a single kvs array (which refers to multiple buf's).

One wrinkle (related to my lack ability to account correctly) is w.r.t. how many bits we can afford to give to a bufIdx field.