Open yihuang opened 1 year ago
Deprioritizing this one to avoid premature optimization; uncompressed nodes have a fixed size, which has the advantage of simplicity.
Now that the simpler design is about to be released, a more sophisticated version can be considered (if we really want to go down this rabbit hole) ;D
It seems filesystem-level compression works well with memiavl, so we probably don't need to worry about this issue at all.
$ sudo compsize -x /chain/.chain-maind/data/memiavl.db
Processed 865 files, 320027 regular extents (320027 refs), 416 inline.
Type       Perc     Disk Usage   Uncompressed Referenced
TOTAL       41%       23G          56G          56G
none       100%       17G          17G          17G
zstd        14%      5.8G          38G          38G
Do Not Worry About.
On Tue, 11 July 2023, 7:22 pm yihuang, @.***> wrote:
it seems filesystem level compression works with mmap, if it works well, we probably don't need to worry about this issue at all.
Currently, for simplicity, the snapshot format is plain data without any compression. Compression is important to reduce the size.
- nodes: each node is just a bunch of integers together with a 32-byte hash, so there are lots of zero bytes to compress. We can compress each node independently and add a 1-byte length prefix; nodes are referenced by file offset. Candidates:
  - RLE
  - Cap'n Proto packing schema
- keys: a bunch of short, ordered byte strings that are accessed frequently. Delta encoding should be efficient here, but then we need to organize the data in small fixed-size chunks and support looking up a key by index rather than by uncompressed file offset.
- values: unordered and accessed less frequently. We can apply some generic random-access compression like the zstd seekable format, which still supports lookup by uncompressed file offset.
- (node.key, node.version): if versiondb is integrated closely with the IAVL tree.
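To make the nodes and keys ideas above more concrete, here is a minimal sketch in Python. It is not the memiavl format. The node layout (four little-endian u32 fields plus a 32-byte hash), the names `pack_words`, `append_node`, `read_node`, `encode_keys`, `key_at`, and the chunk size `CHUNK_KEYS` are all assumptions made for illustration. The zero-byte packing is a simplified variant of Cap'n Proto-style packing (one tag byte per 8-byte word, without the special run-length cases), and the key chunks hold a fixed number of keys rather than a fixed number of bytes.

```python
import struct

WORD = 8        # packing works on 8-byte words
CHUNK_KEYS = 4  # hypothetical: keys per chunk; the first key of a chunk is stored in full


def pack_words(data: bytes) -> bytes:
    """Per 8-byte word: a tag byte (bit i set if byte i is non-zero), then the non-zero bytes."""
    assert len(data) % WORD == 0
    out = bytearray()
    for i in range(0, len(data), WORD):
        word = data[i:i + WORD]
        tag = 0
        for bit, b in enumerate(word):
            if b:
                tag |= 1 << bit
        out.append(tag)
        out += bytes(b for b in word if b)
    return bytes(out)


def unpack_words(packed: bytes) -> bytes:
    """Inverse of pack_words: re-insert the zero bytes indicated by each tag."""
    out = bytearray()
    pos = 0
    while pos < len(packed):
        tag = packed[pos]
        pos += 1
        for bit in range(WORD):
            if tag & (1 << bit):
                out.append(packed[pos])
                pos += 1
            else:
                out.append(0)
    return bytes(out)


def append_node(buf: bytearray, node: bytes) -> int:
    """Append a packed node with a 1-byte length prefix; return its file offset."""
    packed = pack_words(node)
    assert len(packed) < 256  # must fit in the 1-byte length prefix
    offset = len(buf)
    buf.append(len(packed))
    buf += packed
    return offset


def read_node(buf: bytes, offset: int) -> bytes:
    """Read the node whose length prefix sits at the given offset."""
    n = buf[offset]
    return unpack_words(buf[offset + 1:offset + 1 + n])


def encode_keys(keys):
    """Prefix-delta encode sorted keys in chunks; return (data, chunk_offsets)."""
    data, offsets, prev = bytearray(), [], b""
    for i, key in enumerate(keys):
        assert len(key) < 256
        if i % CHUNK_KEYS == 0:
            offsets.append(len(data))
            prev = b""  # restart the delta chain at each chunk boundary
        shared = 0
        while shared < min(len(prev), len(key)) and prev[shared] == key[shared]:
            shared += 1
        suffix = key[shared:]
        data += bytes([shared, len(suffix)]) + suffix
        prev = key
    return bytes(data), offsets


def key_at(data: bytes, offsets, index: int) -> bytes:
    """Look up a key by index: jump to its chunk, then replay at most CHUNK_KEYS deltas."""
    pos = offsets[index // CHUNK_KEYS]
    key = b""
    for _ in range(index % CHUNK_KEYS + 1):
        shared, n = data[pos], data[pos + 1]
        key = key[:shared] + data[pos + 2:pos + 2 + n]
        pos += 2 + n
    return key


if __name__ == "__main__":
    # Hypothetical node: four small integers plus a 32-byte hash (48 bytes raw).
    node = struct.pack("<IIII", 3, 120, 7, 42) + bytes(range(1, 33))
    buf = bytearray()
    offset = append_node(buf, node)
    print(len(node), len(buf), read_node(bytes(buf), offset) == node)

    keys = [b"acct/0001", b"acct/0002", b"acct/0010", b"bank/0001", b"bank/0002"]
    data, offsets = encode_keys(keys)
    print(key_at(data, offsets, 2), len(data))
```

In this sketch only the integer fields shrink, since hash bytes are effectively random; that matches the observation that the savings come from zero bytes. Restarting the delta chain at every chunk bounds the cost of a lookup by index to one chunk, at the price of storing one full key per chunk.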