dragonflydb / dragonfly

A modern replacement for Redis and Memcached
https://www.dragonflydb.io/

extend snapshot format with the vector record #486

Closed. romange closed this issue 1 year ago.

romange commented 1 year ago

Currently the RDB format uses pretty naive compression with its lzf* code. This wastes too much CPU and achieves little: when we save a snapshot with "SAVE DF" to a local SSD, we are bottlenecked on CPU, not on I/O.

We should introduce RDB_TYPE_VEC, an uber record that "hosts" n records. Those n records will be serialized using the same compression context into a single blob. In fact, two blobs: one for the concatenated keys and one for the concatenated values.

The record should answer the following requirements:

  1. Its compressed blobs should be no bigger than kMaxCompress bytes, where kMaxCompress is a manageable constant like 65KB.
  2. Its uncompressed working buffer should have sane limits, i.e. be no bigger than kMaxUncompressBytes. See https://github.com/romange/gaia/blob/master/util/coding/block_compressor.h#L10 for an example of how we limit decompression resources. The point is that decompression algorithms back-reference the already decompressed buffer, so we should limit the reference window size during compression to keep those buffers from growing too large (see the sketch after this list).
  3. The compressed blob should be P percent and k bytes smaller than the total size of the n uncompressed records. If it is not, we should discard the compressed blob and store all records as-is, without compression, similarly to how it's done here: https://github.com/dragonflydb/dragonfly/blob/main/src/server/rdb_save.cc#L648
  4. We should use the zstd compression algorithm, as it is the most balanced compression algorithm available today.
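
To make points 1-3 concrete, here is a minimal sketch of the per-blob compression step. kWindowLog, kMinRatioPct (P) and kMinSavedBytes (k) are hypothetical constants chosen for illustration, not existing Dragonfly code:

```cpp
#include <zstd.h>

#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

constexpr size_t kMaxCompress = 65536;  // requirement 1: cap on a compressed blob
constexpr int kWindowLog = 16;          // requirement 2: 64KB reference window
constexpr size_t kMinRatioPct = 90;     // requirement 3: must shrink to <= 90% (P)
constexpr size_t kMinSavedBytes = 64;   // requirement 3: ...and save >= 64 bytes (k)

// Returns the compressed blob, or nullopt when compression does not pay off
// and the caller should store the records uncompressed. The same ZSTD_CCtx is
// reused across blobs, matching the "same compression context" idea above.
std::optional<std::string> CompressBlob(ZSTD_CCtx* cctx, std::string_view src) {
  std::string dst(ZSTD_compressBound(src.size()), '\0');
  ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, 3);
  ZSTD_CCtx_setParameter(cctx, ZSTD_c_windowLog, kWindowLog);
  size_t n = ZSTD_compress2(cctx, dst.data(), dst.size(), src.data(), src.size());
  if (ZSTD_isError(n) || n > kMaxCompress) return std::nullopt;
  // Fall back to raw storage unless we save both P percent and k bytes.
  if (n * 100 > src.size() * kMinRatioPct || src.size() - n < kMinSavedBytes)
    return std::nullopt;
  dst.resize(n);
  return dst;
}
```

Capping ZSTD_c_windowLog bounds the back-reference distance on both sides: the compressor never emits a match farther back than 2^kWindowLog bytes, so the decompressor's working buffer stays correspondingly small.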

We should start with a design document explaining what an RDB_TYPE_VEC record looks like in serialized form, so that it is possible to deserialize it back. We should compress keys and values separately because each has different biases. We should also write the lengths of the keys and values of all the records so that we can restore them from the uncompressed blobs.
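
For illustration only, the serialized form could look roughly like this; the exact encoding is what the design document should pin down:

```
RDB_TYPE_VEC                    ; opcode
n                               ; number of records, length-encoded
type[0] .. type[n-1]            ; record type of each entry (all strings at first)
key_len[0] .. key_len[n-1]      ; uncompressed key lengths, to split the key blob
val_len[0] .. val_len[n-1]      ; uncompressed value lengths, to split the value blob
key_blob_size, key_blob         ; zstd-compressed concatenation of the n keys
val_blob_size, val_blob         ; zstd-compressed concatenation of the n values
```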

For simplicity, we can assume that we use RDB_TYPE_VEC only for gathering string types, though we should also serialize record types so that we can easily extend this to other types later.
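
Once both blobs are decompressed, restoring the individual entries is just a walk over the stored lengths. A rough sketch, with all names hypothetical:

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

struct Entry {
  std::string_view key, value;
};

// Splits the two decompressed blobs back into n (key, value) pairs using the
// per-record lengths stored in the RDB_TYPE_VEC header.
std::vector<Entry> SplitBlobs(std::string_view key_blob, std::string_view val_blob,
                              const std::vector<size_t>& key_lens,
                              const std::vector<size_t>& val_lens) {
  std::vector<Entry> out;
  out.reserve(key_lens.size());
  size_t koff = 0, voff = 0;
  for (size_t i = 0; i < key_lens.size(); ++i) {
    out.push_back({key_blob.substr(koff, key_lens[i]), val_blob.substr(voff, val_lens[i])});
    koff += key_lens[i];
    voff += val_lens[i];
  }
  return out;
}
```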

romange commented 1 year ago

Another constraint: we should only gather small key/value entries into this vector record. If an entry is large (say, bigger than 1KB), it can be compressed separately; it has enough meat on its own. This brings us to another point: single records should also get a different compression algorithm, but that can be a separate issue.
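
As a sketch of that gathering policy (kBigValueThreshold is a hypothetical constant, not an existing setting):

```cpp
#include <cstddef>
#include <string_view>

constexpr size_t kBigValueThreshold = 1024;  // 1KB, per the comment above

// Small entries are batched into the pending RDB_TYPE_VEC record; larger ones
// have enough meat to be compressed individually on the existing path.
bool ShouldBatchIntoVec(std::string_view key, std::string_view value) {
  return key.size() + value.size() <= kBigValueThreshold;
}
```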