Open teor2345 opened 1 year ago
Is this ticket available for work? @teor2345 @mpguerra
Yes, it is
Awesome. I just created a PR.
Motivation
Some Zebra users are concerned about the size of the on-disk database, particularly miners (#5718). Others are concerned about memory usage. Zebra developers also need to monitor database and column family sizes as part of state upgrades.
It would be useful to print the total database size, and the size of each column family, on disk and in memory.
We could print it at startup and shutdown.
Specifications
There are RocksDB APIs for each column family: https://docs.rs/rocksdb/latest/rocksdb/struct.DBCommon.html#method.property_int_value_cf
We can get live and total disk size using these properties: https://docs.rs/rocksdb/latest/rocksdb/properties/constant.ESTIMATE_LIVE_DATA_SIZE.html https://docs.rs/rocksdb/latest/rocksdb/properties/constant.TOTAL_SST_FILES_SIZE.html
And memory size (why not?) using this property: https://docs.rs/rocksdb/latest/rocksdb/properties/constant.SIZE_ALL_MEM_TABLES.html
Complex Code or Requirements
To get the total size, we need to iterate through each column family, including the default column family, then add the values.
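The summation could be sketched roughly as below. This is only an illustration under stated assumptions: `CfSizes`, `collect_sizes`, the `read_property` closure, and the `hypothetical_cf_*` names are made up for the example and are not Zebra code. In a real implementation the closure would presumably wrap the `property_int_value_cf` call linked above; it is left abstract here so the sketch runs without a database. The three strings are the RocksDB property names behind the constants linked above.

```rust
use std::collections::BTreeMap;

// RocksDB property names for the three sizes discussed in this ticket.
const LIVE_DATA: &str = "rocksdb.estimate-live-data-size";
const TOTAL_SST: &str = "rocksdb.total-sst-files-size";
const MEM_TABLES: &str = "rocksdb.size-all-mem-tables";

/// Sizes in bytes for one column family, or the sum over all of them.
/// Hypothetical type, introduced only for this sketch.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
struct CfSizes {
    live_disk: u64,
    total_disk: u64,
    memory: u64,
}

/// Reads the three properties for every column family in `cf_names`
/// (which should include "default") and returns the per-family sizes
/// plus their total.
///
/// `read_property(cf_name, property_name)` is a hypothetical hook that
/// stands in for the database call. A missing value (`None`) counts as 0.
fn collect_sizes<F>(
    cf_names: &[&str],
    mut read_property: F,
) -> (BTreeMap<String, CfSizes>, CfSizes)
where
    F: FnMut(&str, &str) -> Option<u64>,
{
    let mut per_cf = BTreeMap::new();
    let mut total = CfSizes::default();

    for &name in cf_names {
        let sizes = CfSizes {
            live_disk: read_property(name, LIVE_DATA).unwrap_or(0),
            total_disk: read_property(name, TOTAL_SST).unwrap_or(0),
            memory: read_property(name, MEM_TABLES).unwrap_or(0),
        };

        // Saturating adds avoid a panic on overflow in debug builds.
        total.live_disk = total.live_disk.saturating_add(sizes.live_disk);
        total.total_disk = total.total_disk.saturating_add(sizes.total_disk);
        total.memory = total.memory.saturating_add(sizes.memory);

        per_cf.insert(name.to_string(), sizes);
    }

    (per_cf, total)
}

fn main() {
    // Made-up numbers standing in for real property reads.
    let cf_names = ["default", "hypothetical_cf_a", "hypothetical_cf_b"];
    let (per_cf, total) = collect_sizes(&cf_names, |cf, property| match (cf, property) {
        ("hypothetical_cf_a", LIVE_DATA) => Some(100),
        ("hypothetical_cf_a", TOTAL_SST) => Some(150),
        ("hypothetical_cf_b", LIVE_DATA) => Some(40),
        ("hypothetical_cf_b", TOTAL_SST) => Some(60),
        (_, MEM_TABLES) => Some(8),
        _ => None,
    });

    for (name, sizes) in &per_cf {
        println!("{name}: {sizes:?}");
    }
    println!("total: {total:?}");
}
```

Keeping the per-family values alongside the total means the same pass can feed both the per-column-family log lines and the overall figure.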
Testing
Manually compare the total with the size on disk using `du`, and the size in memory using `top`.
RocksDB uses extra files for old data and deleted data, so the disk sizes RocksDB reports should be smaller than the `du` total. Live disk should also be smaller than total disk.
Zebra uses memory outside RocksDB, so the RocksDB memory usage should be smaller.
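If a scripted cross-check is preferred over running `du` by hand, a small helper along these lines could sum the file sizes under the database directory. This is a sketch, not part of Zebra: `dir_size` is a hypothetical helper, and it adds up apparent file lengths, whereas `du` reports allocated disk blocks by default, so the two numbers can differ slightly.

```rust
use std::{fs, io, path::Path};

/// Recursively sums the apparent size in bytes of every regular file
/// under `dir`. Symlinks are not followed and are not counted.
fn dir_size(dir: &Path) -> io::Result<u64> {
    let mut total = 0u64;
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // `DirEntry::metadata` does not traverse symlinks.
        let metadata = entry.metadata()?;
        if metadata.is_dir() {
            total += dir_size(&entry.path())?;
        } else if metadata.is_file() {
            total += metadata.len();
        }
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // The directory to measure is taken from the first argument;
    // it defaults to the current directory for illustration.
    let path = std::env::args().nth(1).unwrap_or_else(|| ".".to_string());
    println!("{} bytes under {}", dir_size(Path::new(&path))?, path);
    Ok(())
}
```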
Related Work
We might also want to print the memory or disk usage regularly, but that's out of scope for this ticket. Memory usage can vary a lot depending on what operations Zebra is doing.