hashgraph / hedera-services

Crypto, token, consensus, file, and smart contract services for the Hedera public ledger
Apache License 2.0

Compactions to clean old data which is known to be overridden in virtual node cache #9159

Open artemananiev opened 11 months ago

artemananiev commented 11 months ago

Today, MerkleDb compactions only process data that has been flushed to disk. If an entity is changed at the virtual map level but the change has not yet been flushed to the underlying data source, the previous entity value stays in MerkleDb until the next flush. This works fine in current scenarios, but once we support in-memory virtual maps (for fast-changing data like transaction receipts), it may become a problem: flushes to disk will be very rare, for example once during reconnect. Even if an object is expired, updated, or deleted in the virtual node cache, it will be preserved in MerkleDb for no good reason. It would be great if compactions could detect such garbage and clean it up.
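
To make the idea concrete, here is a rough sketch of the check a cache-aware compaction might perform. It is not based on the actual MerkleDb or virtual map code: the class name, `findGarbage`, and the plain `Map`/`Set` stand-ins for the on-disk data and the virtual node cache are all hypothetical, and the sketch ignores concurrency and flush ordering entirely.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class CacheAwareCompactionSketch {

    /**
     * Hypothetical helper: returns the keys of on-disk entries that the cache
     * already overrides (updated or deleted there, but not yet flushed).
     * A compaction could skip copying these entries to the new file.
     */
    static Set<Long> findGarbage(Map<Long, String> onDisk,
                                 Map<Long, String> cacheUpdates,
                                 Set<Long> cacheDeletes) {
        Set<Long> garbage = new TreeSet<>();
        for (Long key : onDisk.keySet()) {
            if (cacheUpdates.containsKey(key) || cacheDeletes.contains(key)) {
                garbage.add(key);
            }
        }
        return garbage;
    }

    public static void main(String[] args) {
        // Stand-in for values previously flushed to the data source.
        Map<Long, String> onDisk = new HashMap<>();
        onDisk.put(1L, "receipt-v1");
        onDisk.put(2L, "receipt-v1");
        onDisk.put(3L, "receipt-v1");

        // Stand-in for unflushed changes held in the virtual node cache.
        Map<Long, String> cacheUpdates = new HashMap<>();
        cacheUpdates.put(1L, "receipt-v2"); // key 1 updated in cache
        Set<Long> cacheDeletes = new HashSet<>();
        cacheDeletes.add(3L);               // key 3 deleted in cache

        System.out.println("Overridden on-disk keys: "
                + findGarbage(onDisk, cacheUpdates, cacheDeletes));
    }
}
```

Only key 2 has no pending change in the cache here, so it is the only on-disk entry such a compaction would still need to keep.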

artemananiev commented 11 months ago

I am slightly skeptical about this, for a few reasons: