Open artemananiev opened 11 months ago
I am slightly skeptical about this, for a few reasons:
- Today, the virtual map and the data source are two separate layers, well isolated from each other, with a clean API between them. Implementing what's described in this ticket may make that separation less strict.
- For purely in-memory virtual maps, it would make more sense to ensure nothing is ever flushed to disk, rather than to flush and then clean up.
- For regular virtual maps, which are not updated very often (so flushes are rare), the data files on disk shouldn't grow very large. Cleaning up some extra data wouldn't save much disk space or other resources.
Today, MerkleDb compactions only process data that has been flushed to disk. If an entity is changed at the virtual map level, but the change hasn't been flushed to the underlying data source yet, the previous entity value is kept in MerkleDb until the next flush. This works fine in current scenarios, but once we support in-memory virtual maps (for fast-changing data like transaction receipts), it may become a problem: flushes to disk will be very rare, e.g. only during reconnect. Even if an object is expired, updated, or deleted in the virtual node cache, it will be preserved in MerkleDb for no good reason. It would be great if compactions could detect such garbage and clean it up.
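To make the idea concrete, here is a minimal sketch of what a cache-aware compaction check could look like. All names here (`CompactionSketch`, `staleDiskKeys`, `TOMBSTONE`, the version maps) are hypothetical and are not actual MerkleDb or virtual map APIs; the sketch only illustrates the principle that a compactor could consult the in-memory node cache to identify on-disk records that are already superseded.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch, not real MerkleDb code: detect on-disk records
 * that are already superseded by newer (or deleted) versions held in an
 * in-memory virtual node cache, so a compactor could drop them instead
 * of copying them into a new data file.
 */
public class CompactionSketch {

    /** Illustrative marker: the cache records a deletion for this key. */
    static final long TOMBSTONE = -1L;

    /**
     * Returns the keys whose on-disk copy is stale: the cache holds
     * either a newer version of the entity or a deletion marker for it.
     */
    static Set<String> staleDiskKeys(Map<String, Long> diskVersions,
                                     Map<String, Long> cacheVersions) {
        Set<String> stale = new HashSet<>();
        for (Map.Entry<String, Long> e : diskVersions.entrySet()) {
            Long cached = cacheVersions.get(e.getKey());
            if (cached != null && (cached == TOMBSTONE || cached > e.getValue())) {
                stale.add(e.getKey());
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        // On-disk records from the last flush: key -> version.
        Map<String, Long> disk = new HashMap<>();
        disk.put("receipt-1", 10L); // later updated in the cache -> stale
        disk.put("receipt-2", 12L); // untouched since the flush  -> keep
        disk.put("receipt-3", 9L);  // deleted in the cache       -> stale

        // Unflushed state in the in-memory node cache.
        Map<String, Long> cache = new HashMap<>();
        cache.put("receipt-1", 15L);
        cache.put("receipt-3", TOMBSTONE);

        // Prints the stale keys (set iteration order is unspecified).
        System.out.println(staleDiskKeys(disk, cache));
    }
}
```

The trade-off this sketch makes visible is exactly the layering concern above: for the compactor to drop such records, the data source layer would need read access to virtual-map-level cache state, which blurs the current clean separation between the two layers.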