attestantio / dirk

Apache License 2.0

Pruning slashing protection storage #13

Closed freshfab closed 3 years ago

freshfab commented 3 years ago

Hey! I'm wondering how to prune the slashing protection database, since dirk doesn't provide any option for it.

First, is there a reason for not having such an option? If not, up to which point is it safe to prune the database?

Thanks!

mcdee commented 3 years ago

The internal data does not grow, as it contains only the information about the latest attestations and block proposals. However, the on-disk format can grow over time. I will look into how best to provide a database maintenance process (either internally as a dirk command or externally as a procedure) and update the repository accordingly.
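To illustrate how the on-disk size can grow while the logical data stays constant, here is a minimal Go sketch of an append-only store. It is not dirk's actual storage code: the `store` type, its methods, and the key names are all hypothetical, and the log lives in memory purely for demonstration. Each write appends a record, only the newest record per key is live, and compaction drops everything else.

```go
package main

import "fmt"

// record is one entry in a hypothetical append-only log.
type record struct {
	key   string // illustrative identifier, e.g. a validator key
	epoch uint64 // latest signed epoch recorded for that key
}

// store keeps every write in an append-only log, plus an index
// pointing at the newest record for each key.
type store struct {
	log    []record
	latest map[string]int // key -> index in log of the newest record
}

func newStore() *store {
	return &store{latest: make(map[string]int)}
}

// put appends a record. Older records for the same key become
// garbage, but they still occupy space in the log.
func (s *store) put(key string, epoch uint64) {
	s.log = append(s.log, record{key: key, epoch: epoch})
	s.latest[key] = len(s.log) - 1
}

// get returns the newest epoch stored for key.
func (s *store) get(key string) (uint64, bool) {
	i, ok := s.latest[key]
	if !ok {
		return 0, false
	}
	return s.log[i].epoch, true
}

// compact rewrites the log so it holds only the newest record per
// key, and returns the number of records that were dropped.
func (s *store) compact() int {
	before := len(s.log)
	fresh := make([]record, 0, len(s.latest))
	for i, r := range s.log {
		if s.latest[r.key] == i {
			s.latest[r.key] = len(fresh)
			fresh = append(fresh, r)
		}
	}
	s.log = fresh
	return before - len(fresh)
}

func main() {
	s := newStore()
	for epoch := uint64(1); epoch <= 1000; epoch++ {
		s.put("validator-a", epoch)
		s.put("validator-b", epoch)
	}
	fmt.Println("live keys:", len(s.latest), "log records:", len(s.log))
	fmt.Println("dropped by compaction:", s.compact(), "log records:", len(s.log))
}
```

In this toy model the number of live keys never changes, yet the log gains two records per epoch until `compact` is called. That is the general shape of the problem described above.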

mcdee commented 3 years ago

Turns out that this can be carried out in-process. #14 addresses this issue.

freshfab commented 3 years ago

Thanks for taking care of it so fast! So, just to clarify: GC is only done on startup? Any chance you could add a --gc hh:mm:ss flag, so GC can be run periodically without restarts?

mcdee commented 3 years ago

We did consider running a periodic garbage collection; however, a couple of things weighed against it. First, the database doesn't grow very quickly, so even with a few GB of storage it will happily run for months. Realistically, with modern storage it isn't too much of a concern.

Second, and more important, periodic garbage collection could increase the time taken to return signatures at unpredictable moments. Dirk is designed to be "front-heavy", where the most expensive operations take place early on in its runtime (most notably key decryption). A single instance of Dirk is expected to take a couple of epochs to warm up, after which it will be able to respond to requests at top speed and without interruption.