holiman / billy

Very simple datastore
BSD 3-Clause "New" or "Revised" License

db, shelf: allow repairing the db by dropping corrupted data #23

Closed karalabe closed 4 months ago

karalabe commented 4 months ago

Assuming that the shelf magic is correct (we fsynced the header), the contents of the shelf might be corrupted in a variety of ways:

The 3rd case is not Billy's problem. Billy is a "dumb" data store, so if something got corrupted whilst maintaining the database schema, the outer process needs to deal with it. In Geth's case, this is handled: if a blob cannot be decoded, it will be dropped.
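The "drop what cannot be decoded" policy for the outer process can be sketched as below. This is only an illustration: `blobStore`, `record` and `dropUndecodable` are hypothetical names, the store is a plain map rather than Billy's API, and JSON stands in for whatever encoding the outer process actually uses.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// blobStore is a hypothetical stand-in for a "dumb" id -> blob store.
// It is NOT Billy's API, just enough structure to show the policy.
type blobStore map[uint64][]byte

// record is a hypothetical schema owned by the outer process.
type record struct {
	Nonce uint64 `json:"nonce"`
}

// dropUndecodable tries to decode every blob and deletes the ones that fail.
// It returns the ids that were dropped, in ascending order.
func dropUndecodable(s blobStore) []uint64 {
	var dropped []uint64
	for id, blob := range s {
		var r record
		if err := json.Unmarshal(blob, &r); err != nil {
			dropped = append(dropped, id)
		}
	}
	for _, id := range dropped {
		delete(s, id)
	}
	sort.Slice(dropped, func(i, j int) bool { return dropped[i] < dropped[j] })
	return dropped
}

func main() {
	s := blobStore{
		1: []byte(`{"nonce":7}`),
		2: []byte(`{"nonce":`), // truncated blob, cannot be decoded
		3: []byte(`{"nonce":9}`),
	}
	fmt.Println("dropped:", dropUndecodable(s), "remaining:", len(s))
}
```

The store itself never inspects the payload here; only the caller knows the schema, which is why this kind of corruption is left to the caller.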

The first 2 cases, however, need to be handled by Billy and currently are not. In both cases, Billy will fail on startup when iterating the content. Whilst we could argue that failing and letting the outer user resolve it is not a bad thing, there's also no real way for Geth to resolve these automatically: we don't want to know the internal structure of the db, and going in hot and deleting an entire folder seems like the nuclear option.

This PR instead adds a repair mode into Billy. If the database is opened in RW mode and repairing is requested, then the above two scenarios will be fixed:

In both these scenarios, the repair is destructive. That said, for Geth's use case this is fine, and more generally, if the database data format is borked, there's not much more we can do really. Partial data loss seems preferable to total data loss.

codecov[bot] commented 4 months ago

Codecov Report

Attention: 15 lines in your changes are missing coverage. Please review.

Comparison is base (33af227) 86.78% compared to head (806ab98) 84.01%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##             main      #23      +/-   ##
==========================================
- Coverage   86.78%   84.01%   -2.77%
==========================================
  Files           5        5
  Lines         401      413      +12
==========================================
- Hits          348      347       -1
- Misses         37       48      +11
- Partials       16       18       +2
```