threefoldtech / 0-stor_v2

Apache License 2.0
Improve rebuild logic #5

Open LeeSmet opened 3 years ago

LeeSmet commented 3 years ago

The current rebuild logic is fairly simple: retrieve the data, re-encode it, and send the shards to the new backends. However, we can check whether any of the new backends also appears in the old metadata. If one does, we can assign it the same shard it already holds, which eliminates the write to that backend and saves some space.

Need to check whether the encoding is deterministic for this, especially for parity shards.
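
A minimal sketch of the idea above, not the project's actual code. `ShardAction` and `plan_rebuild` are hypothetical names, and the old metadata is reduced to a map from backend address to shard index. It assumes the backend list contains no duplicates, that the shard count is unchanged by the rebuild, and that the encoding is deterministic (the open question above), so a shard kept in place is byte-identical to the one a re-encode would produce.

```rust
use std::collections::{HashMap, HashSet};

/// What the rebuild has to do for one backend (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum ShardAction {
    /// The backend already holds this shard index, so the write is skipped.
    Reuse,
    /// The shard has to be written to this backend.
    Write,
}

/// Hypothetical planner. `old` maps backend address -> shard index taken
/// from the old metadata; `new_backends` are the backends chosen for the
/// rebuild, one shard each. Assumes `new_backends` has no duplicates.
pub fn plan_rebuild(
    old: &HashMap<String, usize>,
    new_backends: &[String],
) -> Vec<(String, usize, ShardAction)> {
    let shard_count = new_backends.len();

    // Pass 1: backends present in both sets keep the index they already hold.
    let mut reused: HashMap<&str, usize> = HashMap::new();
    let mut taken: HashSet<usize> = HashSet::new();
    for backend in new_backends {
        if let Some(&idx) = old.get(backend.as_str()) {
            if idx < shard_count && taken.insert(idx) {
                reused.insert(backend.as_str(), idx);
            }
        }
    }

    // Pass 2: hand the remaining indices to the remaining backends.
    let mut free = (0..shard_count).filter(|i| !taken.contains(i));
    new_backends
        .iter()
        .map(|backend| match reused.get(backend.as_str()) {
            Some(&idx) => (backend.clone(), idx, ShardAction::Reuse),
            None => (
                backend.clone(),
                free.next().expect("one free index per remaining backend"),
                ShardAction::Write,
            ),
        })
        .collect()
}
```

With old metadata `{a: 0, b: 1, c: 2}` and new backends `[a, d, c]`, only `d` gets a write (shard 1); `a` and `c` keep shards 0 and 2.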

scottyeager commented 4 days ago

IMO this is fairly essential. Under the current scheme, the backend data usage is multiplied by the number of rebuild operations that have been carried out, plus one for the initial write. So in the case of an initial backend configuration with some data stored, replacing a single backend and rebuilding means doubling the data usage in all of the backends that didn't get replaced, since a duplicate of all data is written to them again.
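
To make the multiplication concrete, here is a toy model of the behaviour described above. It is a sketch under stated assumptions, not a measurement: `shard_copies` is a hypothetical helper, every write stores exactly one shard per backend, and old shards are never deleted.

```rust
use std::collections::HashMap;

/// Hypothetical model: count how many shard copies of one object each
/// backend holds after the initial write and a series of rebuilds, assuming
/// every rebuild writes to all backends in its set and nothing is deleted.
pub fn shard_copies(initial: &[&str], rebuilds: &[Vec<&str>]) -> HashMap<String, u64> {
    let mut copies: HashMap<String, u64> = HashMap::new();

    // Initial write: one shard per backend.
    for backend in initial {
        *copies.entry(backend.to_string()).or_insert(0) += 1;
    }

    // Each rebuild writes one more shard to every backend in its set.
    for rebuild in rebuilds {
        for backend in rebuild {
            *copies.entry(backend.to_string()).or_insert(0) += 1;
        }
    }

    copies
}
```

Starting from backends `[a, b, c]` and rebuilding onto `[a, b, d]`, the model gives two copies on `a` and `b` and one on `d`, which is the doubling on the backends that were not replaced.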