Open chillenb opened 4 months ago
Superblock updates are written last in a transaction commit, so normally any roots that might be found with a newer or older generation are roots of incompletely written or partially overwritten (and therefore unusable) trees. The backup root might be usable in addition to the current root, but for any other root, another very non-trivial step is required to assemble incomplete trees and try to find a chunk tree that is consistent with the rest of the metadata.
There is one exception: if multiple commits are queued to a device with non-working write barriers, then the commits and tree roots of several generations may be lost, requiring a backward search starting at the highest detected generation number and stopping when (or if) an intact tree is found. On the other hand, in this scenario there will be so much damage to metadata (affecting the roots of trees from multiple transactions) that any chance of finding a usable tree is low.
What you described actually happened to me recently on an SSD in a laptop. Recovery was not pleasant (thus #749), but using DMDE I was able to find a chunk tree that `btrfs rescue chunk-recover` deemed OK. Then, by manually pointing the superblock to root tree roots starting at the highest generation, I got `btrfs restore` to read most of the important data off.
The backward search method you describe is basically what I did by hand. I was very surprised that it worked so well, and that's why I think it should be automated. Even if it doesn't work most of the time, it'd be worth it if it helped a few people!
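To make the idea concrete, here is a rough Python sketch of the backward search: sort the candidate roots by generation, newest first, and stop at the first one that checks out. This is only a toy model, not btrfs-progs code. `CandidateRoot`, `find_usable_root`, and the `is_intact` callback are hypothetical names made up for illustration; in practice the candidates would come from scanning the device, and the intactness check would be a full tree walk.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class CandidateRoot:
    """Hypothetical record for a tree root block found by scanning the device."""
    bytenr: int       # logical address where the root block was found
    generation: int   # generation number stored in the block header


def find_usable_root(
    candidates: Iterable[CandidateRoot],
    is_intact: Callable[[CandidateRoot], bool],
) -> Optional[CandidateRoot]:
    """Try candidates from the highest generation downward.

    Returns the first candidate for which is_intact() is true,
    or None if no candidate passes the check.
    """
    for cand in sorted(candidates, key=lambda c: c.generation, reverse=True):
        if is_intact(cand):
            return cand
    return None


# Toy usage: the newest root (generation 103) is damaged,
# so the search falls back to generation 102.
candidates = [
    CandidateRoot(bytenr=0x1000, generation=101),
    CandidateRoot(bytenr=0x2000, generation=103),
    CandidateRoot(bytenr=0x3000, generation=102),
]
damaged = {0x2000}  # assumption: stand-in for a real tree consistency check
best = find_usable_root(candidates, lambda c: c.bytenr not in damaged)
print(best)
```

The search itself is trivial; all the cost and difficulty is in the intactness check, which is the part I did by hand with `btrfs restore`.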
The current hard-coded behavior is to ignore parts of the chunk tree that are newer than the generation provided in the superblock. This is not always right! There should be an option to handle cases where superblock updates were not written to disk during a power failure.
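As a sketch of the kind of option I mean, the filter could be made switchable. The Python below is a simplified model under my own assumptions, not the actual btrfs-progs logic: `ChunkItem`, `select_chunk_items`, and the `allow_newer` flag are all hypothetical names.

```python
from typing import Iterable, List, NamedTuple


class ChunkItem(NamedTuple):
    """Hypothetical stand-in for a chunk tree item found during a scan."""
    logical: int      # logical start address of the chunk
    generation: int   # generation of the block the item was found in


def select_chunk_items(
    items: Iterable[ChunkItem],
    sb_generation: int,
    allow_newer: bool = False,
) -> List[ChunkItem]:
    """Pick the chunk items to use when rebuilding the chunk tree.

    By default, items newer than the superblock generation are dropped.
    With allow_newer=True they are kept, for the case where the
    superblock update never reached the disk.
    """
    if allow_newer:
        return list(items)
    return [it for it in items if it.generation <= sb_generation]


# Toy usage: the superblock says generation 100, but a chunk item
# from generation 101 was written before power was lost.
items = [ChunkItem(logical=0x100000, generation=99),
         ChunkItem(logical=0x200000, generation=101)]
default_items = select_chunk_items(items, sb_generation=100)
relaxed_items = select_chunk_items(items, sb_generation=100, allow_newer=True)
print(len(default_items), len(relaxed_items))
```

The default stays as it is today, so nobody gets the riskier behavior without asking for it.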