dotmesh-io / dotmesh

dotmesh (dm) is like git for your data volumes (databases, files etc) in Docker and Kubernetes
https://dotmesh.com
Apache License 2.0
539 stars 29 forks source link

#766 zfs diff caching fix #778

Closed itamarst closed 4 years ago

itamarst commented 4 years ago

Fixes #766.

Previously the logic for caching diffs was broken. If not changes were made to a filesystem, the cached snapshot was erroneously deleted and recreated each time, breaking caching. This PR tries to fix it, and along the way simplifies the logic.

  1. Updated GetDirtyDelta to be much simpler. We calculate dirty bytes for a filesystem with a simpler method, written@<snapshotname>, which returns how many bytes were written to the filesystem since the snapshot.
  2. I suspect the logic in Diff for comparing whether tmp snapshot has changed was buggy: the dirty delta was being calculated on latest real snapshot, instead of tmp diff snapshot. So now we calculate dirty bytes vs. the tmp snapshot.
  3. Added a bunch of tests.

As it turns out, plausibly only the second item was necessary to fix the bug, but... the first item makes everything much simpler.

For item 1, based on my interpretation of what GetDirtyDelta does, I've decided to use written@<snapshot> for dirty, and referenced for total. Here's a comparison of old and new code:

new dirty 0 == old dirty 0? new total 24576 == old total 24576?
new dirty 13824 == old dirty 13824? new total 26112 == old total 38400?
new dirty 14336 == old dirty 15360? new total 24576 == old total 39424?
new dirty 12800 == old dirty 12800? new total 25088 == old total 37888?

So similar but not identical.

For reviewers: please ensure my interpretation of what GetDirtyDelta does is correct—it previously wasn't documented so I had to guess.

lukemarsden commented 4 years ago

Looks great, thanks! Let's get this rolled out to prod.