Zygo / bees

Best-Effort Extent-Same, a btrfs dedupe agent
GNU General Public License v3.0
654 stars 55 forks source link

Endless loop when first started with empty filesystem (0.6.1) #93

Open charles-dyfis-net opened 5 years ago

charles-dyfis-net commented 5 years ago

This was discovered when writing a NixOS system test for the bees module.

The testing logic in https://github.com/NixOS/nixpkgs/blob/923a3e4970226293e4698e44e3e5d5ccf7487603/nixos/tests/bees.nix consistently succeeds every time: This code first creates files on a new filesystem, then starts the bees service. That (passing) test is roughly equivalent to the following shell script:

any_shared_space() {
  [[ $(btrfs fi du -s --raw "$@" | awk 'NR>1 { print $3 }' | grep -E '^0$' | wc -l) -eq 0 ]]
}
die() { echo "$*" >&2; exit 1; }

mkfs.btrfs -f -L aux /dev/vdb || die
mount /dev/vdb /home || die
dd if=/dev/urandom of=/home/dedup-me-1 bs=1M count=8 || die
cp --reflink=never /home/dedup-me-1 /home/dedup-me-2 || die
any_shared_space /home/dedup-me-* && die "ERROR: Detecting shared space before any deduplication has been done"
sync
systemctl start beesd@aux.service
sleep 10
any_shared_space /home/dedup-me-* || die "ERROR: No shared space detected even after bees is running"

By contrast, a test akin to the following -- which starts the service after the filesystem is created and initially mounted, but before any content has been created -- consistently fails, with bees running in a loop which is trying to poll the status of a file descriptor referring to a file that doesn't exist:

any_shared_space() {
  [[ $(btrfs fi du -s --raw "$@" | awk 'NR>1 { print $3 }' | grep -E '^0$' | wc -l) -eq 0 ]]
}
die() { echo "$*" >&2; exit 1; }

mkfs.btrfs -f -L aux /dev/vdb || die
mount /dev/vdb /home || die
systemctl start beesd@aux.service
sleep 1
dd if=/dev/urandom of=/home/dedup-me-1 bs=1M count=8 || die
cp --reflink=never /home/dedup-me-1 /home/dedup-me-2 || die
sync
sleep 10
any_shared_space /home/dedup-me-* || die "ERROR: No shared space detected even after bees is running"

The actual failing test can be found at https://github.com/charles-dyfis-net/nixpkgs/blob/bees-test-failing/nixos/tests/bees.nix; if checking out the relevant branch of nixpkgs, it can be run (from the root of that working tree) with nix-build -I nixpkgs="$PWD" ./nixos/tests/bees.nix.

strace of the loop taking place when in that failed state can be seen at https://gist.github.com/charles-dyfis-net/34ac2e4d2bada0c3a3c8632cab98c8d9

charles-dyfis-net commented 5 years ago

The title is a bit misleading, as the filesystem is not quite empty even in the failing case -- we've still had a .beeshome subvol created by the wrapper, after all.

Zygo commented 5 years ago

Do you have the log output from the failing case? Among other things it will have the context leading up to the exception.

charles-dyfis-net commented 5 years ago

Edited an additional file log.txt into the previously-linked gist.

Zygo commented 5 years ago

So it's failing to find the extent tree in the root tree. Fascinating.

Maybe there's some race condition during the first btrfs mount where transid 1 or the extent tree doesn't exist yet?

In any case, some other things are going wrong after that: somehow we get a transid_max of 18446744073709551615 (maybe need something like 90132182fdb2c06a2811ce9d45c54046adaf536d for transid_max()). After that, nothing works because we're assuming the largest possible transaction rate and won't accept any smaller transid than the maximum possible transid.

It should be possible to work around this by forcing a sync after creating .beeshome.

charles-dyfis-net commented 5 years ago

Confirmed: Adding that sync (as in charles-dyfis-net/nixpkgs@5ed17dfb01e82db9f78c2f78a694c321ba07c697) does indeed work around the immediate issue.