Open akrantz01 opened 4 years ago
I think this is most likely a library issue. Right now, after issuing the snapshot creation we try to get the subvolume at that path https://github.com/cezarmathe/btrfsutil-rs/blob/97aea3044889b3c0984a3403dcc84f47f8d87ab5/src/subvolume/subvol.rs#L248 and that may be too fast. I think this could be fixed by using an async transaction id here https://github.com/cezarmathe/btrfsutil-rs/blob/97aea3044889b3c0984a3403dcc84f47f8d87ab5/src/subvolume/subvol.rs#L243 and then waiting on it(or don't wait, but that will be in the async version of the library).
I managed to reproduce the error (though it might be a different one) and figure out the cause.
First, version 0.1.0
was failing with the error "Could not search B-tree". I don't remember what exactly the reason was, but eventually I had a thought to upgrade to the current master (003fa612bbacad1ef4c54abc6d817f0eab4ec8c2
) and that error went away.
Another one appeared instead: "Could not create subvolume", with errno
(which I had to extract manually using nix
, by the way; it might be a good idea to expose it somehow in this library's errors) having been set to EOPNOTSUPP. This one I managed to trace all the way down to the implementation of the ioctl function used to create snapshots.
The function btrfs_util_create_snapshot
from libbtrfsutil
accepts a *mut u64
and is supposed to write into it the BTRFS transaction id of the transaction in which the snapshot was created. However...
https://github.com/kdave/btrfs-devel/blob/92477dd1faa650e50bd3bb35a6c0b8d09198cc35/include/uapi/linux/btrfs.h#L40-L41
... the ioctl implementation in the kernel no longer supports that! The feature has been deprecated since 5.7, as evidenced by the linked comment. And, indeed, when I tried to call the generated binding (from btrfsutil-sys 1.2.1
) directly without specifying the async, it worked.
First, version
0.1.0
was failing with the error "Could not search B-tree". I don't remember what exactly the reason was, but eventually I had a thought to upgrade to the current master (003fa612bbacad1ef4c54abc6d817f0eab4ec8c2
) and that error went away.
AFAIK that error was caused by attempting to always use a function that required superuser permissions https://github.com/cezarmathe/btrfsutil-rs/blob/375010d9aa41db96a44eb9b56792b18d5877cbae/src/subvolume/subvol.rs#L234 which uses https://github.com/cezarmathe/btrfsutil-rs/blob/375010d9aa41db96a44eb9b56792b18d5877cbae/src/subvolume/subvol.rs#L168
I think this was fixed on master
, but there isn't any published version with a fix for this.
The function
btrfs_util_create_snapshot
fromlibbtrfsutil
accepts a*mut u64
and is supposed to write into it the BTRFS transaction id of the transaction in which the snapshot was created. However... https://github.com/kdave/btrfs-devel/blob/92477dd1faa650e50bd3bb35a6c0b8d09198cc35/include/uapi/linux/btrfs.h#L40-L41 ... the ioctl implementation in the kernel no longer supports that! The feature has been deprecated since 5.7, as evidenced by the linked comment. And, indeed, when I tried to call the generated binding (frombtrfsutil-sys 1.2.1
) directly without specifying the async, it worked.
This is (or was?) the libbtrfsutil
version supported by this library: https://github.com/kdave/btrfs-progs/blob/471b4cf7e3a46222531a895f90228ea164b1b857/libbtrfsutil/btrfsutil.h#L28-L30
Here there is no sign that async transaction IDs are not supported: https://github.com/kdave/btrfs-progs/blob/471b4cf7e3a46222531a895f90228ea164b1b857/libbtrfsutil/btrfsutil.h#L415-L431. Are you saying that support for async transactions has been dropped?
By the way, I looked over the implementation and I might have found an issue.
https://github.com/cezarmathe/btrfsutil-rs/blob/003fa612bbacad1ef4c54abc6d817f0eab4ec8c2/src/subvolume/subvol.rs#L294-L306
https://github.com/cezarmathe/btrfsutil-rs/blob/003fa612bbacad1ef4c54abc6d817f0eab4ec8c2/src/subvolume/subvol.rs#L308
The transaction is set to 0
- essentially NULL
(might've been better to use std::ptr::null()
tbh). After that, the library uses btrfs_util_wait_sync
to wait on a NULL
async transaction id.
I haven't tested this, but I feel like there's something wrong with that.
Another one appeared instead: "Could not create subvolume", with
errno
(which I had to extract manually usingnix
, by the way; it might be a good idea to expose it somehow in this library's errors) having been set to EOPNOTSUPP.
At the moment the library returns the error as reported by libbtrfsutil
(enum btrfs_util_error
). How do you suggest integrating the errno
as well in the reported errors?
Thanks for sharing your findings!
This is (or was?) the
libbtrfsutil
version supported by this library: https://github.com/kdave/btrfs-progs/blob/471b4cf7e3a46222531a895f90228ea164b1b857/libbtrfsutil/btrfsutil.h#L28-L30Here there is no sign that async transaction IDs are not supported: https://github.com/kdave/btrfs-progs/blob/471b4cf7e3a46222531a895f90228ea164b1b857/libbtrfsutil/btrfsutil.h#L415-L431. Are you saying that support for async transactions has been dropped?
libbtrfsutil
is a wrapper around the actual btrfs code. To do the actual creation of the snapshot, it invokes the kernel code through ioctl: https://github.com/kdave/btrfs-progs/blob/471b4cf7e3a46222531a895f90228ea164b1b857/libbtrfsutil/subvolume.c#L1174
It's that kernel code that's supposed to create the snapshot and return the transaction id by setting a field in the args
struct, which is defined here: https://github.com/kdave/btrfs-progs/blob/471b4cf7e3a46222531a895f90228ea164b1b857/ioctl.h#L98. That field is also present in the parallel definition in the btrfs-devel repo: https://github.com/kdave/btrfs-devel/blob/92477dd1faa650e50bd3bb35a6c0b8d09198cc35/include/uapi/linux/btrfs.h#L130
However, if you trace the ioctl invocation:
You'll find the place where EOPNOTSUPP is returned. It seems that libbtrfsutil passes an unsupported flag — namely, the one which asks the kernel code to return the transaction. It's set here: https://github.com/kdave/btrfs-progs/blob/471b4cf7e3a46222531a895f90228ea164b1b857/libbtrfsutil/subvolume.c#L1156-L1157. That flag is no longer supported though (its definition is commented out in btrfs-devel), and you won't find any usages of transid
in the ioctl implementation.
The transaction is set to
0
- essentiallyNULL
(might've been better to usestd::ptr::null()
tbh). After that, the library usesbtrfs_util_wait_sync
to wait on aNULL
async transaction id.
That's not quire correct. You declare a variable and set its value to 0. Then you send the address of that variable to libbtrfsutil — and the address is not zero/nullptr. The libbtrfs function detects that, adds the flag for ioctl and, if the ioctl invocation is successful, uses the pointer to write the transaction id into your variable, thus overwriting the zero that you initialized it with: https://github.com/kdave/btrfs-progs/blob/471b4cf7e3a46222531a895f90228ea164b1b857/libbtrfsutil/subvolume.c#L1178-L1179. After that, it'd be correct to use the newly written value of the variable to wait for the transaction. So, to summarize, your usage of the feature is correct; it's just that the feature is no longer supported by the new kernel code, even though libbtrfsutil still exposes it. It would work with an older version of the kernel (presumably up to version 5.7).
At the moment the library returns the error as reported by
libbtrfsutil
(enum btrfs_util_error
). How do you suggest integrating theerrno
as well in the reported errors?
Option 1: add a cause
to LibError
and fill it with nix::errno::Errno
(as returned by e. g. Errno::last()
) (which conveniently implements Error
). So it'd be a "btrfs error such-and-such" caused by "system error such-and-such".
Option 2: make LibError a composite type which has both a btrfs_util_error
value and an Errno
. Would require a nontrivial Debug implementation, but it shouldn't be too complex.
In any case, be sure to handle the case of errno 0 (aka "no error") :)
For some reason, I can't take a snapshot of a volume using
Subvolume::snapshot
. I keep getting a SearchFalied response no matter the path I provide as the destination. Here is the code I am using to take the snapshot:/mnt
is a the root of a btrfs filesystem with a subvolume atimages/alpine
. When I try it using thebtrfs
command line, it works with no issue. Even when I call it usingstd::process::Command
it works properly. Am I doing something wrong, or is it within the library?Edit: I should also add that I am running the program as root, so permissions shouldn't be an issue?