Closed aidan-gibson closed 3 weeks ago
Replace normally leaves behind an item so it can report things like this:
Started on 8.Dec 02:59:46, finished on 10.Dec 11:38:44, 0 write errs, 0 uncorr. read errs
The check currently only tests whether the item exists:
key.objectid = 0;
key.type = BTRFS_DEV_REPLACE_KEY;
key.offset = 0;
ret = btrfs_search_slot(NULL, dev_root, &key, &path, 0, 0);
btrfs_release_path(&path);
if (ret < 0) {
	errno = -ret;
	error("failed to check the dev-replace status: %m");
	return ret;
}
if (ret == 0) {
	error("running dev-replace detected, please finish or cancel it.");
	return -EINVAL;
}
but that's not sufficient. It should also check the contents of the item to see if replace is finished, paused, still running, etc.
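A minimal sketch of the kind of state check described above, not the actual fix. It assumes the replace state has already been read from the item's replace_state field into a host-endian integer. The enum and function names are hypothetical; the numeric values are meant to mirror the BTRFS_IOCTL_DEV_REPLACE_STATE_* constants in the kernel UAPI header.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical names. The values are assumed to match the
 * BTRFS_IOCTL_DEV_REPLACE_STATE_* constants from the kernel UAPI header.
 */
enum replace_state {
	REPLACE_NEVER_STARTED = 0,
	REPLACE_STARTED = 1,
	REPLACE_FINISHED = 2,
	REPLACE_CANCELED = 3,
	REPLACE_SUSPENDED = 4,
};

/*
 * Return true if the stored state means a replace is still running or
 * paused, i.e. the csum change should refuse to start.
 */
static bool replace_in_progress(uint64_t state)
{
	switch (state) {
	case REPLACE_STARTED:
	case REPLACE_SUSPENDED:
		return true;
	case REPLACE_NEVER_STARTED:
	case REPLACE_FINISHED:
	case REPLACE_CANCELED:
		return false;
	default:
		/* Unknown value: be conservative and refuse. */
		return true;
	}
}
```

With a check like this, a leftover item from a finished or canceled replace would no longer block the checksum change; only a started or suspended replace would.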
[edited to fix typo]
Thank you so much Zygo! I commented out lines 82-85 in btrfs-progs/tune/change-csum.c
(https://github.com/kdave/btrfs-progs/blob/5d97c32d6f94cf6f473a5f82964e3edaeb1b146e/tune/change-csum.c#L83), rebuilt, and the checksum change is now running.
Thanks for the report. Fixed in devel.
There is no dev-replace running, although I have replaced devices on this fs before.
Note: This fs is actually two drives using RAID 0
I also tried this a while back, about a year ago I believe, and got the same error. Maybe I botched the dev-replaces in the past? How can I troubleshoot further?
I have already run
sudo btrfs check --check-data-csum --progress /dev/sdd
without errors (took several days to run). I am using the latest btrfs-progs, built from git. I have also successfully changed to xxhash64 on my root drives, also RAID1, on the same machine. It really seems like there is something messed up in the metadata (or something) for Data11.
Very willing to do the legwork on this, just tell me what to do. Thanks!