BEESD fails to notice disk has disappeared.

I have been having problems with bees since migrating my deduped backup disk to another server, with the log filling up with alarming messages like 2022-10-04T00:00:38.426117+01:00 toaster beesd[5034]: crawl_transid[5157]: *** EXCEPTION *** 2022-10-04T00:00:38.426148+01:00 toaster beesd[5034]: crawl_transid[5157]: #011exception type std::system_error: BTRFS_IOC_TREE_SEARCH_V2: /run/bees/Keiths_Test_Archive at fs.cc:798: Input/output error 2022-10-04T00:00:38.426176+01:00 toaster beesd[5034]: crawl_transid[5157]: *** 2022-10-04T00:00:38.426896+01:00 toaster kernel: [48208.875653][ T5157] BTRFS error (device sdc1): bdev /dev/sdc1 errs: wr 4115, rd 162984, flush 0, corrupt 0, gen 0 2022-10-04T00:00:38.426908+01:00 toaster kernel: [48208.875681][ T5157] BTRFS error (device sdc1): bdev /dev/sdc1 errs: wr 4115, rd 162985, flush 0, corrupt 0, gen 0 2022-10-04T00:00:38.426909+01:00 toaster kernel: [48208.875831][ T5157] BTRFS error (device sdc1): bdev /dev/sdc1 errs: wr 4115, rd 162986, flush 0, corrupt 0, gen 0 2022-10-04T00:00:38.426910+01:00 toaster kernel: [48208.876788][ T5157] BTRFS error (device sdc1): bdev /dev/sdc1 errs: wr 4115, rd 162987, flush 0, corrupt 0, gen 0 2022-10-04T00:00:38.426732+01:00 toaster beesd[5034]: crawl_transid[5157]: Next root from 5 2022-10-04T00:00:38.426779+01:00 toaster beesd[5034]: crawl_transid[5157]: next_root(5) 2022-10-04T00:00:38.426808+01:00 toaster beesd[5034]: crawl_transid[5157]: insert_root(BeesCrawlState 5:0 offset 0x0 transid 43678..44691 started 2022-10-04-00-00-38 (0s ago)) 2022-10-04T00:00:38.426836+01:00 toaster beesd[5034]: crawl_transid[5157]: excess_roots.erase(5) 2022-10-04T00:00:38.427084+01:00 toaster beesd[5034]: crawl_transid[5157]: calling insert_new_crawl

Oh no! However of further inspection I spotted that following a system glitch, the /dev/sdc1 device had disappeared and the disk had been re-detected as /dev/sdd1. I'm still attempting to get to the bottom of the glitch. Here's the log extract FWIW

2022-10-01T22:13:02.272274+01:00 localhost kernel: [26479.051901][T27120] scsi 3:0:0:0: rejecting I/O to dead device 2022-10-01T22:13:02.272275+01:00 localhost kernel: [26479.052700][T27120] BTRFS: error (device sdc1) in btrfs_commit_transaction:2341: errno=-5 IO failure (Error while writing out transaction) 2022-10-01T22:13:02.272276+01:00 localhost kernel: [26479.052705][T27120] BTRFS info (device sdc1): forced readonly 2022-10-01T22:13:02.272277+01:00 localhost kernel: [26479.052707][T27120] btrfs_printk: 268 callbacks suppressed 2022-10-01T22:13:02.272277+01:00 localhost kernel: [26479.052708][T27120] BTRFS warning (device sdc1): Skipping commit of aborted transaction. 2022-10-01T22:13:02.272278+01:00 localhost kernel: [26479.052710][T27120] BTRFS: error (device sdc1) in cleanup_transaction:1945: errno=-5 IO failure 2022-10-01T22:13:02.900832+01:00 localhost kernel: [26479.678053][T28236] scsi 4:0:0:0: Direct-Access ST2000LM 007-1R8174 0 PQ: 0 ANSI: 6 2022-10-01T22:13:02.900854+01:00 localhost kernel: [26479.678962][T28236] sd 4:0:0:0: Attached scsi generic sg2 type 0 2022-10-01T22:13:02.900855+01:00 localhost kernel: [26479.679425][T28280] sd 4:0:0:0: [sdd] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB) 2022-10-01T22:13:02.900858+01:00 localhost kernel: [26479.679428][T28280] sd 4:0:0:0: [sdd] 4096-byte physical blocks

However beesd failed to notice that the disk it had been deduping had disappeared and carried on merrily failing for several hours until I stopped it manually. I humbly suggest that this is a bug, and that beesd really ought to notice when the subject of its attentions has vanished!

The disk passes the SMART tests, and "btrfs check" reveals no errors.

`toaster:~ # beesd -v Usage: beesd [options]

bees version 0.7

cat /etc/os-release NAME="openSUSE Leap" VERSION="15.4" toaster:~ # uname -a

Linux toaster 5.14.21-150400.24.21-default #1 SMP PREEMPT_DYNAMIC Wed Sep 7 06:51:18 UTC 2022 (974d0aa) x86_64 x86_64 x86_64 GNU/Linux

=== START OF INFORMATION SECTION === Model Family: Seagate Mobile HDD Device Model: ST2000LM007-1R8174 Serial Number: WDZA2HEL LU WWN Device Id: 5 000c50 0ac1c9d50 Firmware Version: SBK2 User Capacity: 2,000,398,934,016 bytes [2.00 TB] Sector Sizes: 512 bytes logical, 4096 bytes physical Rotation Rate: 5400 rpm Form Factor: 2.5 inches Device is: In smartctl database [for details use: -P show] ATA Version is: ACS-3 T13/2161-D revision 3b SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s) Local Time is: Tue Oct 4 11:05:15 2022 BST SMART support is: Available - device has SMART capability. SMART support is: Enabled `

Zygo / bees

BEESD fails to notice disk has disappeared. #231