kdave / btrfs-progs

Development of userspace BTRFS tools
GNU General Public License v2.0

Place of error during send/receive is unknown #163

Open mikhailnov opened 5 years ago

mikhailnov commented 5 years ago
root@pay2:/tmp# btrfs send -v sdb1/@libvirt-images_transfer20190203 | btrfs receive -v sdg1/ 

At subvol sdb1/@libvirt-images_transfer20190203
At subvol @libvirt-images_transfer20190203
receiving subvol @libvirt-images_transfer20190203 uuid=fbaadd0d-5087-f148-87db-d8ec1243ae0f, stransid=1972
ERROR: send ioctl failed with -5: Input/output error
ERROR: unexpected EOF in stream

I had hardware problems with the drive /dev/sdb1, mounted at sdb1, and hit this error while transferring data from it to a new drive. After the error occurred:

root@pay2:/tmp# ls sdb1/@libvirt-images_transfer20190203
ALT_sisyphus_MATE.qcow2  CentOS_7_min.qcow2  freebsd11.1_disk1.qcow2  Lubuntu_12.04_amd64.qcow2  Rosa_CentOS_Desktop.qcow2  Rosa_DX_Chrome_2012.qcow2  Rosa_XFCE_24409.qcow2  Win8.1_Russian_x32_MSDN.iso  WinXP_RDP.qcow2

root@pay2:/tmp# ls sdg1/@libvirt-images_transfer20190203
CentOS_7_min.qcow2  freebsd11.1_disk1.qcow2

There seems to be no way to find out where exactly the problem is, or to force btrfs to skip errors. The whole subvolume cannot be copied because of one problem somewhere.

root@pay2:/tmp# btrfs --version
btrfs-progs v4.19.1 
root@pay2:/tmp# uname -a
Linux pay2.loc 4.15.0-45-generic #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

mikhailnov commented 5 years ago

From dmesg:

[Sun Feb  3 05:23:53 2019] BTRFS warning (device sdb1): csum failed root 718 ino 258 off 1448751104 csum 0xc36be6d7 expected csum 0x50c6ae8c mirror 1
[Sun Feb  3 05:23:53 2019] BTRFS warning (device sdb1): csum failed root 718 ino 258 off 1449734144 csum 0xa92318c5 expected csum 0x509b730e mirror 1
[Sun Feb  3 05:23:53 2019] BTRFS warning (device sdb1): csum failed root 718 ino 258 off 1448779776 csum 0xeda6c0ce expected csum 0x6852c196 mirror 1
[Sun Feb  3 05:23:53 2019] BTRFS warning (device sdb1): csum failed root 718 ino 258 off 1449918464 csum 0x703ec23e expected csum 0xa663e089 mirror 1
[Sun Feb  3 05:23:53 2019] BTRFS warning (device sdb1): csum failed root 718 ino 258 off 1450094592 csum 0x1a85dce4 expected csum 0x76e46a65 mirror 1
[Sun Feb  3 05:23:53 2019] BTRFS warning (device sdb1): csum failed root 718 ino 258 off 1450807296 csum 0x48027cb7 expected csum 0x59607604 mirror 1
[Sun Feb  3 05:23:53 2019] BTRFS warning (device sdb1): csum failed root 718 ino 258 off 1450274816 csum 0x93b37050 expected csum 0xf43b928c mirror 1
[Sun Feb  3 05:23:53 2019] BTRFS warning (device sdb1): csum failed root 718 ino 258 off 1450950656 csum 0x1cf87eed expected csum 0x353f355c mirror 1
[Sun Feb  3 05:23:53 2019] BTRFS warning (device sdb1): csum failed root 718 ino 258 off 1450508288 csum 0x1a85dce4 expected csum 0x8829487b mirror 1
[Sun Feb  3 05:23:53 2019] BTRFS warning (device sdb1): csum failed root 718 ino 258 off 1450471424 csum 0x6856e822 expected csum 0xf3b1a960 mirror 1
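
Every warning above names the same subvolume root (718) and inode (258), so the kernel log already narrows the failure down to one file. A minimal sketch for summarising such warnings follows; the function name summarise_csum_failures is hypothetical, and the usage comments assume that the subvolume with ID 718 is the snapshot being sent.

```shell
# Summarise btrfs checksum failures per (root, inode).
# Reads kernel log text on stdin and prints "<root> <ino> <count>" lines.
summarise_csum_failures() {
    awk '/BTRFS warning/ && /csum failed/ {
        # Pick out the values that follow the "root" and "ino" keywords.
        for (i = 1; i < NF; i++) {
            if ($i == "root") root = $(i + 1)
            if ($i == "ino")  ino  = $(i + 1)
        }
        count[root " " ino]++
    }
    END {
        for (key in count) print key, count[key]
    }' | sort
}

# Usage (run as root):
#   dmesg | summarise_csum_failures
# An inode number can then be mapped to a path inside its subvolume.
# Assumption: root 718 is the snapshot below.
#   btrfs inspect-internal inode-resolve 258 sdb1/@libvirt-images_transfer20190203
```

btrfs inspect-internal inode-resolve resolves the inode within the subvolume that contains the given path, so the path has to point into the subvolume named by the root ID in the warning.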

mikhailnov commented 5 years ago

Copying the files with /bin/cp -v instead of transferring the whole subvolume showed that the error was in sdb1/@libvirt-images_transfer20190203/freebsd11.1_disk1.qcow2 (input/output error).
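
The same check can be done without copying anything, by reading every file and reporting the ones that fail. This is only a sketch: the function name find_unreadable_files is hypothetical, and it assumes that a checksum failure surfaces as a read error in userspace, as it did with cp here.

```shell
# Read every regular file under a directory and print the paths that
# cannot be read completely (for example because of an I/O error).
find_unreadable_files() {
    # $1: directory to scan
    find "$1" -type f -exec sh -c '
        for f in "$@"; do
            cat -- "$f" > /dev/null 2>&1 || printf "%s\n" "$f"
        done
    ' sh {} +
}

# Usage:
#   find_unreadable_files sdb1/@libvirt-images_transfer20190203
```

Unlike cp, this continues past the first failing file, so one pass lists every file that cannot be read.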

Seb35 commented 4 years ago

I had a similar question, but I’m not sure it falls under the responsibility of btrfs after all. In my case the snapshots have been failing since I recently expanded the btrfs partition, and SMART tools tell me there is at least one unreadable sector, probably located in the new space. During the btrfs send/receive, I got this error in syslog: print_req_error: I/O error, dev sda, sector 678905408 (among other lines with other details).

I tried to copy with cp -v, but it copied everything without returning any error, so it was not helpful for me. Strictly speaking, the original question here can be solved with btrfs receive -vv /dest, which prints lines like:

utimes boot
utimes etc
utimes media
utimes var
…

I will try to delete the last file mentioned in my case, but I don’t have much hope that it will change anything. I have recent backups, so it’s not critical in my case; my next steps are to run badblocks and/or btrfs check.

Seb35 commented 4 years ago

Contrary to my previous comment, the last file mentioned in the log of btrfs receive -vv /dest was indeed the faulty one. I thought each log line was written after the file itself was written, but it is probably written before, so the log can be used to see which file is about to be written and to take action if necessary. So it fully answers your initial requirement (and mine).
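
As a rough sketch of that approach: capture the output of btrfs receive -vv in a file and look at the last operation logged before the error. The function name last_receive_operation and the log file name are hypothetical, and both stdout and stderr are captured on the assumption that the stream carrying the verbose messages may differ between btrfs-progs versions.

```shell
# Print the last operation logged by `btrfs receive -vv`,
# skipping blank lines and the trailing ERROR lines.
last_receive_operation() {
    # $1: log file captured from `btrfs receive -vv`
    grep -v -e '^ERROR' -e '^[[:space:]]*$' "$1" | tail -n 1
}

# Usage (paths are placeholders):
#   btrfs send /src/snapshot | btrfs receive -vv /dest > receive.log 2>&1
#   last_receive_operation receive.log
```

The path in the printed line is the one to inspect first on the source side.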

I deleted the faulty file (3 GiB out of a total of 20 GiB had been written to the destination disk; it was a DB backup), and the snapshots created without this file could be transferred without issue. Obviously I have to take other actions to prevent the system from trying to use it again. (btrfs scrub categorised it as uncorrectable_error.)

Seb35 commented 4 years ago

To maintainers and/or @mikhailnov: this issue can probably be closed, at least if you agree that btrfs receive -vv /dest is the correct answer.