[CuBox-I] xbian-copy doesn't work

CurlyMoo commented 10 years ago

error1 error2 error3

CurlyMoo commented 10 years ago

Can you maybe post the steps to manually run the clone? Then I can also help better with debugging.

mk01 commented 10 years ago

@CurlyMoo

Is this FAT16? There simply are filesystems without the needed ioctl support (for instance ftruncate in that case).

btrfs-auto-snapshot --help

CurlyMoo commented 10 years ago

These were NFS and NTFS targets.

mk01 commented 10 years ago

How do you mark multiple lines in the source to comment on them? I would check the block handling img file creation (create, preallocate space, create partitions, format, mount for copying).

I can mark a block by clicking a line, then holding shift and clicking another line. But how do I then trigger any action on it (comment, send as link, or whatever you sometimes do)?

CurlyMoo commented 10 years ago

If you select blocks of lines, the line numbers are turned into HTML anchors. So just select those lines and copy the full URL.

mk01 commented 10 years ago

@CurlyMoo

https://bugs.launchpad.net/nova/+bug/1024586

That must have been broken by some recent (or older) update to kpartx, losetup or dmsetup, so using kpartx directly on a file doesn't work anymore. I just tested running losetup manually first and then kpartx on the loop device, and that works.

(Ubuntu Saucy also has kpartx version 0.4.9 and works OK with 3.14 and 3.15 kernels, the same ones I run on RPI.)
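
The workaround described above (attach the image with losetup first, then run kpartx on the loop device) could be sketched roughly like this. The function name `map_image` and the `DRY_RUN` switch are made up for illustration and are not part of xbian-copy; the only real commands used are `losetup -f --show` and `kpartx -av`.

```shell
# Sketch only: map the partitions of an image file via an explicit loop device.
# map_image and DRY_RUN are hypothetical helpers, not taken from xbian-copy.
map_image() {
    img=$1
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Print what would be run; /dev/loopN is a placeholder name.
        echo "losetup -f --show $img"
        echo "kpartx -av /dev/loopN"
        return 0
    fi
    # Attach the file to the first free loop device and remember its name.
    loopdev=$(losetup -f --show "$img") || return 1
    # Create the /dev/mapper entries for each partition on that device.
    kpartx -av "$loopdev"
}
```

Running it for real needs root; `DRY_RUN=1 map_image some.img` only prints the two commands.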

mk01 commented 10 years ago

@CurlyMoo

xbian-package-config-shell 2.1.8-2 tested on NFS

CurlyMoo commented 10 years ago

Nope: error4 error5

CurlyMoo commented 10 years ago

Additionally

[  750.099037] device label xbian devid 1 transid 1339 /dev/mmcblk0p2
[  750.351832] device label xbian devid 1 transid 1339 /dev/mmcblk0p2
[  750.457710] device label xbian devid 1 transid 1339 /dev/mmcblk0p2
[  750.618551] device label xbian devid 1 transid 1339 /dev/mmcblk0p2
[  750.815358] device label xbian devid 1 transid 1339 /dev/mmcblk0p2
[  755.372570] bio: create slab <bio-2> at 2
[  755.509187] EXT4-fs (loop2): VFS: Can't find ext4 filesystem
[  755.509412] EXT4-fs (loop2): VFS: Can't find ext4 filesystem
[  755.509583] EXT4-fs (loop2): VFS: Can't find ext4 filesystem
[  755.509827] FAT-fs (loop2): bogus number of reserved sectors
[  755.509835] FAT-fs (loop2): Can't find a valid FAT filesystem
[  755.513348] EXT4-fs (loop3): VFS: Can't find ext4 filesystem
[  755.513542] EXT4-fs (loop3): VFS: Can't find ext4 filesystem
[  755.513713] EXT4-fs (loop3): VFS: Can't find ext4 filesystem
[  755.513933] FAT-fs (loop3): bogus number of reserved sectors
[  755.513941] FAT-fs (loop3): Can't find a valid FAT filesystem
[  755.515394] ISOFS: Unable to identify CD-ROM format.
[  755.516026] UDF-fs: warning (device loop3): udf_load_vrs: No anchor found
[  755.516035] UDF-fs: Rescanning with blocksize 2048
[  755.516135] UDF-fs: warning (device loop3): udf_load_vrs: No anchor found
[  755.516143] UDF-fs: warning (device loop3): udf_fill_super: No partition found (1)

CurlyMoo commented 10 years ago

Now an issue appeared when creating snapshots:


Usage:
 losetup loop_device                             give info
 losetup -a | --all                              list all used
 losetup -d | --detach <loopdev> [<loopdev> ...] delete
 losetup -f | --find                             find unused
 losetup -c | --set-capacity <loopdev>           resize
 losetup -j | --associated <file> [-o <num>]     list all associated with <file>
 losetup [options] {-f|--find|loopdev} <file>    setup

Options:
 -e, --encryption <type> enable data encryption with specified <name/num>
 -h, --help              this help
 -o, --offset <num>      start at offset <num> into file
     --sizelimit <num>   loop limited to only <num> bytes of the file
 -p, --pass-fd <num>     read passphrase from file descriptor <num>
 -r, --read-only         setup read-only loop device
     --show              print device name (with -f <file>)
 -N | --nohashpass       Do not hash the given password (Debian hashes)
 -k | --keybits <num>    specify number of bits in the hashed key given
                         to the cipher.  Some ciphers support several key
                         sizes and might be more efficient with a smaller
                         key size.  Key sizes < 128 are generally not
                         recommended
 -v, --verbose           verbose mode

Unpacking xbian-package-xbianhome (1.0.1-1) over (1.0.1) ...
Setting up xbian-package-xbianhome (1.0.1-1) ...

Usage:
 losetup loop_device                             give info
 losetup -a | --all                              list all used
 losetup -d | --detach <loopdev> [<loopdev> ...] delete
 losetup -f | --find                             find unused
 losetup -c | --set-capacity <loopdev>           resize
 losetup -j | --associated <file> [-o <num>]     list all associated with <file>
 losetup [options] {-f|--find|loopdev} <file>    setup

Options:
 -e, --encryption <type> enable data encryption with specified <name/num>
 -h, --help              this help
 -o, --offset <num>      start at offset <num> into file
     --sizelimit <num>   loop limited to only <num> bytes of the file
 -p, --pass-fd <num>     read passphrase from file descriptor <num>
 -r, --read-only         setup read-only loop device
     --show              print device name (with -f <file>)
 -N | --nohashpass       Do not hash the given password (Debian hashes)
 -k | --keybits <num>    specify number of bits in the hashed key given
                         to the cipher.  Some ciphers support several key
                         sizes and might be more efficient with a smaller
                         key size.  Key sizes < 128 are generally not
                         recommended
 -v, --verbose           verbose mode

Processing triggers for xbian-package-initramfs-tools (1.3.4-13) ...

mk01 commented 10 years ago

That's only an empty call to losetup. Will correct.

No idea about the above, as I can't reproduce it. I tried NFS2 and NFS3, and NFS4 was the initial test. /dev/mapper/loop1p1 in the screen means the file was resized and the partitions were created, recognised and mapped.

The "write protected" error can be anything: a stale NFS lock, the file held open by a previous process, open on the server, non-existent NFS locking, or mounted with no-lock. Give more information.

CurlyMoo commented 10 years ago

If you post the manual steps then I can help with debugging.

mk01 commented 10 years ago

Start at line 589. $2 = filename. The line is:

chattr +CA $2 &>/dev/null;  truncate -s $size $2 || { echo "can't resize img file to needed size"; ........

Starting there assumes you have created the file (touch $2) and calculated the size. Let's use size="2G" (no conversions needed, truncate can use kmgKMG).

I have no anchors :(
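
For anyone following along, the manual starting point described above (touch the file, pick a size, then the chattr/truncate line) might look roughly like the sketch below. `make_img` is a hypothetical wrapper name; the script itself works on `$2` and `$size` directly, and the elided remainder of the original line is not reproduced here.

```shell
# Sketch only: create an empty image file of a given size, as in the manual steps.
# make_img is a hypothetical name, not a function from xbian-copy.
make_img() {
    img=$1
    size=$2                      # truncate accepts suffixes such as K, M, G
    touch "$img" || return 1
    # Best effort: +C disables copy-on-write on btrfs, +A skips atime updates.
    # Filesystems without these attributes reject the call, so errors are ignored.
    chattr +CA "$img" >/dev/null 2>&1 || true
    truncate -s "$size" "$img" || {
        echo "can't resize img file to needed size" >&2
        return 1
    }
}
```

For example, `make_img /mnt/nfs/test.img 2G` (the path is just a placeholder) should leave a sparse 2G file if the target filesystem supports truncate.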

CurlyMoo commented 10 years ago

https://github.com/xbianonpi/xbian-package-config-shell/blob/master/content/usr/local/include/xbian-config/modules/xbiancopy/dialogs#L53-L65

Shouldn't this be transifexed?

mk01 commented 10 years ago

@CurlyMoo sure for Transifex, but probably someone had issues with the text / length / expandable characters. Don't remember.

CurlyMoo commented 10 years ago

Shouldn't this line read: echo "2048,69632,83,*," | sfdisk -u S -N1 -H 4 -S 16 -q $2 > /dev/null 2>&1 because the first imx partition will be ext2?

The error occurs here:

root@cubox:~# nice -n 10 btrfs send "/tmp/btrfs-source/$v/$s.ro" | pv -n -s $siz |eval $opt_remotecmd nice -n 10 btrfs receive "/tmp/btrfs-dest/$v"
At subvol /tmp/btrfs-source/root/@.ro
At subvol @.ro
ERROR: send ioctl failed with -5: Input/output error
0
ERROR: unexpected EOF in stream.

[  585.031254] btrfs: ERROR did not find backref in send_root. inode=622, offset=0, disk_byte=152010752 found extent=152010752

To confirm that it's not my NFS share that's bugging:

root@cubox:~# nice -n 10 btrfs send "/tmp/btrfs-source/$v/$s.ro" | pv -n -s $siz | dd of=/dev/null
At subvol /tmp/btrfs-source/root/@.ro
0
0
0
0
0
ERROR: send ioctl failed with -5: Input/output error
0
15+865 records in
224+1 records out
114944 bytes (115 kB) copied, 5.44505 s, 21.1 kB/s

[  522.794567] btrfs: ERROR did not find backref in send_root. inode=622, offset=0, disk_byte=152010752 found extent=152010752

This was also posted earlier by @IriDium: http://forum.xbian.org/thread-2100-post-22894.html#pid22894

And it keeps occurring:

root@cubox:/tmp/btrfs-source/root/@/bin# btrfs inspect-internal inode-resolve -v 622 /root
ioctl ret=0, bytes_left=4067, bytes_missing=0, cnt=1, missed=0
/root/bin/chvt
root@cubox:/tmp/btrfs-source/root/@/bin# cp chvt chvt.bak
root@cubox:/tmp/btrfs-source/root/@/bin# btrfs inspect-internal inode-resolve -v 622 /root
ioctl ret=0, bytes_left=4067, bytes_missing=0, cnt=1, missed=0
/root/bin/chvt
root@cubox:/tmp/btrfs-source/root/@/bin# rm chvt
root@cubox:/tmp/btrfs-source/root/@/bin# btrfs inspect-internal inode-resolve -v 622 /root
ioctl ret=-1, error: No such file or directory
root@cubox:/tmp/btrfs-source/root/@/bin# mv chvt.bak chvt
root@cubox:/tmp/btrfs-source/root/@/bin# btrfs inspect-internal inode-resolve -v 622 /root
ioctl ret=-1, error: No such file or directory

[  585.031254] btrfs: ERROR did not find backref in send_root. inode=622, offset=0, disk_byte=152010752 found extent=152010752
[ 1102.744330] btrfs: ERROR did not find backref in send_root. inode=623, offset=0, disk_byte=152481792 found extent=152481792
[ 1217.688658] btrfs: ERROR did not find backref in send_root. inode=624, offset=0, disk_byte=152014848 found extent=152014848

root@cubox:/# btrfs inspect-internal inode-resolve -v 624 /root
ioctl ret=0, bytes_left=4063, bytes_missing=0, cnt=1, missed=0
/root/bin/dumpkeys
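
To avoid resolving each inode by hand, the inode numbers can be pulled out of the kernel messages first. This is only a sketch: `backref_inodes` is a hypothetical helper name, and it assumes the message format shown in the dmesg lines above.

```shell
# Sketch only: extract inode numbers from "did not find backref" kernel messages.
# backref_inodes is a hypothetical helper; it reads log lines on stdin.
backref_inodes() {
    sed -n 's/.*did not find backref in send_root\. inode=\([0-9][0-9]*\),.*/\1/p' | sort -un
}

# Possible use (needs root and the btrfs tools):
#   dmesg | backref_inodes | while read -r ino; do
#       btrfs inspect-internal inode-resolve "$ino" /root
#   done
```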

Could it be that the image was corrupted somehow?

The weird thing is that scrubbing doesn't show errors:

scrub status for dc1ce5f2-ac21-4b0a-ba6a-1db8ceaa4f71
        scrub started at Fri Jul  4 13:47:09 2014 and finished after 97 seconds
        total bytes scrubbed: 1.72GiB with 0 errors

mk01 commented 10 years ago

@CurlyMoo

Thanks for looking into that. Everything is possible, even, let's say, bit errors in the image, given that you BOTH have a problem I can't replicate on a 'non-that-image' installation. It could also be a problem with SD (not the card, but kernel/hw/anything). My installation has been running from raid0 on USB HDDs almost since I got the CuBox.

I will clone to SD and leave it for a few days. If this corruption is happening (and it wasn't the image), then btrfs is the only FS able to report it, so testing ext4 would not help much there.

By all means reboot into the rescue shell and run

btrfs check /dev/XXXX

If errors are found but it looks promising, run it again with "--repair". Recently I tested it on one of "our" RPI corrupted filesystems, created maybe with the prehistoric btrfs in a 3.9.x kernel, and it booted again.

About scrub: scrubbing is just checksum recalculation and retest. It won't help with internal structure problems (at least this is my high-level understanding; no insights there).

mk01 commented 10 years ago

by any chance can you delegate to someone somewhere? http://ivka57.dyndns-ip.com/images/xbian-image-imx6-20140718.img.xz and just to be sure

mk@debian:~/BUILD/xbian-build-img$ md5sum build/imx6/xbian-image-imx6-20140718.img.xz
40485f7d343433a1a20786095b4766df  build/imx6/xbian-image-imx6-20140718.img.xz

RPI images I consider as ok

CurlyMoo commented 10 years ago

> and just to be sure

Already mentioned by @Iridium that it's missing uboot and dtb files.

I've managed to test the btrfs check. Nothing useful showed up.

I've also tried to fix the inode errors, but as soon as the last inodes were fixed the first reappeared again as errors. Couldn't this be a failing BTRFS kernel module?

CurlyMoo commented 10 years ago

I've tried installing the image to a usb stick and the problem persists.

mk01 commented 10 years ago

@CurlyMoo

it is not missing dtb files or uboot.

And of course everything is possible (also a BAD btrfs driver), but the last change to it I put into the code back in March, with no changes since then. Also, this is my FS, running since Feb.

root@cubox:/mnt/src# btrfs device stats /
[/dev/sda2].write_io_errs   0
[/dev/sda2].read_io_errs    0
[/dev/sda2].flush_io_errs   0
[/dev/sda2].corruption_errs 0
[/dev/sda2].generation_errs 0
[/dev/sdb2].write_io_errs   0
[/dev/sdb2].read_io_errs    0
[/dev/sdb2].flush_io_errs   0
[/dev/sdb2].corruption_errs 0
[/dev/sdb2].generation_errs 0
root@cubox:/mnt/src# btrfs fi show /
Label: 'btrfs-test'  uuid: 6502b376-5797-469a-84ad-f3ed18d4036b
    Total devices 2 FS bytes used 29.49GiB
    devid    1 size 465.29GiB used 17.04GiB path /dev/sda2
    devid    2 size 465.29GiB used 17.03GiB path /dev/sdb2

Btrfs v3.14.1-2-g99f2c9b
root@cubox:/mnt/src#

mk01 commented 10 years ago

I tested copy to another USB disk/stick, to .img, and to an NFS-mounted destination.

btw: https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg31812.html

From a quick read it looks related to the kernel version that created the actual fs. Indeed, the build machine currently runs Debian kernel 3.14-1-amd64. (And the fix by bacik: https://git.kernel.org/cgit/linux/kernel/git/josef/btrfs-next.git/commit/?id=1334bebe71bebbca47b3b92f25511ea980fdeab8)

3.14.12 is already available. No idea if it includes the patch. That means you can install it, rebuild the imx6 img and report.

mk01 commented 10 years ago

@CurlyMoo

Looking at the patch, it is exactly the same problem we had on RPI kernel 3.10 back in Dec last year. It is part of the standard 3.10.30 tree.

Create a new subvolume and copy all files to it with cp or rsync. Then try to send | recv that new subvolume. It should work without issues.
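
A rough sketch of that workaround follows. `copy_and_send` and `DRY_RUN` are hypothetical names, all paths are supplied by the caller, and rsync is used here only because it is one of the two copy tools mentioned above. Whether this actually sidesteps the backref error on an affected kernel is an assumption based on the suggestion, not something verified here.

```shell
# Sketch only: copy a subvolume's files into a fresh subvolume and send that instead.
# copy_and_send and DRY_RUN are hypothetical; they are not part of xbian-copy.
copy_and_send() {
    src=$1; new=$2; dest=$3
    if [ "${DRY_RUN:-0}" = 1 ]; then
        run=echo        # print the commands instead of executing them
    else
        run=
    fi
    $run btrfs subvolume create "$new" || return 1
    # Copy file contents (with hard links, ACLs and xattrs) into the new subvolume.
    $run rsync -aHAX "$src/" "$new/" || return 1
    # btrfs send only accepts read-only subvolumes, so snapshot the copy read-only.
    $run btrfs subvolume snapshot -r "$new" "$new.ro" || return 1
    if [ -n "$run" ]; then
        echo "btrfs send $new.ro | btrfs receive $dest"
    else
        btrfs send "$new.ro" | btrfs receive "$dest"
    fi
}
```

With `DRY_RUN=1` the function just lists the four steps, which is handy for checking the paths before running it as root.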

mk01 commented 10 years ago

@CurlyMoo

Looks like the new Debian kernel is fixed. I tried btrfs send (btrfs send ./@ROsnap > /dev/null) on the latest .img generated and the whole partition was read successfully.

CurlyMoo commented 10 years ago

Do you mean an image with LTS kernel 3.14.14?

CurlyMoo commented 10 years ago

Tested it on the 18 aug image but it doesn't work.

CurlyMoo commented 10 years ago

Confirmed to work on kernel 3.14.14 with the BTRFS patches :smile:

mk01 commented 10 years ago

perfect, just told the same on other issue.

CurlyMoo commented 10 years ago

In the one from the 31st it didn't work anymore.

CurlyMoo commented 10 years ago

I finally found the issue of xbian-copy not working on the Hummingboard. The problem is that the required loop device isn't available immediately after it's created. This means that these commands fail:

    if [ "$(xbian-arch)" = iMX6 ]; then
        mkfs.ext2 -L xbianboot /dev/mapper/${loopd}p1
    else
        mkfs.msdos -F 16 -n xbianboot /dev/mapper/${loopd}p1
    fi

    mount /dev/mapper/${loopd}p1 /tmp/btrfs-dest || exit 5

Those are exactly the error messages we see.

- The "read error, sector 0, llseek error, llseek error" is about the mkfs.ext2 command failing.
- The "mount: block device /dev/mapper/p1 is write protected, mounting read-only" tells us that the loop device wasn't available yet.
- Because the mkfs command failed, the mount command also fails a second time with "mount: you must specify the filesystem type".
- Both failures stop the process early and therefore no /tmp/ folders are created.

The patch is simple and efficient: add a wait loop after the loop device creation:

    loopd=${loopd#/dev/}

    # Wait up to 10 seconds for the partition mapping to show up.
    timeout=0
    while ! [ -L "/dev/mapper/${loopd}p1" ]; do
        timeout=$((timeout+1))
        if [ "$timeout" -gt 10 ]; then
            echo "Loop device not available"
            exit 4
        fi
        sleep 1
    done
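
The same wait could also be written as a small reusable function, for instance if more than one mapping has to be waited for. This is just a sketch: `wait_for_link` is a hypothetical name, and returning 4 mirrors the exit code used in the patch above.

```shell
# Sketch only: wait until a symlink (e.g. /dev/mapper/loop1p1) exists.
# wait_for_link is a hypothetical helper; $2 is the timeout in seconds (default 10).
wait_for_link() {
    path=$1
    max=${2:-10}
    waited=0
    while ! [ -L "$path" ]; do
        if [ "$waited" -ge "$max" ]; then
            echo "Loop device not available: $path" >&2
            return 4
        fi
        sleep 1
        waited=$((waited + 1))
    done
}
```

In the script it could be called as `wait_for_link "/dev/mapper/${loopd}p1" 10 || exit 4` right before the mkfs step.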

CurlyMoo commented 10 years ago

Also, the latest HB kernel 3.14.14+ works perfectly with regard to BTRFS send/receive, even when a lot of disk writes are done, like an apt-get upgrade of everything (300+ packages). I hope someone can provide an updated kernel build in the XBian apt so I can create a new HB image.

And when switching between 3.10.51 (in which xbian-copy doesn't work due to BTRFS) and 3.14.14+

xbianonpi / xbian

[CuBox-I] xbian-copy doesn't work #566