jamesruic opened this issue 1 year ago
@105590023 Looks like the same as https://github.com/elastio/elastio-snap/issues/63. Is it reproducible without a reboot? Another direction: could you test rebooting with a shutdown script?
root@user-vm:/home/kgermanov# cat /lib/systemd/system-shutdown/umount_rootfs.shutdown
#!/bin/sh
sync
mount -o remount,ro /
umount /
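(For context: systemd-shutdown runs the executables in /lib/systemd/system-shutdown/ at the very end of shutdown, after all services have stopped, so this hook is roughly the last point at which the root FS can still be synced, remounted read-only and unmounted.)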
@105590023 There is another interesting case to check: does this issue with the Corruption warning in dmesg happen with an ext4 FS? If not, @kgermanov is right and it looks like #63.
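For reference, the ext4 check could look roughly like this, reusing the device and paths from the XFS steps (a sketch only; the exact sequence is an assumption, and the reload-snapshot would normally run from the initramfs, as in the /elastio-reload script used elsewhere in this thread):

mkfs.ext4 /dev/sdb1                                   # same disk, ext4 instead of XFS
mount /dev/sdb1 /data
elioctl setup-snapshot /dev/sdb1 /data/.snapshot0 0
reboot
# <after reboot, once the snapshot device has been reloaded>
mount /dev/elastio-snap0 /test/ && dmesg | tail       # does the mount succeed cleanly?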
@kgermanov Thank you for your reply. I used the shutdown script and still get the error.
[root@localhost ~]# cat /lib/systemd/system-shutdown/umount_rootfs.shutdown
#!/bin/sh
sync
mount -o remount,ro /
umount /
[root@localhost ~]# dmesg
[ 45.309376] loop: module loaded
[ 45.325215] XFS (loop0): Mounting V5 Filesystem
[ 45.336186] XFS (loop0): Corruption warning: Metadata has LSN (1:2704) ahead of current LSN (1:2679). Please unmount and run xfs_repair (>= v4.3) to resolve.
[ 45.336188] XFS (loop0): log mount/recovery failed: error -22
[ 45.336208] XFS (loop0): log mount failed
@e-kov Thank you for your reply. It works fine with an ext4 FS.
@jamesruic Could you retest with these steps?
[root@localhost ~]# cat /elastio-reload
#!/bin/sh
elioctl reload-snapshot /dev/sdb1 /.snapshot0 0
[root@localhost ~]# xfs_freeze -f /data
[root@localhost ~]# sync
[root@localhost ~]# elioctl setup-snapshot -c 10 -f 200 /dev/sdb1 /data/.snapshot0 0
[root@localhost ~]# xfs_freeze -u /data
[root@localhost ~]# mount /dev/elastio-snap0 /test/ && sleep 1 && umount /test
[root@localhost ~]# dmesg | grep elastio
[root@localhost ~]# systemctl start reboot.target
<after reboot>
[root@localhost ~]# mount /dev/elastio-snap0 /test/
[root@localhost ~]# dmesg
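(The idea behind the freeze here: xfs_freeze -f quiesces the filesystem and flushes the XFS log before the snapshot is taken, so the snapshot should not start out with a dirty log.)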
@kgermanov I'm afraid elioctl setup-snapshot will hang after xfs_freeze, because it can't allocate the CoW file on the frozen FS.
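To make the failure mode concrete: a frozen XFS blocks every new write until it is unfrozen, so anything that must create or extend a file on it, such as the CoW file allocation, simply hangs. A minimal illustration (the probe file name is made up):

xfs_freeze -f /data       # freeze: all new writes to /data now block
touch /data/probe &       # this write hangs in the background...
xfs_freeze -u /data       # ...and only completes once the FS is unfrozen
wait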
@e-kov Yes, you are right.
@e-kov Is incremental-after-reboot broken in general?
@anelson No, it's not broken in general; the issue is with XFS only. It's a manifestation of the problem with mounting and XFS log recovery described in #63: the snapshot ends up with metadata whose LSN is ahead of the captured log, so log recovery fails at mount time.
Discussed during planning; the scope is clear now.
This is technically a duplicate of #63; however, @e-kov has asked to keep this issue open separately, as it contains another useful scenario with which to validate a future fix for #63.
Is it possible to use register_reboot_notifier to do some processing on the block device before the system shuts down? Maybe register a notifier via the register_reboot_notifier() function at module init, and do something like transitioning to snapshot mode or freezing the device. I'm not sure if it will help.
https://elixir.bootlin.com/linux/v3.10/source/kernel/sys.c#L344
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/notifier.h>
#include <linux/reboot.h>

static int shutdown_notification(struct notifier_block *nb,
                                 unsigned long action, void *unused)
{
        /* do something: e.g. transition tracked devices to snapshot
         * mode or freeze them before the system goes down */
        return NOTIFY_DONE;
}

static struct notifier_block reboot_notifier = {
        .notifier_call = shutdown_notification,
        .priority = INT_MAX, /* run before other reboot notifiers */
};

static int __init example_init(void)
{
        ...
        register_reboot_notifier(&reboot_notifier);
        ...
}

static void __exit example_exit(void)
{
        ...
        unregister_reboot_notifier(&reboot_notifier);
        ...
}

module_init(example_init);
module_exit(example_exit);
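One caveat (my understanding of the reboot path, not something verified against this module): reboot notifiers are invoked from inside the reboot(2) syscall, i.e. only after systemd has already stopped services and remounted the root filesystem read-only, so the hook fires quite late. Whether that is still early enough to put the device into a consistent state is exactly what would need testing.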
Hi, I'm testing the elioctl reload-incremental and reload-snapshot commands with the latest code, and found a problem: after elioctl reload-incremental or reload-snapshot, updating the changes to an image produces an image that fails to mount. My VM is CentOS 7.9 on VMware.
Here are my test steps:
type getarg >/dev/null 2>&1 || . /lib/dracut-lib.sh

modprobe elastio-snap

[ -z "$root" ] && root=$(getarg root=)
[ -z "$rootfstype" ] && rootfstype=$(getarg rootfstype=)

rbd="${root#block:}"
if [ -n "$rbd" ]; then
    case "$rbd" in
        LABEL=*)
            rbd="$(echo $rbd | sed 's,/,\\x2f,g')"
            rbd="/dev/disk/by-label/${rbd#LABEL=}"
            ;;
        UUID=*)
            rbd="/dev/disk/by-uuid/${rbd#UUID=}"
            ;;
        PARTLABEL=*)
            rbd="/dev/disk/by-partlabel/${rbd#PARTLABEL=}"
            ;;
        PARTUUID=*)
            rbd="/dev/disk/by-partuuid/${rbd#PARTUUID=}"
            ;;
    esac
fi
[root@localhost ~]# cat /elastio-reload
#!/bin/sh
modprobe elastio-snap -d /etc/elastio/dla/mnt
elioctl reload-incremental /dev/sdb1 /.snapshot0 0
[root@localhost ~]# elioctl setup-snapshot /dev/sdb1 /data/.snapshot0 0
[root@localhost ~]# cat /proc/elastio-snap-info
{
    "version": "0.11.0",
    "devices": [
        {
            "minor": 0,
            "cow_file": "/.snapshot0",
            "block_device": "/dev/sdb1",
            "max_cache": 314572800,
            "fallocate": 213909504,
            "seq_id": 1,
            "uuid": "ae776b8c35124ea4b9eeeb8cebbb8034",
            "version": 1,
            "nr_changed_blocks": 0,
            "state": 3
        }
    ]
}
[root@localhost ~]# dd if=/dev/elastio-snap0 of=/mnt/mydisk bs=4M
511+1 records in
511+1 records out
2145386496 bytes (2.1 GB) copied, 4.23293 s, 507 MB/s
[root@localhost ~]# elioctl transition-to-incremental 0
[root@localhost ~]# cat /proc/elastio-snap-info
{
    "version": "0.11.0",
    "devices": [
        {
            "minor": 0,
            "cow_file": "/.snapshot0",
            "block_device": "/dev/sdb1",
            "max_cache": 314572800,
            "fallocate": 213909504,
            "seq_id": 1,
            "uuid": "ae776b8c35124ea4b9eeeb8cebbb8034",
            "version": 1,
            "nr_changed_blocks": 9,
            "state": 2
        }
    ]
}
[root@localhost ~]# reboot
[root@localhost ~]# cat /proc/elastio-snap-info
{
    "version": "0.11.0",
    "devices": [
        {
            "minor": 0,
            "cow_file": "/.snapshot0",
            "block_device": "/dev/sdb1",
            "max_cache": 314572800,
            "fallocate": 213909504,
            "seq_id": 1,
            "uuid": "ae776b8c35124ea4b9eeeb8cebbb8034",
            "version": 1,
            "nr_changed_blocks": 9,
            "state": 2
        }
    ]
}
[root@localhost ~]# touch /data/tempfile2
[root@localhost ~]# ls -la /data/
total 4104
drwxr-xr-x.  2 root root      57 Nov 16 14:40 .
dr-xr-xr-x. 18 root root     277 Nov 16 14:34 ..
----------.  1 root root 4198400 Nov 16 14:38 .snapshot0
-rw-r--r--.  1 root root       6 Nov 16 14:25 tempfile
-rw-r--r--.  1 root root       0 Nov 16 14:40 tempfile2
[root@localhost ~]# elioctl transition-to-snapshot /.snapshot1 0
[root@localhost ~]# update-img /dev/elastio-snap0 /data/.snapshot0 /mnt/mydisk
snapshot is 523776 blocks large
copying blocks
copying complete: 13 blocks changed, 0 errors
[root@localhost2 ~]# mount /mnt/mydisk /test/
mount: wrong fs type, bad option, bad superblock on /dev/loop0,
       missing codepage or helper program, or other error
[root@localhost2 ~]# dmesg
[   32.873553] XFS (loop0): Mounting V5 Filesystem
[   32.884892] XFS (loop0): Corruption warning: Metadata has LSN (1:2303) ahead of current LSN (1:2271). Please unmount and run xfs_repair (>= v4.3) to resolve.
[   32.884894] XFS (loop0): log mount/recovery failed: error -22
[   32.884918] XFS (loop0): log mount failed
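For anyone hitting this while testing: the copied image can typically be made mountable again the way the kernel message suggests, at the cost of throwing away the dirty log (which is exactly the data the snapshot failed to capture consistently):

xfs_repair -f -L /mnt/mydisk   # -f: target is an image file; -L: zero the unreplayable log
mount /mnt/mydisk /test/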