Closed e-kov closed 1 year ago
Looks like as race condition between update inode and write to inode. Tests are not failed, because this is as warning Also sometimes there are error on writes (if exists writes at transitional moment):
root@user-vm:/home/kgermanov/work# git log -n 1
commit b42b83074fc79a1ab15f2c725192b4581a7edc7e (HEAD -> elastio_master, elastio/master)
root@user-vm:/home/kgermanov/work# uname -a
Linux user-vm 5.15.0-56-generic #62~20.04.1-Ubuntu SMP Tue Nov 22 21:24:20 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
root@user-vm:/home/kgermanov/work# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.1 LTS
Release: 20.04
Codename: focal
root@user-vm:/home/kgermanov/work# mount -l | grep sda5
/dev/sda5 on / type ext4 (rw,relatime,errors=remount-ro)
root@user-vm:/home/kgermanov/work# elioctl setup-snapshot -c 300 -f 204 /dev/sda5 /.eliosnap/test_snap1 3
root@user-vm:/home/kgermanov/work# ls -la /.eliosnap/
---------- 1 root root 213909504 Jan 12 14:09 test_snap1
root@user-vm:/home/kgermanov/work/agent/bdsnap# elioctl transition-to-incremental 3
root@user-vm:/home/kgermanov/work/agent/bdsnap# ls -la /.eliosnap/
---------- 1 root root 145756160 Jan 12 14:09 test_snap1
root@user-vm:/home/kgermanov/work# cat /proc/elastio-snap-info
{
"version": "0.12.0",
"devices": [
{
"minor": 3,
"cow_file": "/.eliosnap/test_snap1",
"block_device": "/dev/sda5",
"max_cache": 300,
"fallocate": 213909504,
"seq_id": 1,
"uuid": "7ecc46bd131f4982a8185d330efe5e67",
"version": 1,
"nr_changed_blocks": 40,
"error": -1, //!!!!!
"state": 2,
"ignore_errors": 0
}
]
}
root@user-vm:/home/kgermanov/work# dmesg -c
[2514916.618240] elastio-snap: ioctl command received: 1074020613
[2514916.618244] elastio-snap: received transition inc ioctl - 3
[2514916.618247] elastio-snap: allocating device struct
[2514916.618248] elastio-snap: initializing tracer
[2514916.618249] elastio-snap: creating kernel cow thread
[2514916.618390] elastio-snap: getting the base block device's make_request_fn
[2514916.618393] elastio-snap: original mrf is already replaced with the tracing_mrf = 00000000a7dc8bde
[2514916.618397] elastio-snap: found already traced device sda5 with the same original bd_ops. orig mrf = 00000000d8fb8b26; orig ops = 00000000a8d9c0f5; tracing ops = 00000000c19dbeaa
[2514916.618401] elastio-snap: using already existing tracing ops for device with minor 3
[2514916.618407] elastio-snap: freezing 'sda5'
[2514916.874849] elastio-snap: starting tracing
[2514916.874853] elastio-snap: thawing 'sda5'
[2514916.876026] elastio-snap: stopping cow thread
[2514916.878064] elastio-snap: error performing truncation: -1
[2514916.878063] elastio-snap: error performing file 'write': 4096, 32768: -1
[2514916.878071] elastio-snap: error writing cow manager section to file: -1
[2514916.878073] elastio-snap: error writing cow manager section 0 to file: -1
[2514916.878076] elastio-snap: error cleaning cow manager mappings: -1
[2514916.878078] elastio-snap: error writing cow mapping: -1
[2514916.878080] elastio-snap: error handling sset: -1
[2514916.878082] elastio-snap: error handling sector set in kernel thread: -1
[2514916.878094] elastio-snap: error truncating file: -1
[2514916.878105] elastio-snap: warning: failed to truncate '/.eliosnap/test_snap1', incremental will use more disk space than needed
[2514916.878109] elastio-snap: stopping mrf thread
[2514916.878164] elastio-snap: freeing gendisk
[2514916.897196] elastio-snap: freeing request queue
[2514916.897358] elastio-snap: minor range = 3 - 3
Interesting, that truncate realy executed, but error return in notify_change I tried wrap file_unlock/file_lock this place: https://github.com/elastio/elastio-snap/blob/master/src/elastio-snap.c#L1827 But without success
There is an issue with the CoW file truncation after the call
transition to incremental
:It was reproduced by the tests execution
sudo ./elio-test.sh
in the PopOS VM.Tests output doesn't contain any error:
dmesg.log
And there are no such errors in dmesg if all occurrences of
file_lock(filp);
,file_unlock(filp);
,file_unlock_mark_dirty(filp);
are commented insrc/elastio-snap.c
.The issue have seen just on PopOS and I'm not sure, will we see it on other distros. Thus it's not a high priority until it's seen at least on one other distro.