e-kov closed 2 years ago
@anelson @vsazhenyuk-softheme since your last review:
elio-test.sh
when using external devices for the LVM/RAID scenarios. And a few following issues are fixed too.
arm64
due to several timeout errors (storage in EC2 is slow compared with physical servers).
Fixed kernel panic on snapshot destroy for a partition
The kernel panic happened when multiple snapshot devices existed for multiple partitions of the same disk. The problem affected kernels 5.9+.
On these kernels the driver replaces the disk's `block_device_operations` structure with its own struct, in which the driver's tracing function is installed in place of the original `submit_bio`. This struct belongs to the disk and is shared between the partitions of the disk. The issue was an access to freed memory after our tracing struct had been freed.
Now we do not allocate a new struct when setting up a snapshot for the 2nd+ partition of the disk, and the struct is freed only when the last snapshot for any partition of the disk has been destroyed.
The driver behaves the same way with the `make_request` function in the bio queue on kernel versions before 5.9. It is replaced with the tracing function on the first snapshot setup among the partitions of the disk, and the original `make_request` is set back when the last snapshot device has been destroyed. So this behaviour is now consistent for all Linux kernel versions.
Fixes https://github.com/elastio/elastio-snap/issues/155
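The lifecycle described above (allocate on the first snapshot, reuse for the 2nd+ partition, restore and free on the last destroy) can be modelled in user space. This is only an illustrative sketch, not the driver's code: `Disk`, `TracingOpsRegistry`, and all field names are hypothetical, and a plain dictionary stands in for the kernel structures.

```python
class Disk:
    """Hypothetical stand-in for a disk; `ops` models its block_device_operations."""

    def __init__(self, name):
        self.name = name
        self.original_ops = {"submit_bio": "original_submit_bio"}
        self.ops = self.original_ops


class TracingOpsRegistry:
    """Keeps one tracing ops object per disk and counts the snapshots using it."""

    def __init__(self):
        self._refs = {}  # disk name -> [tracing_ops, refcount]

    def setup_snapshot(self, disk):
        entry = self._refs.get(disk.name)
        if entry is None:
            # First snapshot on this disk: allocate the tracing ops and install it.
            tracing_ops = {"submit_bio": "tracing_fn"}
            self._refs[disk.name] = [tracing_ops, 1]
            disk.ops = tracing_ops
        else:
            # 2nd+ partition of the same disk: reuse the installed struct.
            entry[1] += 1

    def destroy_snapshot(self, disk):
        entry = self._refs[disk.name]
        entry[1] -= 1
        if entry[1] == 0:
            # Last snapshot on this disk: restore the original ops, then drop ours.
            disk.ops = disk.original_ops
            del self._refs[disk.name]


disk = Disk("loop0")
registry = TracingOpsRegistry()
registry.setup_snapshot(disk)        # snapshot for partition 1
registry.setup_snapshot(disk)        # snapshot for partition 2
shared_ops = disk.ops
registry.destroy_snapshot(disk)      # destroy the 2nd snapshot
still_traced = disk.ops is shared_ops            # partition 1 keeps the tracing ops
registry.destroy_snapshot(disk)      # destroy the last snapshot
restored = disk.ops is disk.original_ops         # original ops are back
```

In this model, destroying one of two snapshots leaves the shared object installed, which is the property whose absence led to the use-after-free in the real driver.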
Implemented new tests for snaps of partitions of the same disk
The idea is to add a test on a loopback device with 2 or more partitions. This test should reproduce bug #155 like this: 1) create snapshots for both partitions of the disk (loopback device); 2) destroy the 2nd snapshot device; 3) perform a write to the 2nd partition; 4) wait a bit.
Without the fix, these steps lead to a kernel panic.
Also added 2 other tests: a simple setup test and a test with writes to all partitions.
Fixed tests when they run on a machine with '/' on LVM
There is a test, `test_setup_2_volumes`, which uses one synthetic device, with the root volume of the host machine as the 2nd. This test was failing because LVM devices have multiple, different names, such as `/dev/mapper/ubuntu--vg-ubuntu--lv` and `/dev/dm-0`, which are the same device.
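The description does not show how the test compares device names, so the following is only one possible approach: resolve both paths before comparing them. It assumes the `/dev/mapper/...` entry is a symlink to the `/dev/dm-N` node, which is common but not guaranteed. `same_device` is a hypothetical helper, and the demo uses a temporary directory in place of real device nodes.

```python
import os
import tempfile


def same_device(path_a, path_b):
    """Return True if both paths resolve to the same filesystem entry."""
    return os.path.realpath(path_a) == os.path.realpath(path_b)


# Demo with stand-in files: "dm-0" plays the device node and
# "ubuntu--vg-ubuntu--lv" plays the /dev/mapper alias pointing at it.
with tempfile.TemporaryDirectory() as tmp:
    node = os.path.join(tmp, "dm-0")
    other = os.path.join(tmp, "dm-1")
    alias = os.path.join(tmp, "ubuntu--vg-ubuntu--lv")
    for path in (node, other):
        open(path, "w").close()
    os.symlink(node, alias)

    alias_matches = same_device(alias, node)    # alias resolves to dm-0
    other_matches = same_device(alias, other)   # dm-1 is a different entry
```

On systems where the mapper entry is a separate device node rather than a symlink, comparing `os.stat(path).st_rdev` for both paths would be a more robust check.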