Open CoRpO opened 11 years ago
Seems like the cache was created in read-only mode and edited to write-back mode, followed by a reboot. Since the udev rule file was not updated, upon reboot EnhanceIO would attempt to recreate the cache in read-only mode and thus fail.
The old udev script did not update the udev rule file on cache edit. This was a bug in the udev script; it has been fixed in 6b09be22bc42722a34193041c447e169382f08a8.
Okay for the udev part, but what about fsck kicking in before eio starts the caching device?
This is a bug. If the source device comes up and the SSD has not yet come up, we set the source device to read-only to prevent corruption of the source device. We need to reset the mode to read-write once the SSD has come up (this is missing).
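To make the missing step concrete, here is a minimal sketch of it in Python. It is not the EnhanceIO fix itself. The device paths and function names are made up for illustration; only the `blockdev --setro` / `--setrw` flags are real. It assumes that "the SSD has come up" can be approximated by the SSD device node existing.

```python
import os
import subprocess


def blockdev_command(device, read_only):
    """Build the blockdev invocation that marks a device read-only or read-write."""
    flag = "--setro" if read_only else "--setrw"
    return ["blockdev", flag, device]


def restore_read_write(source_dev, ssd_dev,
                       exists=os.path.exists, run=subprocess.check_call):
    """Switch the source device back to read-write once the SSD is present.

    Returns True if the command was issued, False if the SSD is still missing.
    `exists` and `run` are injectable so the logic can be tried without root.
    """
    if not exists(ssd_dev):
        # SSD not up yet: leave the source read-only to protect it.
        return False
    run(blockdev_command(source_dev, read_only=False))
    return True
```

In a real boot path this would presumably run only after the cache has been re-created on top of the source device, e.g. `restore_read_write("/dev/sdb1", "/dev/sdc")` with hypothetical device names; otherwise the source would be writable with dirty data still sitting on the SSD.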
The scenario in your log seems to be:
1) The source device comes up ---> blockdev --setro changes it to read-only mode.
2) The device is mounted (fsck happens since the cache had dirty data).
3) The SSD comes up and cache create is fired.
As you said, we need to add the noauto option in fstab to prevent auto-mount of our source device before the cache has come up.
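To check whether an fstab entry is exposed to this, something like the following Python sketch could be used. It is not part of EnhanceIO, and the function names and sample entries are made up for illustration. It relies on two documented fstab(5) facts: the fourth field holds the mount options (where `noauto` goes), and a sixth field (fs_passno) of 0 or absent means fsck does not check the filesystem.

```python
def parse_fstab(text):
    """Yield (device, mount_point, options, passno) for each fstab entry."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed line, skip it
        device, mount_point, _fstype, opts = fields[:4]
        # The sixth field (fs_passno) counts as 0 when absent.
        passno = int(fields[5]) if len(fields) > 5 else 0
        yield device, mount_point, opts.split(","), passno


def boot_risks(text, mount_point):
    """List reasons why mount_point may be touched at boot before the cache is up."""
    risks = []
    for _dev, mnt, options, passno in parse_fstab(text):
        if mnt != mount_point:
            continue
        if "noauto" not in options:
            risks.append("auto-mounted at boot")
        if passno != 0:
            risks.append("fsck at boot (fs_passno=%d)" % passno)
    return risks
```

For example, `boot_risks(open("/etc/fstab").read(), "/home")` returns an empty list only when the /home entry carries `noauto` and has fs_passno 0. With `noauto` set, something else (an init script or hook) has to mount the filesystem after the cache is enabled.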
Thanks
One workaround for me is to use a separate /boot partition and write init scripts in the initrd that enable the cache (eio_cli enable) before the device gets mounted. I have my / (root) partition in write-back mode, and this works for me (/boot and / partitions formatted with ext4).
In my particular case (Arch Linux) I created a new enhanceio hook for the initrd that enables the cache after the block devices are detected BUT before the partitions get mounted. I use fsck in my initrd too. The hook is very heavy because I need to put the whole Python interpreter in the initrd along with the blockdev tool, but it works.
There's got to be a safeguard against filesystem corruption. Blaming udev for that is lame. Using a software package should not be like navigating a minefield.
A couple of things can be done:
I will try to see if I can add some code to help with this one of these days.
Hi,
While filling the server in write-back mode and serving a few files, a kernel panic occurred. I don't know if it is related to eio or not (no logs), but the recovery after reboot failed.
The filesystem (on /home) was auto-checked before eio cache was up, leading to a corrupted FS and unmountable /home. Worse, the cache didn't come up as it tried to mount as readonly but there was dirty data.
Re-enabled the cache and mounted the FS to allow log replay.
After xfs_repair, several files were damaged. Flushed the cache (put in read-only mode, flushing 300 GB of dirty data) and disabled the write barrier mount option of /home.
EnhanceIO should bring cache up before fsck can happen. Or maybe it can be circumvented with a fstab option which should be documented.
Thanks