stec-inc / EnhanceIO

EnhanceIO Open Source for Linux

File corruption on reboot – late udev aggregation? #75

Open hryamzik opened 10 years ago

hryamzik commented 10 years ago

Tried on both Ubuntu 14.04 and 12.04.

Here's the install log:

sudo aptitude install build-essential checkinstall git dkms
git clone https://github.com/stec-inc/EnhanceIO.git
cd EnhanceIO
sudo cp -v CLI/eio_cli /sbin/
sudo chmod 700 CLI/eio_cli
sudo cp -v CLI/eio_cli.8 /usr/share/man/man8
cd Driver/enhanceio/
version=$(git log --first-parent master --oneline|wc -l)
echo $version

sudo checkinstall -D --pkgname enhanceio --pakdir ~/ --pkgversion $version -y

For 12.04 I had to load the modules manually:

sudo modprobe -v enhanceio
sudo modprobe -v enhanceio_fifo
sudo modprobe -v enhanceio_lru
grep enhanceio /etc/modules || sudo tee -a /etc/modules <<EOF
enhanceio
enhanceio_fifo
enhanceio_lru
EOF
sudo cp -vr enhanceio /usr/src
sudo mv -v /usr/src/enhanceio /usr/src/enhanceio-$version
sudo dkms add     -m enhanceio -v $version
sudo dkms build   -m enhanceio -v $version
sudo dkms install -m enhanceio -v $version
hdd=/dev/sdb
ssd=/dev/sdc
sudo eio_cli create -d $hdd -s $ssd -m wb -c enhanceio0

And here's the test:

dd if=/dev/urandom bs=1024 count=$((1024*512)) | pv -perbts 512M >~/removeme
md5sum ~/removeme |tee test.txt
sudo touch /storage/removeme && sudo chmod 777 /storage/removeme
dd if=~/removeme | pv -perbts 512M >      /storage/removeme && sudo reboot

After reboot I check the md5:

md5sum /storage/removeme |tee test2.txt && diff -u test.txt test2.txt

And it turns out that the files differ.
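
The before/after comparison above can be sketched as two small helpers. This is only an illustration: `save_md5` and `check_md5` are hypothetical names, not part of EnhanceIO. Only the bare hash is stored, so the copy can live at a different path than the original.

```shell
#!/bin/sh
# Hypothetical helpers, not part of EnhanceIO.

# save_md5 FILE SUMFILE: store only the MD5 hash of FILE in SUMFILE.
save_md5() {
    md5sum < "$1" | cut -d' ' -f1 > "$2"
}

# check_md5 FILE SUMFILE: succeed only if FILE still has the stored hash.
check_md5() {
    [ "$(md5sum < "$1" | cut -d' ' -f1)" = "$(cat "$2")" ]
}
```

Before the reboot one would run `save_md5 ~/removeme ~/removeme.md5`, and afterwards `check_md5 /storage/removeme ~/removeme.md5 || echo "files differ"`.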

I've also tried converting the udev rules to a shell script and running it from an init script:

sudo tee /etc/init/enhanceio.conf <<EOF
# enhanceio - enhanceio enabler
#

description     "Reenable writeback cached drives"

start on starting mountall

task

script
    /opt/eio.sh
end script
EOF

tee /opt/eio.sh <<EOF
#!/bin/sh -x
hdd=sda2
ssd=sdb

ssd_name=\$ssd
disk_name=\$hdd

kernel=\$ssd_name
/bin/mkdir -p /dev/enhanceio/enhanceio0
echo \$kernel > /dev/enhanceio/enhanceio0/.ssd_name
/sbin/eio_cli notify -a add -s /dev/\$kernel -c enhanceio0
# We just found the source device and the cache already exists then we can setup
kernel=\$disk_name
echo \$kernel > /dev/enhanceio/enhanceio0/.disk_name
magorminor=\$(udevadm info --device-id-of-file /dev/\$hdd|tr ':' ' ')
echo \$magorminor > /dev/enhanceio/enhanceio0/.disk_mjr_mnr
echo \$links > /dev/enhanceio/enhanceio0/.srclinks
/sbin/eio_cli notify -a add -d /dev/\$kernel -c enhanceio0
if test ! -e /proc/enhanceio/enhanceio0
    then
    if test ! -e /dev/enhanceio/enhanceio0/.skip_enable
        then
        /bin/mknod /dev/\$disk_name b \$magorminor
    fi
fi
if test ! -e /proc/enhanceio/enhanceio0
    then
    for i in \`cat /dev/enhanceio/enhanceio0/.srclinks\`; do rm -f /dev/\$i; ln -f -s /dev/\$disk_name /dev/\$i; done
fi
if test ! -e /proc/enhanceio/enhanceio0
    then
    /sbin/eio_cli enable -d /dev/\$disk_name -s /dev/\$ssd_name -m wt -b 4096 -p lru -c enhanceio0
fi

eio_cli info
ls /dev/enhanceio/enhanceio0
EOF
chmod +x /opt/eio.sh

No luck. And it's not even the root device! Very confusing.

bhansaliakhil commented 10 years ago

Hi,

1. From the details you have provided, it is not clear whether your cache was created after the reboot.

Since you said earlier that you had to load the modules manually, could you please verify, before running md5sum after the reboot, that the enhanceio modules were loaded and the cache was created successfully?

2. I am assuming the /storage/removeme file is on the /dev/sdb device that you are caching. Could you please share your /etc/fstab so we can check the file system details of your host?

I hope this will give some clue to debug it further. Thanks.

hryamzik commented 10 years ago

Sure, here's the lsmod output:

# lsmod                 |grep -i enhanceio
enhanceio_lru          12922  1 
enhanceio_fifo         12809  0 
enhanceio             162345  2 enhanceio_lru,enhanceio_fifo

And the fstab content:

# cat /etc/fstab
# /etc/fstab: static file system information.
#
# Use 'blkid' to print the universally unique identifier for a
# device; this may be used with UUID= as a more robust way to name devices
# that works even if disks are added and removed. See fstab(5).
#
# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    nodev,noexec,nosuid 0       0
# / was on /dev/sda1 during installation
UUID=ed655083-cf57-495f-a071-85e7e9ad2430 /               ext4    errors=remount-ro 0       1
# swap was on /dev/sda5 during installation
UUID=3be0c2bd-940c-4ee9-8e67-1fd270d5a41a none            swap    sw              0       0
/dev/vg_big/storage  /storage  ext4  defaults 0 0

The modules are loaded because I manually added them to /etc/modules; that code is listed in the initial issue post.

PS: I've mostly played with 14.04, but this server is in production now. I've found that the udev rules are loaded well after the filesystems are mounted.

bhansaliakhil commented 10 years ago

Hey,

Could you please also give me the output of the "eio_cli info" command? Even if the modules are loaded, that does not mean the cache has been created and is active.
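
The two conditions (modules loaded, cache present) can be checked separately. A minimal sketch, assuming that an active cache shows up as an entry under /proc/enhanceio as the scripts in this thread test for; `modules_loaded` and `cache_active` are hypothetical helper names, and the optional second argument of `cache_active` exists only to make the check testable.

```shell
#!/bin/sh
# Hypothetical helpers, not part of eio_cli.

# modules_loaded MOD...: read lsmod-style text on stdin and succeed only
# if every named module appears in the first column.
modules_loaded() {
    listing=$(cat)
    for mod in "$@"; do
        printf '%s\n' "$listing" |
            awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' ||
            return 1
    done
}

# cache_active NAME [PROC_ROOT]: succeed if an entry for the cache exists
# under PROC_ROOT (assumed default: /proc/enhanceio).
cache_active() {
    [ -e "${2:-/proc/enhanceio}/$1" ]
}
```

For example: `lsmod | modules_loaded enhanceio enhanceio_lru && cache_active enhanceio0 && echo "cache is up"`.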

hryamzik commented 10 years ago

Absolutely.

root@enhanceio:~# eio_cli info
Cache Name       : enhanceio0
Source Device    : /dev/sdb
SSD Device       : /dev/sdc
Policy           : lru
Mode             : Write Back
Block Size       : 4096
Associativity    : 256
State            : normal

For more information look at /proc/enhanceio/<cache_name>/config

In fact, I am testing this in a VM that runs from an external USB HDD, with a disk attached from my internal SSD. It seems to be easy to reproduce.

bhansaliakhil commented 10 years ago

Looks interesting! No clue at this point. The best next step for me would be to reproduce it in house. Let me try and get back to you on this. Thanks for reporting!

Have a happy weekend! :)

hryamzik commented 10 years ago

Thanks. Here're the commands I've used to configure the bare installation after attaching two disks (HDD and SSD):

sudo aptitude install lvm2
hdd=/dev/sdb
ssd=/dev/sdc

sudo vgcreate vg_big $hdd
sudo lvcreate -v -L2G -nstorage vg_big
sudo mkfs.ext4 /dev/vg_big/storage

sudo mkdir -vp /storage
grep "/dev/vg_big/storage" /etc/fstab || sudo tee -a /etc/fstab <<EOF
/dev/vg_big/storage  /storage  ext4  defaults 0 0
EOF
sudo mount -a

ksperis commented 10 years ago

Hi hryamzik, if the problem is that the cache was not active before the mount, I proposed a patch some time ago that adds a symlink in the udev rules (https://github.com/ksperis/EnhanceIO/commit/954e167fdb580d514747512ce2bd1c9c29a77418). This makes sure that the cache (especially in wb mode) has been loaded before the fstab mount, i.e. with something like this in fstab:

/dev/eio-enhanceio0 /storage ext4 defaults 0 0
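
To confirm which device a mount point is configured to use, a small lookup over fstab is enough. A minimal sketch; `fstab_device` is a hypothetical helper, and /dev/eio-enhanceio0 is simply the symlink name from the fstab entry suggested above.

```shell
#!/bin/sh
# Hypothetical helper: print the device that fstab assigns to a mount point.
# fstab_device MOUNTPOINT [FSTAB]   (FSTAB defaults to /etc/fstab)
fstab_device() {
    awk -v mp="$1" '
        /^[[:space:]]*#/ { next }          # skip comment lines
        $2 == mp { print $1; exit }        # first matching entry wins
    ' "${2:-/etc/fstab}"
}
```

For example, `[ "$(fstab_device /storage)" = /dev/eio-enhanceio0 ] || echo "/storage is not mounted through the cache symlink"`.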

hryamzik commented 10 years ago

I'll give it a try this week and will let you know. Thanks! UPD: can't check right now, blocked by #80.

elmystico commented 8 years ago

I've just fixed this in eio_cli with some Python code plus udev and systemd fixes. I'll send patches soon.