lanconnected / EnhanceIO

EnhanceIO Open Source for Linux

EnhanceIO vs. zram #14

Closed marcin-github closed 7 years ago

marcin-github commented 7 years ago

I'm trying to use enhanceio with zram. It looks like enhanceio can't write to the zram block device. dmesg shows the following:

Mar 29 09:03:05 localhost kernel: [ 107.807802] zram: Added device: zram0
Mar 29 09:03:31 localhost kernel: [ 133.503113] zram0: detected capacity change from 0 to 300003328
Mar 29 09:03:53 localhost kernel: [ 155.963158] register_policy: policy 1 added
Mar 29 09:03:53 localhost kernel: [ 155.968492] register_policy: policy 2 added
Mar 29 09:03:53 localhost kernel: [ 155.973663] register_policy: policy 3 added
Mar 29 09:03:53 localhost kernel: [ 156.002035] enhanceio: Setting mode to write through
Mar 29 09:03:53 localhost kernel: [ 156.002039] get_policy: policy 2 found
Mar 29 09:03:53 localhost kernel: [ 156.002042] enhanceio_lru: eio_lru_instance_init: created new instance of LRU
Mar 29 09:03:53 localhost kernel: [ 156.002044] enhanceio: Setting replacement policy to lru (2)
Mar 29 09:03:53 localhost kernel: [ 156.002058] Not enough sets to use small metadata
Mar 29 09:03:53 localhost kernel: [ 156.002061] enhanceio: Allocate 4432KB (8B per) mem for 567296-entry cache (capacity:285MB, associativity:256, block size:512 bytes)
Mar 29 09:03:53 localhost kernel: [ 156.015349] enhanceio_lru: Initialized 2216 sets in LRU
Mar 29 09:04:15 localhost kernel: [ 177.677515] io_callback: io error -5 block 0 action 5
Mar 29 09:04:15 localhost kernel: [ 177.677520] io_callback: io error -5 block 1 action 5
[flood of io_callback messages]
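The numbers in the "Allocate" line and the "Initialized ... sets" line are consistent with each other, which can be checked with a little arithmetic. The sketch below is only an illustration: the function name is made up, and the inputs (567296 entries, associativity 256, 8 bytes per entry) are copied from the log above.

```python
def cache_geometry(entries: int, associativity: int, bytes_per_entry: int) -> dict:
    """Derive the set count and metadata size from the figures in the log line."""
    return {
        "sets": entries // associativity,                   # entries grouped into sets
        "metadata_kb": entries * bytes_per_entry // 1024,   # in-memory metadata, in KB
    }


geometry = cache_geometry(entries=567296, associativity=256, bytes_per_entry=8)
print(geometry)
```

Both derived values match what the driver logged (2216 sets, 4432KB).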

I suspect this is somehow related to block size: enhanceio logged "block size:512 bytes", but zram uses 4k blocks:

# blockdev --report /dev/zram0
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256  4096  4096          0       300003328   /dev/zram0
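The suspicion can be phrased as a simple check: a cache block should cover a whole number of the cache device's logical sectors. The following is a minimal sketch under that assumption; the helper names are hypothetical, and whether this mismatch is the real cause of the -5 errors is the reporter's guess, not something the thread confirms at this point.

```python
def parse_blockdev_report(text: str) -> dict:
    """Turn the two-line `blockdev --report` output into a dict keyed by column name."""
    header, row = [line.split() for line in text.strip().splitlines()[:2]]
    return dict(zip(header, row))


def block_size_fits(cache_block_size: int, logical_sector_size: int) -> bool:
    """True if every cache block covers a whole number of device sectors."""
    return cache_block_size % logical_sector_size == 0


REPORT = """\
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw   256  4096  4096          0       300003328   /dev/zram0
"""

report = parse_blockdev_report(REPORT)
ssz = int(report["SSZ"])  # logical sector size of the zram device

fits_512 = block_size_fits(512, ssz)    # block size enhanceio logged
fits_4096 = block_size_fits(4096, ssz)  # block size used later in the thread
print(fits_512, fits_4096)
```

A 512-byte cache block is smaller than one 4096-byte zram sector, so the check fails for 512 and passes for 4096.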

lanconnected commented 7 years ago

Hi! Could you please provide your kernel version and exact steps leading to the problem (zram, cache creation etc.)? Which enhanceio version (commit hash) are you testing? Thanks.

marcin-github commented 7 years ago

Hi, I'm surprised that you are willing to look into this problem :) I'm using kernel 4.9. Steps are:

The udev rule isn't created, which is good in this case. Next I generate some traffic on the filesystem. After this I get:

# cat /proc/enhanceio/portage_zram/errors
disk_read_errors 0
disk_write_errors 0
ssd_read_errors 0
ssd_write_errors 5304
memory_alloc_errors 0
no_cache_dev 0
no_source_dev 0
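When watching these counters repeatedly, it can help to pick out only the non-zero ones. This is a small sketch, assuming the file is a whitespace-separated sequence of name/value pairs as in the output above; the function name is made up, and the sample text is pasted from this comment rather than read from /proc.

```python
def parse_counters(text: str) -> dict:
    """Parse whitespace-separated `name value` pairs into a dict of ints."""
    tokens = text.split()
    return {name: int(value) for name, value in zip(tokens[0::2], tokens[1::2])}


SAMPLE = """\
disk_read_errors 0
disk_write_errors 0
ssd_read_errors 0
ssd_write_errors 5304
memory_alloc_errors 0
no_cache_dev 0
no_source_dev 0
"""

errors = parse_counters(SAMPLE)
nonzero = {name: count for name, count in errors.items() if count}
print(nonzero)
```

Here only ssd_write_errors is non-zero, which points at writes to the cache device (zram) rather than at the source disk.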

The git tree is at:

commit 02410b150b1f51287e96db2a5d9f723c79473ec2
Author: Evzen Demcenko <demcenko@cldn.eu>
Date: Wed Mar 8 16:10:00 2017 +0100

    Code cleanup

lanconnected commented 7 years ago

Initial support is added in latest commits, please give it a try.

marcin-github commented 7 years ago

Thank you for your effort. Should I try on 4.9 or 4.10? On 4.10 I've got:

# ./eio_cli create -d /dev/sdd1 -s /dev/zram0 -c sdd1_zram
Cache Name : sdd1_zram
Source Device : /dev/sdd1
SSD Device : /dev/zram0
Policy : lru
Mode : Write Through
Block Size : 4096
Associativity : 256
Cache creation failed (dmesg can provide you more info)

[ 3531.490055] enhanceio: Setting mode to write through
[ 3531.490059] get_policy: policy 2 found
[ 3531.490108] enhanceio_lru: eio_lru_instance_init: created new instance of LRU
[ 3531.490135] enhanceio: Setting replacement policy to lru (2)
[ 3531.490183] enhanceio: Cache creation failed: Invalid cache size or can't be fetched.

lanconnected commented 7 years ago

Well, works for me on 4.9 and 4.11:

modprobe zram
echo 1000000000 > /sys/block/zram0/disksize
eio_cli create -d /dev/mapper/VG0-test1 -s /dev/zram0 -c cache0
Cache Name : cache0
Source Device : /dev/mapper/VG0-test1
SSD Device : /dev/zram0
Policy : lru
Mode : Write Through
Block Size : 4096
Associativity : 256
ENV{DM_UUID}=="LVM-7KudBdhon0FJoRngZKFw0Qyh7E31QGhAXmAcZUML6K9uhomVX7G2ilcTLHPAhu0Y", ENV{DEVTYPE}=="disk"
None
Creation of udev rules file failed

expected a string or other character buffer object
Cache created successfully

Could you please provide the exact commands leading to the problem? The exact kernel version is welcome, too. Also, make sure you are using the correct eio_cli. Thanks.

marcin-github commented 7 years ago

Argh. I probably forgot to set the zram disksize :|
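A forgotten disksize is easy to catch before calling eio_cli. The sketch below reads the same sysfs file that the maintainer's `echo ... > /sys/block/zram0/disksize` command writes. The helper names are hypothetical, and it assumes an unconfigured device reads back as 0. The demonstration runs against a temporary directory that mimics the sysfs layout, so no real zram device is needed.

```python
import tempfile
from pathlib import Path


def zram_disksize(device: str = "zram0", sysfs_root: str = "/sys/block") -> int:
    """Return the configured disksize in bytes (assumed to read 0 when unset)."""
    return int((Path(sysfs_root) / device / "disksize").read_text().strip())


def ready_for_cache(device: str = "zram0", sysfs_root: str = "/sys/block") -> bool:
    """True only if the device exists and has a non-zero disksize."""
    try:
        return zram_disksize(device, sysfs_root) > 0
    except (OSError, ValueError):
        return False


# Demonstration against a throwaway directory standing in for /sys/block.
with tempfile.TemporaryDirectory() as fake_sysfs:
    node = Path(fake_sysfs) / "zram0"
    node.mkdir()
    (node / "disksize").write_text("0\n")           # module loaded, size not set
    before = ready_for_cache("zram0", fake_sysfs)
    (node / "disksize").write_text("1000000000\n")  # after the echo command
    after = ready_for_cache("zram0", fake_sysfs)
    missing = ready_for_cache("zram9", fake_sysfs)  # device that does not exist

print(before, after, missing)
```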

Could you check whether enhanceio works with zram and a freshly formatted ext4? I created a cache in WB mode. Next I ran mkfs.ext4 /dev/sdd1, mounted the ext4 filesystem and then tried to copy files onto it. The result is:

[ 4590.794283] EXT4-fs (sdd1): mounted filesystem with ordered data mode. Opts: (null)
[ 4604.071399] ------------[ cut here ]------------
[ 4604.071442] kernel BUG at /usr/src/lanconnected-EnhanceIO/Driver/enhanceio/eio_main.c:2457!
[ 4604.071491] invalid opcode: 0000 [#1] SMP
[ 4604.071516] Modules linked in: enhanceio_rand(O) enhanceio_lru(O) enhanceio_fifo(O) enhanceio(O) zram tun netconsole configfs cpufreq_ondemand msr af_packet bridge stp llc fuse xfs zlib_deflate coretemp hwmon r8188eu(C) kvm_intel cfg80211 kvm pcrypt snd_pcsp snd_pcm i2c_i801 snd_timer snd nf_nat_ftp nf_conntrack_ftp soundcore nf_nat i2c_core irqbypass nf_conntrack rfkill hpsa sky2 scsi_transport_sas ipv6 button acpi_cpufreq raid1 md_mod pata_jmicron ehci_pci ehci_hcd dm_mod
[ 4604.071726] CPU: 0 PID: 9326 Comm: ext4lazyinit Tainted: G C O 4.10.0+ #17
[ 4604.071772] Hardware name: Gigabyte Technology Co., Ltd. 965P-DS3/965P-DS3, BIOS F14 06/25/2009
[ 4604.071821] task: ffff880060a00000 task.stack: ffffc90000ac8000
[ 4604.071855] RIP: 0010:eio_map+0x2ab/0x15b0 [enhanceio]
[ 4604.071883] RSP: 0018:ffffc90000acbb98 EFLAGS: 00010202
[ 4604.071912] RAX: 00000000000002c0 RBX: ffff880008456d00 RCX: 0000000000000001
[ 4604.071943] RDX: ffff880008456d00 RSI: 0000000000000001 RDI: ffff88008afdd000
[ 4604.071975] RBP: ffffc90000acbc38 R08: 00000000000000a8 R09: 0000000000000001
[ 4604.072006] R10: ffff880080541200 R11: ffff88009a4c9b80 R12: ffffffff81297c90
[ 4604.072038] R13: ffff88008afdd000 R14: ffffffffa05b3000 R15: ffff88009bc2ba80
[ 4604.072069] FS: 0000000000000000(0000) GS:ffff88009fc00000(0000) knlGS:0000000000000000
[ 4604.072117] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4604.072146] CR2: 00007fbc5c181398 CR3: 00000000195e6000 CR4: 00000000000006f0
[ 4604.072176] Call Trace:
[ 4604.072204] ? eio_io_callback+0x42/0x50 [enhanceio]
[ 4604.072234] ? eio_dec_count+0x51/0x60 [enhanceio]
[ 4604.072264] ? eio_endio+0x24/0x30 [enhanceio]
[ 4604.072294] ? blk_flush_plug_list+0x230/0x230
[ 4604.072323] eio_make_request_fn+0x34a/0x470 [enhanceio]
[ 4604.072353] ? mempool_alloc_slab+0x10/0x20
[ 4604.072380] generic_make_request+0xcb/0x1a0
[ 4604.072407] submit_bio+0x6e/0x150
[ 4604.080126] next_bio+0x33/0x40
[ 4604.080126] blkdev_issue_zeroout+0x11c/0x1f0
[ 4604.080126] blkdev_issue_zeroout+0xe0/0x140
[ 4604.080126] ? ext4_journal_get_write_access+0x36/0x80
[ 4604.080126] ext4_init_inode_table+0x151/0x340
[ 4604.080126] ext4_lazyinit_thread+0x29d/0x3c0
[ 4604.080126] kthread+0xfc/0x130
[ 4604.080126] ? ext4_unregister_li_request+0x60/0x60
[ 4604.080126] ? kthread_park+0x90/0x90
[ 4604.080126] ret_from_fork+0x29/0x40
[ 4604.080126] Code: b8 45 85 ed 0f 85 aa 04 00 00 f0 48 ff 83 b0 01 00 00 48 8b 75 c0 41 b8 01 00 00 00 4c 89 f1 31 d2 48 89 df e8 97 fa ff ff eb a3 <0f> 0b 48 8b 45 c0 8b 93 04 01 00 00 8b 40 28 48 c1 e8 09 48 39
[ 4604.080126] RIP: eio_map+0x2ab/0x15b0 [enhanceio] RSP: ffffc90000acbb98
[ 4604.083642] ---[ end trace ba3c6feb8a6f458f ]---
[ 4604.083702] ------------[ cut here ]------------
[ 4604.083762] WARNING: CPU: 0 PID: 9326 at kernel/exit.c:746 do_exit+0x46/0xae0
[ 4604.083821] Modules linked in: enhanceio_rand(O) enhanceio_lru(O) enhanceio_fifo(O) enhanceio(O) zram tun netconsole configfs cpufreq_ondemand msr af_packet bridge stp llc fuse xfs zlib_deflate coretemp hwmon r8188eu(C) kvm_intel cfg80211 kvm pcrypt snd_pcsp snd_pcm i2c_i801 snd_timer snd nf_nat_ftp nf_conntrack_ftp soundcore nf_nat i2c_core irqbypass nf_conntrack rfkill hpsa sky2 scsi_transport_sas ipv6 button acpi_cpufreq raid1 md_mod pata_jmicron ehci_pci ehci_hcd dm_mod
[ 4604.084055] CPU: 0 PID: 9326 Comm: ext4lazyinit Tainted: G D C O 4.10.0+ #17
[ 4604.084131] Hardware name: Gigabyte Technology Co., Ltd. 965P-DS3/965P-DS3, BIOS F14 06/25/2009
[ 4604.084206] Call Trace:
[ 4604.084260] dump_stack+0x4d/0x65
[ 4604.084313] __warn+0xc6/0xe0
[ 4604.084366] warn_slowpath_null+0x18/0x20
[ 4604.084420] do_exit+0x46/0xae0
[ 4604.084472] ? kthread+0xfc/0x130
[ 4604.084526] ? ext4_unregister_li_request+0x60/0x60
[ 4604.084582] ? kthread_park+0x90/0x90
[ 4604.084635] rewind_stack_do_exit+0x17/0x20
[ 4604.084690] ---[ end trace ba3c6feb8a6f4590 ]---

lanconnected commented 7 years ago

ext4 worked OK for me, but I have never used a cache over partitions. You probably hit some misalignment here. I'll have a look at it; meanwhile, you could play with a zram cache over full drives or LVs. Thanks for reporting. Could you please provide the output of # sfdisk -d /dev/sdd ?

marcin-github commented 7 years ago

Sure, here it is:

# sfdisk -d /dev/sdd
label: gpt
label-id: 2DB65DB2-C105-4626-95B5-87384EDF379F
device: /dev/sdd
unit: sectors
first-lba: 34
last-lba: 2930277134

/dev/sdd1 : start= 2048, size= 2930275087, type=0FC63DAF-8483-4772-8E79-3D69D8477DE4, uuid=6B888E59-CA4D-42A5-AD6B-0965E1F2481D
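The dump makes it possible to check how the partition lines up with the 4096-byte cache block size used earlier in the thread. The following is only a sketch: it assumes the sfdisk "sectors" are 512 bytes each, and the function name is hypothetical. Whether the partial block at the end is what triggered the BUG is not confirmed in the thread; the maintainer only suspected "some misalignment".

```python
SECTOR_BYTES = 512  # assumption: sfdisk "unit: sectors" means 512-byte sectors here


def alignment_report(start: int, size: int, block_size: int = 4096) -> dict:
    """Check a partition's start and size (in sectors) against a cache block size."""
    sectors_per_block = block_size // SECTOR_BYTES
    return {
        "start_aligned": start % sectors_per_block == 0,
        "whole_blocks": size // sectors_per_block,
        "tail_sectors": size % sectors_per_block,  # sectors left over after the last full block
    }


sdd1 = alignment_report(start=2048, size=2930275087)
print(sdd1)
```

Under these assumptions the partition starts on a 4096-byte boundary, but its size is not a multiple of 8 sectors, so the last 7 sectors do not fill a whole cache block.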

lanconnected commented 7 years ago

Might work now, try commit 94c5a30

marcin-github commented 7 years ago

I didn't expect this bug to need so much effort to fix. Thank you. A first, quick test shows that it is OK, but I'm flooded with messages:
dispatch_io: processing unaligned I/O: sector 22240, count 3

Can I comment out the line pr_info("dispatch_io: processing unaligned I/O: sector %lu, count %lu" .... and do more tests?

lanconnected commented 7 years ago

I moved this to pr_debug; please try commit 1a79697. It should have fewer bugs, too :)

marcin-github commented 7 years ago

After using enhanceio for a while, the stats are:

reads 456254960
writes 504883380
read_hits 201914256
read_hit_pct 44
write_hits 128313236
write_hit_pct 25
dirty_write_hits 22947
dirty_write_hit_pct 0
cached_blocks 101888
rd_replace 31526306
wr_replace 45940857
noroom 471036
cleanings 0
md_write_dirty 59926024
md_write_clean 0
md_ssd_writes 30133709
do_clean 0
nr_blocks 101888
nr_dirty 23581
nr_sets 398
clean_index 0
uncached_reads 25132867
uncached_writes 2517717
uncached_map_size 0
uncached_map_uncacheable 0
disk_reads 254340704
disk_writes 504512468
ssd_reads 681165896
ssd_writes 989935316
ssd_readfills 252556216
ssd_readfill_unplugs 19882395
readdisk 25132867
writedisk 25132867
readcache 25243294
readfill 31569527
writecache 93708427
readcount 41104510
writecount 48402869
kb_reads 228127480
kb_writes 252441690
rdtime_ms 172388820
wrtime_ms 71006080
unaligned_ios 200443

and I didn't have any data corruption. Thanks a lot for your work!
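The percentage fields in these stats appear to be integer percentages of the raw counters. A quick check, assuming truncating integer division (an assumption; the driver source is not quoted in this thread) and using a made-up helper name:

```python
def hit_pct(hits: int, total: int) -> int:
    """Integer hit percentage, truncated; 0 when there was no traffic."""
    return hits * 100 // total if total else 0


# Raw counters copied from the stats above.
reads, writes = 456254960, 504883380
read_pct = hit_pct(201914256, reads)
write_pct = hit_pct(128313236, writes)
dirty_write_pct = hit_pct(22947, writes)

print(read_pct, write_pct, dirty_write_pct)
```

These reproduce the reported read_hit_pct, write_hit_pct and dirty_write_hit_pct values of 44, 25 and 0.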

lanconnected commented 7 years ago

Hi! Thanks for testing!