I hit this too. Changing line 262 of zram_drv.c from:
cmem = zs_map_object(zram->mem_pool, handle, ZS_MM_RO);
to:
cmem = zs_map_object(zram->mem_pool, handle, ZS_MM_RW);
seems to have fixed it. I'm testing now, and will submit a patch to the
maintainers if it works. Note: I don't actually know that this is the correct
fix, but it does stop my logfile spam.
Original comment by paer...@gmail.com
on 2 Oct 2012 at 5:22
Nope, didn't solve it, sorry.
Original comment by paer...@gmail.com
on 2 Oct 2012 at 6:37
Just as a continued FYI on this bug, here's my procedure to reproduce:
In an initrd, or in a booted system without an initrd, configure a few zram devices:
echo $((128 * 1024 * 1024)) > /sys/block/zram0/disksize
echo $((10 * 1024 * 1024 * 1024)) > /sys/block/zram1/disksize
echo $((1 * 1024 * 1024 * 1024)) > /sys/block/zram2/disksize
mkfs.ext4 -O dir_nlink,extent,extra_isize,flex_bg,^has_journal,uninit_bg -m0 -b 4096 -L "zram0" /dev/zram0
mkfs.ext4 -O dir_nlink,extent,extra_isize,flex_bg,^has_journal,uninit_bg -m0 -b 4096 -L "zram1" /dev/zram1
mkfs.ext4 -O dir_nlink,extent,extra_isize,flex_bg,^has_journal,uninit_bg -m0 -b 4096 -L "zram2" /dev/zram2
Then mount the filesystem:
cd /mnt
mount /dev/zram1 floppy
cd floppy
Finally crank some I/O:
dd if=/dev/urandom of=a count=1000000
After a few seconds, this produces output like the following in dmesg:
[ 6170.383170] zram: Error allocating memory for compressed page: 53091, size=4116
[ 6170.383171] Buffer I/O error on device zram1, logical block 53091
.....<snip 29 similar errors>.....
[ 6170.383216] Buffer I/O error on device zram1, logical block 53121
[ 6170.383219] EXT4-fs warning (device zram1): ext4_end_bio:250: I/O error writing to inode 12 (offset 74854400 size 131072 starting block 53091)
Original comment by paer...@gmail.com
on 3 Oct 2012 at 1:15
Blarg, I hate to comment-spam...
It only happens when the I/O after compression still needs a block larger than
4k: by lowering my dd block size (a binary search between 4096 and 4030), I
found that around 4072 bytes of random data may or may not compress enough to
avoid triggering the error. Did upstream, perhaps, start rate-limiting
high-order memory allocations?
All of the errors are prefaced with a line like:
[ 7181.454451] zram: Error allocating memory for compressed page: 34832, size=4097
i.e. with a size > 4k.
Original comment by paer...@gmail.com
on 3 Oct 2012 at 1:32
I got the same issue here.
I think it's caused by this patch: [PATCH] zram: remove special handle of
uncompressed page
https://lkml.org/lkml/2012/6/8/116
Point 3 of it says zsmalloc can't handle sizes bigger than PAGE_SIZE, so zram
can't do that any more without a redesign; the patch removed the code that
handled sizes bigger than PAGE_SIZE (compared to kernel 3.5).
Original comment by wu.to...@gmail.com
on 3 Oct 2012 at 3:55
Thanks Wu, that patch is indeed the root cause: with it applied, zram tries to
use zsmalloc even for sizes > PAGE_SIZE, which is not allowed. I will fix it soon.
Original comment by nitingupta910@gmail.com
on 3 Oct 2012 at 4:57
Can you please try the patch attached?
Original comment by nitingupta910@gmail.com
on 5 Oct 2012 at 5:13
The patch works fine for me. Thanks.
Original comment by wu.to...@gmail.com
on 5 Oct 2012 at 3:33
No more errors, it seems, after applying the patch to 3.6.1.
Original comment by mich...@zugelder.org
on 7 Oct 2012 at 9:40
Any plans for pushing this upstream? I figured it would have shown up in
either Linus' tree or gregkh's stable tree by now.
Original comment by paer...@gmail.com
on 10 Oct 2012 at 4:07
@paerley: I have sent it to lkml for review; it should be merged sometime soon.
Original comment by nitingupta910@gmail.com
on 11 Oct 2012 at 12:49
It should be merged into staging soon (patch sent to gregkh). Closing the issue.
Original comment by nitingupta910@gmail.com
on 11 Oct 2012 at 6:47
I applied this patch and added zram as L2ARC to a ZFS pool, which results in a
lot of L2ARC checksum errors.
This suggests zram is corrupting data.
Maybe we should use PAGE_SIZE+1 to indicate uncompressed pages?
Original comment by DRDarkRa...@gmail.com
on 16 Oct 2012 at 5:50
@DRDarkRaven: I found a bug which could cause this corruption. Can you please
try the patch attached? Thanks.
Original comment by nitingupta910@gmail.com
on 17 Oct 2012 at 5:15
Reopening the bug (though I could not reproduce the corruption myself).
Original comment by nitingupta910@gmail.com
on 17 Oct 2012 at 5:16
@DRDarkRaven: can you please verify if the patch provided in comment #14 works?
Also, what's the kernel version you are using?
Original comment by nitingupta910@gmail.com
on 19 Oct 2012 at 9:35
@nitingupta910: I applied your v2 patch, and I don't see any more of the "zram:
Error allocating memory for compressed page:" kind of errors. I am using /tmp
on zram.
Original comment by ppu...@gmail.com
on 10 Nov 2012 at 9:36
Does this bug cause data loss? I ask because I have a server I'd rather not
reboot that ran for a few hours getting this error. I have since turned off
compcache, and nothing (no other processes) seems unhappy; nothing has crashed.
Thank you.
Original comment by a...@cichlid.com
on 14 Nov 2012 at 3:14
Apparently, zram_pagealloc_fix.patch was introduced into kernel 3.6 at some
point. Using /dev/zram0 as a swap device (which was no problem in earlier
kernel releases) under the 3.6 series up to 3.6.8, I get a completely
unrecoverable system freeze once swap starts being allocated as I fill up the
ramdisk (dd if=/dev/zero of=/tmp/zero.img bs=1M count=800).
Reverting that patch and applying zram_pagealloc_fix_v2.patch instead fixes
the problem for me.
Original comment by knopp...@googlemail.com
on 28 Nov 2012 at 11:20
@knopperk: Most probably you are hitting this bug:
https://bugzilla.kernel.org/show_bug.cgi?id=50081
The fix has been posted to lkml and is under review. It should be in mainline soon.
Original comment by nitingupta910@gmail.com
on 29 Nov 2012 at 9:31
Original issue reported on code.google.com by
viech...@gmail.com
on 3 Sep 2012 at 5:44