Closed GoogleCodeExporter closed 9 years ago
Can you see messages like this in the log (/var/log/messages)?
"zram: Error allocating memory for compressed page: xxx"
If zram fails to allocate memory for incoming pages, the write fails and you will
get a data mismatch as in your test. It's NOT data corruption.
Anyway, please upload your log file so we can look into this further.
Original comment by nitingupta910@gmail.com
on 27 Jan 2011 at 3:35
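The log check requested above can be sketched as a small shell helper. This is only an illustration: the function name `zram_alloc_errors` is hypothetical, the message text is taken from the comment above, and the log path may differ between distributions.

```shell
#!/bin/sh
# Print zram allocation failures found in a kernel log.
# Reads the files given as arguments, or standard input when none are given.
# Exit status is grep's: 0 if at least one message was found, 1 if none.
zram_alloc_errors() {
    grep -F 'zram: Error allocating memory for compressed page' "$@"
}

# Usage (hypothetical invocations, not run here):
#   zram_alloc_errors /var/log/messages
#   dmesg | zram_alloc_errors
```

An exit status of 1 means no such message was logged, which is the situation the reporter describes in the next comment.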
There is no such message: not in the log, not on the console, and not in
dmesg output.
Even if that were the case, I would expect an error from the block layer
rather than silent data corruption. Applications don't read logfiles :-).
btw. some history: When 2.6.37 came out with the "real" implementation of zram, I
made a filesystem on the zram device and tried to compile a kernel in it (always a
good stress test :-). The system had 2G of RAM, and there were no OOM
conditions. But the compiler failed with an "impossible" error. To track down
the problem I wrote this script. Another version of the script tries the same
without a filesystem, but it tends to crash the system before showing the error.
Anyway, I attach it here.
regards
Original comment by fadb24bb...@drewag.de
on 28 Jan 2011 at 9:40
Attachments:
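The expectation voiced in this comment, that a failed write should reach the application as an error rather than only as a log line, can be illustrated with a short sketch. It assumes GNU `dd` (for `conv=fsync` and `bs=1M`); the function name `write_checked` is hypothetical, and whether the zram driver of that time actually propagated such errors is exactly what the thread is discussing.

```shell
#!/bin/sh
# Copy file $1 to $2 and force the data out to the device before returning.
# Returns non-zero if the write or the final fsync reports an error, which is
# the only channel through which an application learns that a write failed.
# Assumes GNU dd.
write_checked() {
    dd if="$1" of="$2" bs=1M conv=fsync 2>/dev/null
}

# Usage (hypothetical file names, not run here):
#   write_checked rand.orig tmpmnt/tmpfile || echo "write failed" >&2
```

Without the `fsync`, a write error may only show up later during writeback, after the copy command has already reported success.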
I just compiled a kernel over zram with a disksize of 4G -- no problems at all.
With a 2G disksize, I got a "No space left on device" error and no zram memory
allocation error in the logs (which is a good thing). So, maybe you just ran "out
of disk space" when compiling the kernel over zram?
Also, I tried the random data test as in your script -- no problems again:
$ openssl rand -base64 "$((1*1024*1024*1024))" > ~/temp/rand.orig
$ cp ~/temp/rand.orig ./rand # copied to mounted /dev/zram0 of 4G disksize
$ md5sum ~/temp/rand.orig # version on disk
aaed1e376f3a9332fd3ad5ce07f19d37 /home/ngupta/temp/rand.orig
$ md5sum rand # version on zram
aaed1e376f3a9332fd3ad5ce07f19d37 rand
Original comment by nitingupta910@gmail.com
on 2 Feb 2011 at 1:49
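The manual copy-and-checksum test above can be wrapped into a reusable check. This is a minimal sketch, not the script attached to the issue; the function name `roundtrip_check` is hypothetical.

```shell
#!/bin/sh
# Copy $1 to $2, then compare the MD5 checksums of source and copy.
# Prints "match" or "MISMATCH"; returns non-zero on mismatch or copy failure.
roundtrip_check() {
    cp "$1" "$2" || return 1
    # Read via stdin so the checksum output does not contain the file name.
    src_sum=$(md5sum < "$1") || return 1
    dst_sum=$(md5sum < "$2") || return 1
    if [ "$src_sum" = "$dst_sum" ]; then
        echo match
    else
        echo MISMATCH
        return 1
    fi
}

# Usage (hypothetical paths, not run here):
#   roundtrip_check ~/temp/rand.orig ./rand
```

Note that a read immediately after the copy may be served from the page cache rather than from the zram device, so a stricter test would remount the filesystem or drop caches before comparing.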
Ran your script as-is, still no problems found:
3468316672
0 tmpmnt
0
             total       used       free     shared    buffers     cached
Mem:      16400548    8236912    8163636          0     198328    6418696
-/+ buffers/cache:    1619888   14780660
Swap:     18481148          0   18481148
cycle 0:
59636 tmpmnt
60940288
cmprc is cmp: EOF on tmpmnt/tmpfile
             total       used       free     shared    buffers     cached
Mem:      16400548    8357396    8043152          0     198336    6478360
-/+ buffers/cache:    1680700   14719848
Swap:     18481148          0   18481148
cycle 1:
119272 tmpmnt
60940288
cmprc is cmp: EOF on tmpmnt/tmpfile
             total       used       free     shared    buffers     cached
Mem:      16400548    8416540    7984008          0     198336    6537604
-/+ buffers/cache:    1680600   14719948
Swap:     18481148          0   18481148
cycle 2:
178908 tmpmnt
Original comment by nitingupta910@gmail.com
on 2 Feb 2011 at 1:54
Very strange. Of course your mileage may vary depending on memory and the
parameters in the script. Might I have missed some update? In order to get
(hopefully) really reproducible conditions I made a qemu image. Can you please
unpack it and run it with
"qemu -m 384 -hda hdaz"? You will find the kernel config in /boot/... . The
image is also mountable with "mount -ro loop,offset=32256 hdaz".
regards
Hmmm, upload is restricted - please collect the parts and run
"cat xaa xab xac xad | gunzip >hdaz"
Original comment by fadb24bb...@drewag.de
on 3 Feb 2011 at 1:42
Attachments:
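The reassembly step and the mount offset quoted in this comment can be sketched as two small helpers. The function names are hypothetical. The offset 32256 equals 63 * 512, which would match a first partition starting at sector 63 with 512-byte sectors; that the image uses this layout is an inference from the number, not something the thread states.

```shell
#!/bin/sh
# Concatenate split gzip parts (in the order given) and decompress into $1.
reassemble_image() {
    out=$1
    shift
    cat "$@" | gunzip > "$out"
}

# Byte offset of a partition that starts at 512-byte sector $1.
partition_offset() {
    echo $(( $1 * 512 ))
}

# Usage (part names from the comment above, not run here):
#   reassemble_image hdaz xaa xab xac xad
#   partition_offset 63    # prints 32256
```

The parts must be passed in their original order, since the concatenation has to reproduce the single gzip stream that was split for upload.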
part #2
Original comment by fadb24bb...@drewag.de
on 3 Feb 2011 at 1:43
Attachments:
part #3
Original comment by fadb24bb...@drewag.de
on 3 Feb 2011 at 1:44
Attachments:
finally: part #4
Original comment by fadb24bb...@drewag.de
on 3 Feb 2011 at 1:45
Attachments:
Thanks for the VM image. I tested this on another 32-bit (Fedora) VM and
strangely enough it happens consistently on any 32-bit system and NOT on 64-bit.
Original comment by nitingupta910@gmail.com
on 4 Feb 2011 at 6:14
Found a bug which was causing read/write from/to incorrect sectors. Can you try
the patch attached? (all tests now pass on my side)
Original comment by nitingupta910@gmail.com
on 5 Feb 2011 at 11:54
Attachments:
I have committed this change to the repository, and gregkh promised it would be
included in 2.6.38 and probably in a maintenance release of 2.6.37 too.
Please reopen if you still hit this issue.
Original comment by nitingupta910@gmail.com
on 8 Feb 2011 at 1:51
Confirmed, it is working. The script does not fail and the kernel compiles
successfully :-).
Thanks for your effort.
Original comment by fadb24bb...@drewag.de
on 8 Feb 2011 at 3:25
Original issue reported on code.google.com by
fadb24bb...@drewag.de
on 26 Jan 2011 at 12:35
Attachments: