tthtlc / compcache

Automatically exported from code.google.com/p/compcache

swapon fails on android G1 (ARM) #33

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?
1. Compile compcache with the latest CodeSourcery toolchain for ARM against kernel 2.6.27.
2. Push the compcache modules to the device.
3. insmod works fine and dmesg reports normal output.
4. swapon reboots the device.

However, this works fine in the Android emulator, and I have no idea why.

Any ideas, or suggestions on how to debug this? I'm thinking of putting some sleeps in whatever code is triggered by swapon, but I'm not sure where to look. Any clues would be very helpful.

Original issue reported on code.google.com by aagaa...@gmail.com on 25 Jun 2009 at 7:32

GoogleCodeExporter commented 9 years ago
I have slightly different results with compcache on android-msm-2.6.29: I am able to swapon, but afterwards any process allocating memory immediately segfaults.

Original comment by steve.ko...@gmail.com on 25 Jun 2009 at 7:48

GoogleCodeExporter commented 9 years ago
I've tried with gcc. The device still works fine after swapon, but what follows is very similar to the behavior described in 'Issue 2': processes get segfaults or bus errors.

Original comment by ondra.herman@gmail.com on 25 Jun 2009 at 7:57

GoogleCodeExporter commented 9 years ago
So you got compcache working reliably after swapon on a G1? If so, which gcc version and which tools did you use to build it? Did you compile the Android kernel and lzo modules with that compiler as well, or only compcache?

Original comment by aagaa...@gmail.com on 26 Jun 2009 at 6:56

GoogleCodeExporter commented 9 years ago
> I have slightly different results with compcache on android-msm-2.6.29- I am able to
> swapon, but afterwards any process allocating memory immediately segfaults.

Do you also see any warnings from compcache in the kernel logs?
It's quite difficult for me to debug this issue since I don't have this H/W, and I also lack experience with this processor.

Original comment by nitingupta910@gmail.com on 26 Jun 2009 at 8:32

GoogleCodeExporter commented 9 years ago
Hopefully steve.kondik can get you some more output. Personally, with cat /proc/kmsg I'm getting nothing before the device reboots.

Original comment by aagaa...@gmail.com on 26 Jun 2009 at 3:02

GoogleCodeExporter commented 9 years ago
OK, got some more to report: with 2.6.29 things seem a lot better.

With lzo built into the kernel, the insmods and swapon actually work.

I can check /proc/ramzswap and see a high GoodCompress, but after torturing it for a while it crashes the user interface. (That is my theory: it's not a reboot, as I don't lose the connection with the phone and can watch the kernel logs this time.)

Note that the "send sigkill to" process messages are completely normal and happen all the time on Android. However, there seems to be no output at all from compcache in here.

<4>[  316.945526] send sigkill to 568 (app_process), adj 14, size 4436
<4>[  324.601165] select 612 (app_process), adj 15, size 4411, to kill
<4>[  324.601196] send sigkill to 612 (app_process), adj 15, size 4411
<6>[  346.488891] binder: release 134:323 transaction 6478 in, still active
<6>[  346.489135] binder: send failed reply for transaction 6478 to 194:505
<6>[  346.744750] binder: 194 invalid dec strong, ref 1079 desc 17 s 0 w 1
<6>[  346.754028] binder: 423 invalid dec strong, ref 8585 desc 17 s 0 w 1
<6>[  346.760559] binder: 585 invalid dec strong, ref 9347 desc 17 s 0 w 1
<6>[  348.089965] request_suspend_state: wakeup (0->0) at 341189074786 (2009-06-28 19:54:03.283935557 UTC)
<3>[  348.092315] init: untracked pid 371 exited
<3>[  348.093719] init: untracked pid 383 exited
<3>[  348.094207] init: untracked pid 390 exited
<3>[  348.094635] init: untracked pid 414 exited
<3>[  348.133636] init: untracked pid 190 exited
<3>[  348.133911] init: untracked pid 273 exited
<3>[  348.134277] init: untracked pid 621 exited
<3>[  348.140106] init: untracked pid 266 exited
<3>[  348.140563] init: untracked pid 352 exited
<3>[  348.160003] init: untracked pid 194 exited
<3>[  348.160461] init: untracked pid 423 exited
<3>[  348.160705] init: untracked pid 585 exited
<6>[  381.844940] request_suspend_state: wakeup (0->0) at 374944049146 (2009-06-28 19:54:37.038909917 UTC)
<6>[  384.697967] binder: release 112:127 transaction 10775 in, still active
<6>[  384.698333] binder: send failed reply for transaction 10775 to 645:653
<6>[  385.784729] htc-acoustic: open
<6>[  385.845764] htc-acoustic: mmap
<6>[  385.846740] htc-acoustic: ioctl
<6>[  385.846954] htc-acoustic: ioctl: ACOUSTIC_ARM11_DONE called 678.
<6>[  385.849548] htc-acoustic: ioctl: ONCRPC_ACOUSTIC_INIT_PROC success.
<6>[  385.849792] htc-acoustic: release
<6>[  385.890563] snd_set_device 1 1 1
<6>[  385.901885] snd_set_volume 0 0 5
<6>[  385.903289] snd_set_volume 1 0 5
<6>[  385.912017] snd_set_volume 3 0 5
<6>[  385.913360] snd_set_volume 2 0 5
<6>[  386.833923] snd_set_volume 256 0 5

Original comment by aagaa...@gmail.com on 28 Jun 2009 at 7:57

GoogleCodeExporter commented 9 years ago
I checked adb logcat during a soft restart, and also while an application fails to start.

I'm not much wiser from this output, and I'm a bit unsure where to go from here in debugging this.

Original comment by aagaa...@gmail.com on 29 Jun 2009 at 5:28

Attachments:

GoogleCodeExporter commented 9 years ago
Ah, I don't have this hardware and there is nothing in the logs that can help me debug this issue.

I promise a bounty of $100 for the one who gets it working on ARM :)   I am serious!

Original comment by nitingupta910@gmail.com on 1 Jul 2009 at 9:47

GoogleCodeExporter commented 9 years ago
What about posting the debug output to the Google Android dev group? There are a few Google employees who monitor that board. Maybe they can help out.

http://groups.google.com/group/android-platform

Original comment by dwa...@gmail.com on 1 Jul 2009 at 4:22

GoogleCodeExporter commented 9 years ago
For the record, it seems to work fine on the Beagleboard, an ARM-based single-board computer. This is a Cortex-A8, while the G1 uses an ARM11; that could certainly be a factor.

Details:
- Kernel and compcache were built natively on the Beagleboard, using a standard Debian gcc 4.3.2.
- I'm running a kernel 2.6.30 from the linux-omap git tree, no other patches.
- compcache 0.5.3 built just fine, and "use_ramzswap.sh 32768 /dev/mmcblk0p3" ran fine with no errors.
- As a quick stress test, I fired up firefox in a VNC session, resulting in /proc/ramzswap giving ~24k reads, ~32k writes, ~75M OrigDataSize, ~23M ComprDataSize. This sure looks like it's actually working. (Also, firefox was actually usable, which is a first for me on this board.)
- Finally, unuse_ramzswap got rid of the swap as expected.

I'm not sure how helpful this is; the hardware is pretty different from the G1. But it does suggest that there's hope, since it works on at least one ARM device.

Original comment by edana...@gmail.com on 5 Jul 2009 at 2:52

GoogleCodeExporter commented 9 years ago
It is useful, but could you try stress testing it some more? I can also get ramzswap to report everything working; it's not until some stress testing has occurred that things actually start to fail.

Original comment by aagaa...@gmail.com on 5 Jul 2009 at 10:01

GoogleCodeExporter commented 9 years ago
Hi,
I have been monitoring the functionality of compcache on the Nokia N810, which has an OMAP2420 processor, which is of course ARM.
I am getting similar errors with my N810, such as random reboots at times. I have been monitoring dmesg and /proc/ramzswap, but to no avail at this point. The kernel version the N810 uses is 2.6.21-omap1. Maybe some kernel debugging would help with this, but I'm not familiar with such "lore" :). So I am just reporting a different ARM device on this thread.

So swapon and {use,unuse}_ramzswap.sh work, but after a while of usage (like opening the browser and PDF reader), the tablet crashes for an unknown reason.

Original comment by suomalai...@gmail.com on 5 Jul 2009 at 11:20

GoogleCodeExporter commented 9 years ago
More stress testing on the Beagleboard: a full kernel compile with -j8 (typically something like ~50M in swap according to free, and gcc processes were definitely swapping), combined with bits of firefox, stress (http://weather.ou.edu/~apw/projects/stress/) for another 30M-60M of memory usage, and video streaming to my laptop.

No faults as far as I can tell after several hours and over 11M reads and 6M writes according to /proc/ramzswap. It also shows no FailedReads/Writes or InvalidIO, and the resulting kernel works. I'd say it's solid.

If there are any particular tests that might be helpful, let me know.  And if
anything does come up, I'll be sure to update.

Original comment by edana...@gmail.com on 6 Jul 2009 at 3:12

GoogleCodeExporter commented 9 years ago
Thank you all for the help so far.

Summarizing a bit:
 - Cortex-A8 (Beagleboard): seems to work fine.
 - OMAP2420 (Nokia N810): no problems with module load/unload and swapon/swapoff, but apps crash or the system reboots after some time.
 - ARM11 (Android G1): swapon reboots the device.

I will try reading about these ARM variations and maybe we will get some clues ...

Original comment by nitingupta910@gmail.com on 6 Jul 2009 at 3:23

GoogleCodeExporter commented 9 years ago
It's possible that the issue here is the same as the one described here:

http://www.linux-mips.org/archives/linux-mips/2008-11/msg00038.html

Original comment by nitingupta910@gmail.com on 6 Jul 2009 at 3:25

GoogleCodeExporter commented 9 years ago
I'd just like to point out that on recent kernels on Android, the device doesn't reboot; the interface does. That is a rather big difference, as the kernel stays up.

Note that the device works fine with normal swap.

Original comment by aagaa...@gmail.com on 6 Jul 2009 at 7:45

GoogleCodeExporter commented 9 years ago
Yeah, I just tested the latest compcache on the N810 last night and experienced only interface freezing. The device still reacts to button presses and the ssh connection stays alive, although dmesg revealed nothing special.

Original comment by suomalai...@gmail.com on 6 Jul 2009 at 9:56

GoogleCodeExporter commented 9 years ago
I can confirm that compcache doesn't reboot my Android G1 with the 2.6.29 kernel.

I've set up an 8 MB compcache swapfile with swappiness set to 60 and it actually works pretty well.

I can get things to crash left and right if I set swappiness to 100 though.

Original comment by dwa...@gmail.com on 9 Jul 2009 at 7:30

GoogleCodeExporter commented 9 years ago
It seems that once the swapfile starts getting full and reaching the end of the file, that's when processes start crashing. There's some corruption somewhere.

Original comment by dwa...@gmail.com on 9 Jul 2009 at 7:34

GoogleCodeExporter commented 9 years ago
[deleted comment]

GoogleCodeExporter commented 9 years ago
> I've set up a 8meg compcache swapfile with swappiness to 60 and it actually works
> pretty well.
> I can get things to crash left and right if I set swappiness to 100 though.

With swappiness set to 100, compcache will quickly fill up. Maybe with so much memory pinned by compcache, you are running into the OOM killer? Do you see any OOM kill messages in the logs?

> Seems like once the swapfile starts getting full and reaching the end of the file,
> that's when processes start crashing.  There's some corruption somewhere.

Seems like a good test case. I can try this at least on my system (x64).

Original comment by nitingupta910@gmail.com on 9 Jul 2009 at 11:21
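
To answer the "any kill messages in the logs?" question quickly, a saved kernel log can be scanned for the "send sigkill to" lines in the format posted earlier in this thread. This is only a rough sketch: the regular expression is modeled on those sample lines, and the names `find_kills` and `SAMPLE_LOG` are made up for illustration. Other kernels may word their OOM messages differently.

```python
import re

# Pattern modeled on the "send sigkill to ..." lines quoted earlier in this
# thread; other kernels may print kill messages in a different format.
KILL_RE = re.compile(r"send sigkill to (\d+) \(([^)]+)\), adj (\d+), size (\d+)")


def find_kills(log_text):
    """Return (pid, name, adj, size) for every 'send sigkill' line found."""
    kills = []
    for line in log_text.splitlines():
        match = KILL_RE.search(line)
        if match:
            pid, name, adj, size = match.groups()
            kills.append((int(pid), name, int(adj), int(size)))
    return kills


# Hypothetical sample, copied from the log excerpt posted above.
SAMPLE_LOG = """\
<4>[  316.945526] send sigkill to 568 (app_process), adj 14, size 4436
<4>[  324.601165] select 612 (app_process), adj 15, size 4411, to kill
<4>[  324.601196] send sigkill to 612 (app_process), adj 15, size 4411
<6>[  346.488891] binder: release 134:323 transaction 6478 in, still active
"""

if __name__ == "__main__":
    for pid, name, adj, size in find_kills(SAMPLE_LOG):
        print(f"killed pid={pid} name={name} adj={adj} size={size}")
```

If the crashing processes never show up in this list, that points away from the memory killer and towards the processes faulting on their own.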

GoogleCodeExporter commented 9 years ago
The processes aren't being killed. They're crashing with segfaults and tracebacks.

Original comment by dwa...@gmail.com on 9 Jul 2009 at 3:55

GoogleCodeExporter commented 9 years ago
Could there be an issue with the kernel writing into the same memory where the compcache swap resides, since both are using the same memory space?

Original comment by dwa...@gmail.com on 9 Jul 2009 at 6:27

GoogleCodeExporter commented 9 years ago
I would need the following data:
 - /proc/cpuinfo
 - /proc/meminfo
 - /var/log/messages (on some systems it's /var/log/kernel)

The above data is needed for *each* of the following devices:
 - Cortex-A8 (Beagleboard)
 - OMAP2420 (Nokia N810)
 - ARM11 (Android G1)

Original comment by nitingupta910@gmail.com on 10 Jul 2009 at 2:21

GoogleCodeExporter commented 9 years ago
Here's cpuinfo and meminfo. There is no /var/log/messages or /var/log/kernel on the Android G1.

# cat /proc/cpuinfo
cat /proc/cpuinfo
Processor       : ARMv6-compatible processor rev 2 (v6l)
BogoMIPS        : 245.36
Features        : swp half thumb fastmult edsp java
CPU implementer : 0x41
CPU architecture: 6TEJ
CPU variant     : 0x1
CPU part        : 0xb36
CPU revision    : 2

Hardware        : trout
Revision        : 0080
Serial          : 0000000000000000
# cat /proc/meminfo
cat /proc/meminfo
MemTotal:          97908 kB
MemFree:            2192 kB
Buffers:             536 kB
Cached:            24640 kB
SwapCached:           12 kB
Active:            37608 kB
Inactive:          44392 kB
Active(anon):      27096 kB
Inactive(anon):    30292 kB
Active(file):      10512 kB
Inactive(file):    14100 kB
Unevictable:         252 kB
Mlocked:               0 kB
SwapTotal:          8188 kB
SwapFree:           7244 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:         57088 kB
Mapped:            14676 kB
Slab:               6152 kB
SReclaimable:        868 kB
SUnreclaim:         5284 kB
PageTables:         3072 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:       57140 kB
Committed_AS:     742092 kB
VmallocTotal:     155648 kB
VmallocUsed:       53404 kB
VmallocChunk:      44028 kB
#

Original comment by dwa...@gmail.com on 10 Jul 2009 at 2:26

GoogleCodeExporter commented 9 years ago
> Here's cpuinfo and meminfo.  There is no /var/log/messages or /var/log/kernel on the
> android g1.

OK, then send the output of:
 - uname -a
 - kernel config file (/boot/config*) -- maybe this will be missing on the G1.

Original comment by nitingupta910@gmail.com on 10 Jul 2009 at 3:13

GoogleCodeExporter commented 9 years ago
# uname -a
Linux localhost 2.6.29-cm #1 PREEMPT Thu Jul 2 19:13:31 EDT 2009 armv6l GNU/Linux

Original comment by dwa...@gmail.com on 10 Jul 2009 at 3:24

Attachments:

GoogleCodeExporter commented 9 years ago
All info is from the same device. For config.gz, mine should be the same as (or very similar to) dwang5's; I don't have the config for the exact build I'm using.

Linux localhost 2.6.29-cm #2 PREEMPT Sun Jun 28 02:29:13 EDT 2009 armv6l GNU/Linux

For kernel log I pulled /proc/kmsg

Original comment by aagaa...@gmail.com on 10 Jul 2009 at 11:52

Attachments:

GoogleCodeExporter commented 9 years ago
On ARMv6 and newer:
 - Caches are VIPT (Virtually Indexed, Physically Tagged).
 - Caches are write-back.

So, I think these crashes are happening due to the following:
 - On swap read, ramzswap gets a 'bio' page which is mapped to a kernel virtual address, say V(k). All the above systems have mem <= 1G, so kmap simply gives a lowmem address.
 - The data cache at the location corresponding to VA == V(k) now contains the decompressed data. This data cache location is tagged with the decompressed page's physical address, say P.
 - However, the corresponding RAM location still contains some stale data (write-back cache).
 - Now this page is mapped to a userspace VA, say V(u).
 - The data cache at location V(u) has a tag different from P (the decompressed page's physical address), so it goes to RAM to fetch the data.
 - The corresponding RAM location still has some stale data. We fetch this stale data into the cache location for VA == V(u) <---------------
 - Thus the user process gets stale data and segfaults.

Thus, we need to call flush_dcache_page() after writing out the decompressed page in ramzswap_read(). However, as mentioned in this mail:
http://www.linux-mips.org/archives/linux-mips/2008-11/msg00038.html
... this solution will not work "as is", but some workaround should still be doable.

I will try to upload a custom compcache version with this fix; let's see if it solves the issue.

Original comment by nitingupta910@gmail.com on 14 Jul 2009 at 8:41
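
The sequence described in the comment above can be sketched as a toy model. This is not ARM11 hardware behavior in detail and not compcache code: the `WriteBackCache` class, the `swap_in` helper and the addresses are all hypothetical. The only thing it tries to show is that a write through one virtual alias can stay in a write-back cache, so a read through a second alias picks up stale RAM unless the first alias is flushed.

```python
class WriteBackCache:
    """Toy write-back cache keyed by virtual address (one value per line)."""

    def __init__(self, ram):
        self.ram = ram      # physical address -> value
        self.lines = {}     # virtual address -> (physical address, value, dirty)

    def write(self, vaddr, paddr, value):
        # Write-back policy: only the cache line is updated, RAM is untouched.
        self.lines[vaddr] = (paddr, value, True)

    def read(self, vaddr, paddr):
        line = self.lines.get(vaddr)
        if line is not None and line[0] == paddr:
            return line[1]                  # hit through this virtual alias
        value = self.ram[paddr]             # miss: fill the line from RAM
        self.lines[vaddr] = (paddr, value, False)
        return value

    def flush(self, vaddr):
        # Write a dirty line back to RAM and drop it. This stands in for the
        # role flush_dcache_page() plays in the fix proposed above.
        line = self.lines.pop(vaddr, None)
        if line is not None and line[2]:
            self.ram[line[0]] = line[1]


def swap_in(flush_after_write):
    """Simulate one swap-in and return what userspace ends up reading."""
    # Hypothetical addresses: physical page P, kernel alias V(k), user alias V(u).
    p, v_k, v_u = 0x1000, 0xC0001000, 0x40002000
    cache = WriteBackCache({p: "stale"})
    cache.write(v_k, p, "decompressed")     # page filled via the kernel mapping
    if flush_after_write:
        cache.flush(v_k)
    return cache.read(v_u, p)               # userspace reads via its own mapping


if __name__ == "__main__":
    print(swap_in(flush_after_write=False))  # prints "stale"
    print(swap_in(flush_after_write=True))   # prints "decompressed"
```

In this toy model the flush is the whole fix; on real hardware, as the linked mail suggests, the details of which alias gets flushed and when matter more.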

GoogleCodeExporter commented 9 years ago
OK, sounds great. I'll try it on my N810 when you have the version ready.

Original comment by suomalai...@gmail.com on 14 Jul 2009 at 8:44

GoogleCodeExporter commented 9 years ago
Awesome. Based on how much this helps on my main computer (with 4 GB of memory), I can't imagine how much of an improvement it'll be on my G1 with 96 MB of memory. I'll be testing the second you push out a test version.

Original comment by aagaa...@gmail.com on 14 Jul 2009 at 10:09

GoogleCodeExporter commented 9 years ago
Looking forward to the new version as well. Thanks!

Original comment by dwa...@gmail.com on 14 Jul 2009 at 2:39

GoogleCodeExporter commented 9 years ago
Please try compcache test version attached. Thanks.

Original comment by nitingupta910@gmail.com on 15 Jul 2009 at 2:36

Attachments:

GoogleCodeExporter commented 9 years ago
Thanks Nitin!

Would somebody mind posting the compiled android .29 modules?  Thanks!

Original comment by dwa...@gmail.com on 15 Jul 2009 at 2:56

GoogleCodeExporter commented 9 years ago
Totally untested. Going to bed now; will test tomorrow morning.

Original comment by aagaa...@gmail.com on 15 Jul 2009 at 8:42

Attachments:

GoogleCodeExporter commented 9 years ago
thank you! thank you!

Running with a 24 MB swap file (25% of available RAM) and swappiness set to 100. No crashes!

Running imeem streaming music player in the background while loading up gmail,
calendar, browser, maps, and market.

awesome!

Original comment by dwa...@gmail.com on 15 Jul 2009 at 8:56

GoogleCodeExporter commented 9 years ago
here's the cat output.  74% compression, is that good?

# cat /proc/ramzswap
cat /proc/ramzswap
DiskSize:          24476 kB
NumReads:          99829
NumWrites:         55941
FailedReads:           0
FailedWrites:          0
InvalidIO:             0
PagesDiscard:          0
ZeroPages:           228
GoodCompress:         74 %
NoCompress:            6 %
PagesStored:        5890
PagesUsed:          2249
OrigDataSize:      23560 kB
ComprDataSize:      8333 kB
MemUsedTotal:       8996 kB
#

Original comment by dwa...@gmail.com on 15 Jul 2009 at 10:49
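
For anyone wanting a single overall number out of these dumps, one rough option is to divide ComprDataSize (and MemUsedTotal, which includes overhead) by OrigDataSize. The sketch below assumes the "Key: value unit" layout shown in the output above; the function names and the shortened `SAMPLE` text are made up for illustration.

```python
def parse_ramzswap(text):
    """Parse 'Key:  value [unit]' lines into a dict of integers."""
    stats = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip shell prompts, echoed commands and anything without a number.
        if sep and fields and fields[0].isdigit():
            stats[key.strip()] = int(fields[0])
    return stats


def compression_summary(stats):
    """Overall ratios; lower means better compression."""
    orig = stats["OrigDataSize"]
    return {
        "data_ratio": stats["ComprDataSize"] / orig,   # compressed data only
        "total_ratio": stats["MemUsedTotal"] / orig,   # including overhead
    }


# Hypothetical sample: a few lines copied from the output posted above.
SAMPLE = """\
# cat /proc/ramzswap
GoodCompress:         74 %
OrigDataSize:      23560 kB
ComprDataSize:      8333 kB
MemUsedTotal:       8996 kB
"""

if __name__ == "__main__":
    summary = compression_summary(parse_ramzswap(SAMPLE))
    print(round(summary["data_ratio"], 2))   # prints 0.35
    print(round(summary["total_ratio"], 2))  # prints 0.38
```

By this measure the 23560 kB of swapped data above occupies roughly 38% of its original size in RAM.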

GoogleCodeExporter commented 9 years ago
One question: is the swappiness setting taken into account? Will using 60 or 100 make a difference?

Original comment by dwa...@gmail.com on 15 Jul 2009 at 10:50

GoogleCodeExporter commented 9 years ago
Issue 2 has been merged into this issue.

Original comment by nitingupta910@gmail.com on 16 Jul 2009 at 2:11

GoogleCodeExporter commented 9 years ago
> Running with a 24meg swap file (25% of available ram) and swappiness set to 100.  No
> crashes!

Great news! Just to confirm, did you run the test on a G1 or on some emulator?

> here's the cat output.  74% compression, is that good?

It's a bit unusual. I usually see ~90% for GoodCompress. Also, 6% for "NoCompress" doesn't look too good.

> one question, is the swappiness setting considered?  Will using 60 or 100 make a
> difference?

The higher the swappiness, the more quickly ramzswap will fill up. To the kernel it's just another swap device, so the swappiness value applies to it as usual.

Original comment by nitingupta910@gmail.com on 16 Jul 2009 at 3:37
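
For those comparing swappiness 60 against 100 as discussed above, the value lives in /proc/sys/vm/swappiness. Below is a small sketch for reading and changing it on a Linux box that has Python; the helper names are made up, and the 0..100 range check is an assumption that matches kernels of this era. On the phone itself, `echo 60 > /proc/sys/vm/swappiness` from a root shell does the same job.

```python
SWAPPINESS_PATH = "/proc/sys/vm/swappiness"


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current vm.swappiness value as an integer."""
    with open(path) as f:
        return int(f.read().strip())


def write_swappiness(value, path=SWAPPINESS_PATH):
    """Set vm.swappiness (needs root when pointed at the real /proc file)."""
    # Assumption: 0..100 is the valid range on the kernels discussed here.
    if not 0 <= value <= 100:
        raise ValueError("swappiness must be between 0 and 100")
    with open(path, "w") as f:
        f.write(f"{value}\n")


if __name__ == "__main__":
    print("current swappiness:", read_swappiness())
```

The `path` parameter is there only so the helpers can be tried against an ordinary file before touching the real setting.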

GoogleCodeExporter commented 9 years ago
Actual g1 hardware.

Original comment by dwa...@gmail.com on 16 Jul 2009 at 3:39

GoogleCodeExporter commented 9 years ago
Testing this as well on a G1, using kernel 2.6.29

jacHEROski ROM 1.4C (kernel is CM's)

Everything loaded just fine. 

# cat /proc/ramzswap
DiskSize:          63473 kB
MemLimit:          14684 kB
NumReads:            507
NumWrites:          2577
FailedReads:           0
FailedWrites:          0
InvalidIO:             0
PagesDiscard:          0
ZeroPages:           117
GoodCompress:        100 %
NoCompress:            0 %
PagesStored:        1820
PagesUsed:           352
OrigDataSize:       7280 kB
ComprDataSize:      1394 kB
MemUsedTotal:       1408 kB
BDevNumReads:        108
BDevNumWrites:       640

I have a 64mb swap partition that I am using in conjunction.

Question: Do lzo_compress.ko and lzo_decompress.ko have to be loaded as well?

Original comment by greg.hy...@gmail.com on 16 Jul 2009 at 4:11

GoogleCodeExporter commented 9 years ago
> Question: Does the lzo_compress.ko and lzo_decompress.ko have to be loaded as well?

Yes, they must be loaded.

Original comment by nitingupta910@gmail.com on 16 Jul 2009 at 4:19

GoogleCodeExporter commented 9 years ago
The lzo_compress.ko and lzo_decompress.ko modules are already loaded in cyanogen's kernel.

Original comment by dwa...@gmail.com on 16 Jul 2009 at 4:48

GoogleCodeExporter commented 9 years ago
Fantastic, then this works wonderfully!

The music app on Hero is actually usable now, and the people app works fantastically!

Original comment by greg.hy...@gmail.com on 16 Jul 2009 at 4:51

GoogleCodeExporter commented 9 years ago
It seems to be in pretty widespread testing on the G1 now, without any reports of crashes: http://forum.xda-developers.com/showthread.php?t=537236

Very nice work!

Original comment by aagaa...@gmail.com on 16 Jul 2009 at 8:44

GoogleCodeExporter commented 9 years ago
OK... so now the status of the issue is:
 1- Cortex-A8 (Beagleboard)  -- seems to work even without the fix (see comment #13).
 2- OMAP2420 (Nokia N810)    -- crashes without the fix. No testing done with the fix.
 3- ARM11 (Android G1)       -- crashes without the fix. The fix resolved the issue.

So now testing is needed for case (2): Nokia N810
(test version uploaded in comment #33).

Original comment by nitingupta910@gmail.com on 16 Jul 2009 at 8:49

GoogleCodeExporter commented 9 years ago
Yup, I'm gonna test it when I get my VMWare running again to compile the testing
version in scratchbox.

Original comment by suomalai...@gmail.com on 16 Jul 2009 at 9:02

GoogleCodeExporter commented 9 years ago
OK, I got it compiled on the device itself, no problems whatsoever. Thanks for this, Nitin :)

Original comment by suomalai...@gmail.com on 16 Jul 2009 at 9:30

GoogleCodeExporter commented 9 years ago
Nokia-N810-43-7:~# free
              total         used         free       shared      buffers
  Mem:       126796       124004         2792            0            4
 Swap:        31692        31688            4
Total:       158488       155692         2796
Nokia-N810-43-7:~# cat /proc/ramzswap
DiskSize:          31696 kB
NumReads:           5688
NumWrites:         13113
FailedReads:           0
FailedWrites:          0
InvalidIO:             0
PagesDiscard:          0
ZeroPages:           180
GoodCompress:         52 %
NoCompress:           20 %
PagesStored:        7743
PagesUsed:          4295
OrigDataSize:      30972 kB
ComprDataSize:     15267 kB
MemUsedTotal:      17180 kB

Original comment by suomalai...@gmail.com on 16 Jul 2009 at 9:32