AndreiLux / Perseus-UNIVERSAL5410

This is a clean kernel glued together upon Samsung's last public branch of android-exynos-3.4 with sources provided by the OSRC releases. The OSRC release has been stripped clean of all non-i9500 related code and sourced, as much as possible through the original patches.
https://github.com/AndreiLux/Perseus-UNIVERSAL5410/wiki
Other
30 stars 19 forks source link

Probably fix for spontaneous reboots. #3

Closed sorgelig closed 11 years ago

sorgelig commented 11 years ago

I've got 2 reboots while my phone was sleeping. After looking into /proc/last_kmsg i've found that both times it's been caused by icmp_echo due to unaligned access:

[ 9478.913222] L0  10376 Unhandled fault: alignment exception (0x001) at 0xdfb480ba
[ 9478.913670] L0  10376 Internal error: : 1 [#1] PREEMPT SMP ARM
[ 9478.915284] L0  10376 PC is at icmp_echo+0x30/0x68
[ 9478.915525] L0  10376 LR is at 0x4
[ 9478.915892] L0  10376 pc : [<c058d348>]    lr : [<00000004>]    psr: 60000153
[ 9478.915993] L0  10376 sp : e081b978  ip : e081b988  fp : e081b9ec
[ 9478.916552] L0  10376 r10: c0abb218  r9 : e5d8c000  r8 : e081a000
[ 9478.916810] L0  10376 r7 : c0abb218  r6 : da5be600  r5 : c105ebc0  r4 : da5bed80
[ 9478.917225] L0  10376 r3 : 00000000  r2 : da5bed80  r1 : dfb480ba  r0 : da5bed80
[ 9478.917643] L0  10376 Flags: nZCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
[ 9478.917928] L0  10376 Control: 10c5387d  Table: 5bcf806a  DAC: 00000015 

this particular instruction: .text:C058D348 LDMIA R1, {R0,R1}

Of course it cannot read from such address as 0xdfb480ba. I'm not sure if it's usual way to have unaligned data in this structure or not, but i have this issue.

Comparing to Samsung's source, I've found that unaligned access trap is not enabled in Perseus kernel.

Bellow is the patch to fix this issue. I'm not 100% it will be fixed, but i didn't encounter spontaneous reboots since then. I've saw some users on XDA reported about spontaneous reboots while phone is in sleep state. So, most likely they have the same issue as me.

From cd4a31bf2022b1d15a1ba03307b9936b567d2670 Mon Sep 17 00:00:00 2001
From: sorg <pour.garbage@gmail.com>
Date: Tue, 11 Jun 2013 17:11:22 +0800
Subject: [PATCH] Test-Fix: spontaneous reboots

---
 arch/arm/Kconfig |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 284fc40..fcd7302 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1889,7 +1889,7 @@ config LEDS_CPU
 config ALIGNMENT_TRAP
    bool
    depends on CPU_CP15_MMU
-   default n if ARCH_EBSA110
+   default y if !ARCH_EBSA110
    select HAVE_PROC_CPU if PROC_FS
    help
      ARM processors cannot fetch/store information which is not
-- 
1.7.9.5
AndreiLux commented 11 years ago

I disabled the alignment trap on purpose, the architecture should have no issue with non-aligned access.

I'll investigate this further, might be due to how the toolchain generated the code.

sorgelig commented 11 years ago

Probably, in THUMB mode there won't be a problem since it's aligned at 2. But this part of code is in ARM mode. If you will find a better solution for unaligned access, please post it here.

AndreiLux commented 11 years ago

We should try to move to newer GCC versions. I had tried 4.7 at the beginning of the kernel, but it gave me page faults on the exFat modules which I didn't figure out. 4.8 would be even better.

I read on some unaligned issues which have been fixed in 4.6.3, so I guess I'll test that next.

sorgelig commented 11 years ago

I've tried 4.6.3 and 4.7.(forgot) compilers and checked code of icmp_echo function. I didn't find anything different there. Anyway, if i understood right, unaligned access trap just intercept exception, read the data (let's say byte by byte) and return it to original code. If it really so, then why don't let this trap to live? It won't affect performance as long as data is aligned. And it's better than reboots ;)

AndreiLux commented 11 years ago

Yes I'm reverting the commit for the time being.