littlekernel / lk

LK embedded kernel
MIT License
3.15k stars 621 forks source link

[linker][x86_64] page fault crash happens when calling lk_init_level() in lk_main() #69

Closed bingzhux closed 8 years ago

bingzhux commented 8 years ago

Travis, I found a new issue was introduced from this patch recently: https://github.com/travisg/lk/commit/114a350e55da9b9f4301a362afef517baf70ea4d

I encountered a serious crash issue (x86_64) during startup (lk_main()), when calling lk_primary_cpu_init_level (). Here is the crash log from lk_init_level()

lk_init_level:61: looking at 0x226e10 (version) level 0x3ffff, flags 0x1, seen_last 0 lk_init_level:61: looking at 0x226e28 (S粈p") level 0x0, flags 0x0, seen_last 0 lk_init_level:61: looking at 0x226e40 ---- page fault happens here ----

The above log printed from the below "for-loop" in lk_init_level() function, you can see that every iteration, the ptr++ will increase address by 0x18. for (const struct lk_init_struct *ptr = __lk_init; ptr != __lk_init_end; ptr++)

However, my investigation shows that the structure "lk_init_struct" is aligned with 0x10-byte. see the output file lk.elf.sym.sorted. so the above iteration, the address should increase by 0x20. 0000000000226e10 g .lk_init 0000000000000000 __lk_init 0000000000226e10 g O .lk_init 0000000000000018 _init_struct_version 0000000000226e30 g O .lk_init 0000000000000018 _init_struct_vm 0000000000226e50 g O .lk_init 0000000000000018 _init_struct_vm_preheap 0000000000226e68 g .lk_init 0000000000000000 __lk_init_end

You can see that the size of the structure "_initstruct*" is 0x18, this is correct, but, their address is aligned with 0x10 (incremental is 0x20 for each structure), as below: 0000000000226e10 0000000000226e30 0000000000226e50

Notes:

  1. x64 32bit is OK, but I think it is just a coincidence. Not sure if ARM has the similar issue.
  2. this issue also happens in other sections like ".apps", or ".drivers" 0000000000226ca0 g .apps 0000000000000000 __apps_start 0000000000226ca0 g O .apps 0000000000000028 _app_shell 0000000000226ce0 g O .apps 0000000000000028 _app_stringtests 0000000000226d20 g O .apps 0000000000000028 _app_tests 0000000000226d48 g .apps 0000000000000000 __apps_end
travisg commented 8 years ago

Ah hah. Now I understand why that code was in there before. I couldn't figure out what the fix was for when it was added a while back on x86-64. For whatever reason it only affects x86-64, arm64 doesn't have this problem.

I'll take a look at it tomorrow. I really dont want the special case ARCH_X86_64 thing polluting the code, so have to find a better solution for it.

bingzhux commented 8 years ago

Travis, Can you please help to check the output file "lk/build-pc-x86-64-test/lk.elf.sym.sorted" in your build machine to see if this alignment issue exists or not?

I found there is a link script to make init structure with 8-byte alignment, but have no idea why it does work. https://github.com/travisg/lk/blob/master/top/init.ld https://github.com/travisg/lk/blob/master/app/app.ld https://github.com/travisg/lk/blob/master/dev/drivers.ld https://github.com/travisg/lk/blob/master/dev/devices.ld

Besides, it seems that we need to figure out a solution to get it fixed by covering both 32bit and 64bit (for x86, arm or other ARCHs).

Currently, my solution is just to make them aligned with 4-bytes. see my patch here, please let me know if it is ok or not. https://github.com/bzhu5/lk/commit/012231bdfb82158320ea92c28278bbd1a51a6447

travisg commented 8 years ago

I think this is fixed now. Going to close now, re-open if it's still broken on your end.

bingzhux commented 8 years ago

yes, I cannot reproduce this issue any more.