hyojeonglee / osfall2019

Undergraduate Operating Systems course (2019 fall)
17 stars 8 forks source link

Interrupt가 들어왔을때 크래시가 나는 문제 #37

Open djdjdjh opened 5 years ago

djdjdjh commented 5 years ago
[  110.965771] Unable to handle kernel paging request at virtual address 00003fc8
[  110.968799] Mem abort info:
[  110.969351]   Exception class = DABT (current EL), IL = 32 bits
[  110.969828]   SET = 0, FnV = 0
[  110.969988]   EA = 0, S1PTW = 0
[  110.970142] Data abort info:
[  110.970560]   ISV = 0, ISS = 0x00000006
[  110.970758]   CM = 0, WnR = 0
[  110.971357] user pgtable: 4k pages, 39-bit VAs, pgd = ffffffc06527e000
[  110.972077] [0000000000003fc8] *pgd=00000000a6761003, *pud=00000000a6761003, *pmd=0000000000000000
[  110.972846] Internal error: Oops: 96000006 [#1] PREEMPT SMP
[  110.973711] Modules linked in:
[  110.974527] CPU: 3 PID: 1034 Comm: sleep Not tainted 4.14.67-v8-qemu+ #74
[  110.974840] Hardware name: linux,dummy-virt (DT)
[  110.975749] task: ffffffc0650bd700 task.stack: ffffff800b0d8000
[  110.976622] PC is at cpuacct_charge+0x34/0xa8
[  110.977911] LR is at update_curr+0x98/0x228
[  110.978260] pc : [<ffffff80080ec54c>] lr : [<ffffff80080d5030>] pstate: a00001c5
[  110.978803] sp : ffffff800b0dba60
[  110.978985] x29: ffffff800b0dba60 x28: ffffffc06ffb0480 
[  110.979768] x27: ffffff80092aa7c0 x26: ffffffc06d57d780 
[  110.980104] x25: ffffff80092aa000 x24: 0000000000000008 
[  110.980295] x23: ffffffc06ffb0480 x22: 0000000001c8e1f0 
[  110.980460] x21: 0000000000000001 x20: 0000000001c8e1f0 
[  110.980858] x19: ffffffc0650b9d00 x18: 0000000000000000 
[  110.981157] x17: 0000000000000000 x16: ffffff8008111500 
[  110.981402] x15: 0000000000000000 x14: 00000000004bb6d4 
[  110.981611] x13: 00000000ff9cea5c x12: 0000000000000000 
[  110.981902] x11: 00000000ff9cea94 x10: 00000000004ce000 
[  110.982104] x9 : 000000000000075a x8 : 0000000000000400 
[  110.982268] x7 : 000000000000075a x6 : 0000000002739d7c 
[  110.982424] x5 : 00ffffffffffffff x4 : 0000000000000015 
[  110.983490] x3 : 0000000000000000 x2 : ffffffffff76abc0 
[  110.983734] x1 : 0000000000003ec0 x0 : 0000000000003ec0 
[  110.983940] Process sleep (pid: 1034, stack limit = 0xffffff800b0d8000)
[  110.984213] Call trace:
[  110.984388] Exception stack(0xffffff800b0db920 to 0xffffff800b0dba60)
[  110.984723] b920: 0000000000003ec0 0000000000003ec0 ffffffffff76abc0 0000000000000000
[  110.984970] b940: 0000000000000015 00ffffffffffffff 0000000002739d7c 000000000000075a
[  110.985185] b960: 0000000000000400 000000000000075a 00000000004ce000 00000000ff9cea94
[  110.985396] b980: 0000000000000000 00000000ff9cea5c 00000000004bb6d4 0000000000000000
[  110.985624] b9a0: ffffff8008111500 0000000000000000 0000000000000000 ffffffc0650b9d00
[  110.985878] b9c0: 0000000001c8e1f0 0000000000000001 0000000001c8e1f0 ffffffc06ffb0480
[  110.986092] b9e0: 0000000000000008 ffffff80092aa000 ffffffc06d57d780 ffffff80092aa7c0
[  110.986308] ba00: ffffffc06ffb0480 ffffff800b0dba60 ffffff80080d5030 ffffff800b0dba60
[  110.986528] ba20: ffffff80080ec54c 00000000a00001c5 0000000000000003 ffffffc06d553e00
[  110.986825] ba40: 0000007fffffffff ffffff80092a9b20 ffffff800b0dba60 ffffff80080ec54c
[  110.987497] [<ffffff80080ec54c>] cpuacct_charge+0x34/0xa8
[  110.987842] [<ffffff80080d5030>] update_curr+0x98/0x228
[  110.988085] [<ffffff80080d6108>] dequeue_task_fair+0x68/0x528
[  110.988251] [<ffffff80080ce440>] deactivate_task+0xa8/0xf0
[  110.988406] [<ffffff80080da9f4>] load_balance+0x454/0x960
[  110.988693] [<ffffff80080db2a4>] pick_next_task_fair+0x3a4/0x6c8
[  110.988987] [<ffffff8008b00bac>] __schedule+0x104/0x898
[  110.989226] [<ffffff8008b01374>] schedule+0x34/0x98
[  110.989459] [<ffffff8008b05084>] do_nanosleep+0x7c/0x168
[  110.989739] [<ffffff80081113f4>] hrtimer_nanosleep+0xa4/0x128
[  110.990058] [<ffffff8008111570>] compat_SyS_nanosleep+0x70/0x90
[  110.990300] Exception stack(0xffffff800b0dbec0 to 0xffffff800b0dc000)
[  110.990825] bec0: 00000000ff9cea6c 0000000000000000 00000000f7b82590 00000000ff9cea6c
[  110.991237] bee0: 0000000005f5e100 0000000000000000 00000000004b8e80 00000000000000a2
[  110.991768] bf00: 0000000000000000 0000000000000000 00000000004ce000 00000000ff9cea94
[  110.992084] bf20: 0000000000000000 00000000ff9cea5c 00000000004bb6d4 0000000000000000
[  110.992521] bf40: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  110.993118] bf60: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  110.993655] bf80: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  110.994227] bfa0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  110.995517] bfc0: 00000000f7ab4a10 0000000060040010 00000000ff9cea6c 00000000000000a2
[  110.995998] bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  110.996235] [<ffffff8008083b18>] __sys_trace_return+0x0/0x4
[  110.996618] Code: 52800035 f9401260 8b010000 b4000080 (f9408400) 
[  110.997850] ---[ end trace cdf6df9c4e5b5c95 ]---
[  110.998316] note: sleep[1034] exited with preempt_count 2

안녕하세요. 테스트 프로그램 실행 중 위와 같이 hrtimer_nanosleep, hrtimer_wakeup, wait등의 interrupt가 들어오면 위와 같은 로그가 찍히면서 크래시가 납니다. 항상 fair scheduler의 클래스 함수를 거쳐서 간 뒤에 크래시가 나는데, fair.c는 수정한적이 없습니다. 원인을 좀처럼 알 수가 없는데, 어떤 부분을 고쳐야 되는지 알 수 있을까요?

djdjdjh commented 5 years ago

Interrupt가 들어오지 않았을 때 wrr 스케쥴러의 다른 부분은 정상적으로 동작하는 것을 확인했습니다

hyojeonglee commented 5 years ago

라즈베리파이로 테스트하는 경우 비슷한 문제가 발생한 사례가 있는데 정확한 원인은 파악하지 못했지만 보드상의 문제인 것으로 예상하고 있다고 합니다.

+그래서 위와 같은 문제가 발생하신 분들은 보드로 테스트하신거라면 qemu로도 같은 문제가 발생하는지 확인해보시고 이때는 문제가 없으면 readme에 해당 상황 언급하고 제출해주셔도 됩니다.

djdjdjh commented 5 years ago

Qemu로 테스트 할 시에 발생했습니다. 라즈베리파이에서는 테스트해보지 못했습니다.

hyojeonglee commented 5 years ago

더 알아본 결과 qemu에서도 발생 가능한 상황이니 위와 같은 커널 패닉이 나는 케이스에 대해서는 제외하고 채점하겠습니다. 아래 조건에 모두 해당하는 경우입니다.