jovanbulck / sgx-step

A practical attack framework for precise enclave execution control
GNU General Public License v3.0

sgx_destroy_enclave blocks app in kernel #90

Open jovanbulck opened 1 month ago

jovanbulck commented 1 month ago

Example dmesg:

[  484.355618] INFO: task app:8986 blocked for more than 120 seconds.
[  484.355643]       Tainted: G           OE     5.15.0-124-generic #134-Ubuntu
[  484.355665] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  484.355688] task:app             state:D stack:    0 pid: 8986 ppid:  8985 flags:0x00004002
[  484.355691] Call Trace:
[  484.355692]  <TASK>
[  484.355694]  __schedule+0x24e/0x590
[  484.355698]  schedule+0x69/0x110
[  484.355699]  schedule_timeout+0x105/0x140
[  484.355701]  ? __queue_delayed_work+0x5c/0xa0
[  484.355703]  ? queue_delayed_work_on+0x3d/0x60
[  484.355705]  __wait_for_common+0xab/0x150
[  484.355706]  ? usleep_range_state+0x90/0x90
[  484.355708]  wait_for_completion+0x24/0x30
[  484.355709]  __synchronize_srcu.part.0+0x7f/0xf0
[  484.355712]  ? __bpf_trace_rcu_stall_warning+0x10/0x10
[  484.355714]  synchronize_srcu+0xfb/0x120
[  484.355716]  mmu_notifier_unregister+0xbc/0xf0
[  484.355719]  sgx_release+0x94/0x140
[  484.355722]  __fput+0x9c/0x280
[  484.355723]  ____fput+0xe/0x20
[  484.355725]  task_work_run+0x6a/0xb0
[  484.355726]  exit_to_user_mode_loop+0x157/0x160
[  484.355729]  exit_to_user_mode_prepare+0xa0/0xb0
[  484.355731]  syscall_exit_to_user_mode+0x27/0x50
[  484.355733]  ? x64_sys_call+0x1e07/0x1fa0
[  484.355736]  do_syscall_64+0x63/0xb0
[  484.355738]  ? exit_to_user_mode_prepare+0x37/0xb0
[  484.355740]  ? syscall_exit_to_user_mode+0x2c/0x50
[  484.355741]  ? x64_sys_call+0x1de6/0x1fa0
[  484.355743]  ? do_syscall_64+0x63/0xb0
[  484.355744]  ? __x64_sys_openat+0x55/0x90
[  484.355746]  ? exit_to_user_mode_prepare+0x37/0xb0
[  484.355748]  ? syscall_exit_to_user_mode+0x2c/0x50
[  484.355750]  ? x64_sys_call+0x1a55/0x1fa0
[  484.355752]  ? do_syscall_64+0x63/0xb0
[  484.355753]  ? x64_sys_call+0x1e3e/0x1fa0
[  484.355755]  ? do_syscall_64+0x63/0xb0
[  484.355755]  ? clear_bhb_loop+0x45/0xa0
[  484.355758]  ? clear_bhb_loop+0x45/0xa0
[  484.355760]  ? clear_bhb_loop+0x45/0xa0
[  484.355762]  ? clear_bhb_loop+0x45/0xa0
[  484.355764]  ? clear_bhb_loop+0x45/0xa0
[  484.355766]  entry_SYSCALL_64_after_hwframe+0x6c/0xd6
[  484.355768] RIP: 0033:0x7fa070ccba7b
[  484.355770] RSP: 002b:00007ffd1025fe18 EFLAGS: 00000206 ORIG_RAX: 000000000000000b
[  484.355772] RAX: 0000000000000000 RBX: 00005580d4c8e8c0 RCX: 00007fa070ccba7b
[  484.355773] RDX: 0000000000000000 RSI: 0000000000200000 RDI: 00007fa070400000
[  484.355774] RBP: 00007ffd1025fe94 R08: 00005580d4c8e8c0 R09: 0000000000000000
[  484.355775] R10: 0000000000000000 R11: 0000000000000206 R12: 00007fa070bac1c0
[  484.355775] R13: 00007fa070bac188 R14: 00007fa070bac188 R15: 00007fa070bac200
[  484.355777]  </TASK>

The app process is stuck in uninterruptible sleep (D state), and a reboot is the only remedy.
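To confirm that a process is in this state without waiting for the 120-second hung-task warning, one option is to read the `State:` line of `/proc/<pid>/status`. The sketch below is only an illustration in Python; the helper names `parse_task_state` and `is_uninterruptible` are hypothetical and not part of sgx-step.

```python
def parse_task_state(status_text: str) -> str:
    """Return the one-letter state code from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("State:"):
            # The line looks like "State:\tD (disk sleep)"; keep only the letter.
            return line.split(":", 1)[1].split()[0]
    raise ValueError("no State: line found")


def is_uninterruptible(pid: int) -> bool:
    """True if the task is currently in D state (uninterruptible sleep)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_task_state(f.read()) == "D"
```

For the task in the log above, `is_uninterruptible(8986)` would be the call to make. A task in D state does not act on signals, including SIGKILL, until the wait it is blocked on completes, which is consistent with a reboot being the only way out here.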

After some digging, it seems this is caused by an explicit call to sgx_destroy_enclave before process exit.

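From the call trace above, the task appears to be stuck in sgx_release, which calls mmu_notifier_unregister and then waits in synchronize_srcu (via wait_for_completion) for a completion that never arrives.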
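When reading a trace like this, it helps to drop the entries prefixed with `?`: the kernel's stack dumper uses that marker for addresses it found on the stack but could not confirm as part of the actual call chain. A minimal sketch in Python, assuming the dmesg line format shown above; `reliable_frames` is a hypothetical helper, not an existing tool.

```python
import re

# Matches a frame line such as "[  484.355716]  mmu_notifier_unregister+0xbc/0xf0",
# with an optional "? " marker in front of the symbol name.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+\s*$"
)


def reliable_frames(dmesg_text: str) -> list:
    """Return symbol names of frames that are not marked with '?'."""
    frames = []
    for line in dmesg_text.splitlines():
        match = FRAME_RE.match(line)
        if match and not match.group(1):
            frames.append(match.group(2))
    return frames
```

Applied to the log in this issue, the innermost frames that remain run from `__schedule` up through `wait_for_completion`, `synchronize_srcu`, `mmu_notifier_unregister`, and `sgx_release`.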

Possible hypothesis:

heavyimage commented 1 month ago

The problem occurs on a Comet Lake i9-10900K.

$ uname -srvp
Linux 5.15.0-124-generic #134-Ubuntu SMP Fri Sep 27 20:20:17 UTC 2024 x86_64

jovanbulck commented 1 month ago

Fwiw: I think what may go wrong here is that RCU ends up in schedule_timeout, so the kernel arms the APIC timer through the TSC-deadline mechanism, but sgx-step still has the APIC timer in one-shot mode. In that case the timer never fires and the RCU wait blocks indefinitely.
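For context on this hypothesis: the local APIC timer mode is encoded in bits 18:17 of the LVT Timer register (00 one-shot, 01 periodic, 10 TSC-deadline), and writes to the IA32_TSC_DEADLINE MSR are ignored unless the timer is in TSC-deadline mode. The sketch below only decodes a raw LVT value to show which mode it selects; the function names are hypothetical, and actually reading the register requires kernel privileges, which is not shown.

```python
# Bits 18:17 of the LVT Timer register select the timer mode.
LVT_TIMER_MODE_SHIFT = 17
LVT_TIMER_MODE_MASK = 0b11

_MODES = {0b00: "one-shot", 0b01: "periodic", 0b10: "tsc-deadline"}


def lvt_timer_mode(lvt_value: int) -> str:
    """Decode the timer mode from a raw LVT Timer register value."""
    bits = (lvt_value >> LVT_TIMER_MODE_SHIFT) & LVT_TIMER_MODE_MASK
    return _MODES.get(bits, "reserved")


def deadline_write_takes_effect(lvt_value: int) -> bool:
    """A TSC-deadline write only arms the timer in TSC-deadline mode."""
    return lvt_timer_mode(lvt_value) == "tsc-deadline"
```

If the hypothesis holds, an LVT value left in one-shot mode (bits 18:17 clear) while the kernel's clock event code expects TSC-deadline mode would mean every deadline the kernel programs is silently dropped on that core.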