gramineproject / graphene

Graphene / Graphene-SGX - a library OS for Linux multi-process applications, with Intel SGX support
https://grapheneproject.io
GNU Lesser General Public License v3.0
771 stars 260 forks source link

[PAL/Linux-SGX] Exception handling wrongly distinguishes ocall interruption #2603

Closed boryspoplawski closed 3 years ago

boryspoplawski commented 3 years ago

Description of the problem

Current master (3182ca948c6712631a3ac9bf8983b8103351c9a9) does not recognize part of case A (you can check the comments in enclave_entry.S for naming conventions), as it only checks if rip is in range [Locall_about_to_eexit_begin, Locall_about_to_eexit_end), but code in that range calls __restore_xregs https://github.com/oscarlab/graphene/blob/master/Pal/src/host/Linux-SGX/enclave_entry.S#L626 and this function does not lie in that range.

Steps to reproduce

Was able to hit this on #2602 in LibOS regression test poll_closed_fd. It manifests in ocall_read error (-EPERM). Curious reader can check why.

dimakuv commented 3 years ago

Is the correct fix to this: rework enclave_entry.S? Or an additional check [begin-of-restore-xregs, end-of-restore-xregs)?

boryspoplawski commented 3 years ago

What do you mean? Both would fix the issue. I'll submit a fix soon (which might not survive the rework if it's ever finished completely (instead of partial). I just wanted to make sure it fixes the problem and it does: I was able to run 15k runs without any issues (took me ~9h).

dimakuv commented 3 years ago

Given that you will submit a fix soon, my question is irrelevant :) (I was asking basically: will you "rework everything" and subsequently fix this bug, or will you make a more pointed fix.)

mkow commented 3 years ago

an additional check [begin-of-restore-xregs, end-of-restore-xregs)?

Would this really work? This function has more than one callsite.

boryspoplawski commented 3 years ago

Don't worry, I have a working fix already (but want to merge other stuff first).