gramineproject / gramine

A library OS for Linux multi-process applications, with Intel SGX support
GNU Lesser General Public License v3.0

[PAL/Linux-SGX] Use `XSAVEC` or `XSAVEOPT` instead of `XSAVE` instruction #1939

Open dimakuv opened 4 months ago

dimakuv commented 4 months ago

Description of the feature

Gramine-SGX currently uses the XSAVE instruction on EEXIT and XRSTOR on EENTER.

There is anecdotal evidence that the optimized instructions XSAVEC and XSAVEOPT can benefit workloads that perform many EEXIT/EENTER transitions (e.g., workloads that perform many OCALLs).

Interestingly, Intel SGX PSW already uses XSAVEC instead of XSAVE, assuming that XSAVEC is always available on SGX-supporting processors: https://github.com/intel/linux-sgx/blob/c1ceb4fe146e0feb1097dee81c7e89925443e43c/psw/urts/se_detect.cpp#L80-L81
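Rather than assuming availability, support can also be checked at runtime. A minimal sketch, assuming a GCC/Clang toolchain with `<cpuid.h>`: CPUID leaf 0DH, sub-leaf 1 reports XSAVEOPT in EAX bit 0 and XSAVEC in EAX bit 1. The enum and helper names below are hypothetical and are not Gramine or PSW identifiers.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h> /* GCC/Clang helper header providing __get_cpuid_count() */
#endif

/* Feature bits reported in EAX by CPUID.(EAX=0DH, ECX=1). */
#define XSAVE_FEAT_XSAVEOPT (1u << 0)
#define XSAVE_FEAT_XSAVEC   (1u << 1)

/* Hypothetical selector, for illustration only. */
enum xsave_variant { USE_XSAVE, USE_XSAVEC };

/* Pure helper: prefer XSAVEC when the feature bit is set, else fall back. */
static inline enum xsave_variant pick_xsave_variant(uint32_t eax) {
    return (eax & XSAVE_FEAT_XSAVEC) ? USE_XSAVEC : USE_XSAVE;
}

/* Pure helper: is XSAVEOPT enumerated in the given feature bits? */
static inline bool has_xsaveopt(uint32_t eax) {
    return (eax & XSAVE_FEAT_XSAVEOPT) != 0;
}

/* Reads the feature bits from the running CPU.
 * Returns 0 if the leaf is not available or the target is not x86. */
static inline uint32_t read_xsave_feature_bits(void) {
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(0xD, 1, &eax, &ebx, &ecx, &edx))
        return 0;
    return eax;
#else
    return 0;
#endif
}
```

A caller would do something like `pick_xsave_variant(read_xsave_feature_bits())` once at startup and cache the result. Keeping the decoding separate from the CPUID read makes the selection logic testable without depending on the host CPU.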

Two more notes:

  1. IIUC, XSAVEC is a drop-in replacement for XSAVE and thus can be immediately used.
  2. IIUC, XSAVEOPT needs a separate XSAVE area that is not modified/overwritten by other code. This is required because XSAVEOPT checks whether the XSAVE state was modified, by consulting the XSAVE area (restored previously with XRSTOR). This ultimately requires a separate per-thread XSAVE stack.

Why Gramine should implement it?

XSAVE is used on every EEXIT. Each OCALL leads to an EEXIT, as well as each return from an interrupt handler. So the overhead of saving this state can be significant.

XSAVEC and XSAVEOPT can reduce this overhead, leading to overall performance improvement in OCALL-heavy workloads, e.g. Redis.

dimakuv commented 4 months ago

Comments from an expert:

  1. XSAVEC is only a drop-in replacement for XSAVE if XSAVEC and XRSTOR are the only instructions that read/write to that XSAVE area. There are some scenarios where software (e.g., an exception handler) may need to examine and possibly modify the XSAVE area. In these scenarios, XSAVEC (and also XSAVEOPT) cannot be used as a drop-in replacement for XSAVE because the format/layout of the XSAVE area will be different from XSAVE.
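To make the layout difference concrete, here is a rough sketch of what code that inspects an XSAVE area would have to account for. It relies on the architectural header layout (XSTATE_BV at byte 512, XCOMP_BV at byte 520, with XCOMP_BV bit 63 marking the compacted format). The function names are hypothetical, and the `sizes`/`align64` tables are assumed to be filled by the caller from CPUID.(EAX=0DH, ECX=i) (EAX for the size, ECX bit 1 for 64-byte alignment).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define XSAVE_LEGACY_SIZE  512u          /* legacy x87/SSE region */
#define XSAVE_HEADER_SIZE  64u           /* XSAVE header follows the legacy region */
#define XSAVE_XCOMP_BV_OFF 520u          /* XCOMP_BV: bytes 8..15 of the header */
#define XCOMP_BV_COMPACTED (1ull << 63)  /* set by XSAVEC, clear after XSAVE/XSAVEOPT */

/* Unaligned-safe 64-bit load (x86 is little-endian). */
static inline uint64_t load_u64(const uint8_t* p) {
    uint64_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* True if the area was written in the compacted format. */
static inline bool xsave_area_is_compacted(const uint8_t* area) {
    return (load_u64(area + XSAVE_XCOMP_BV_OFF) & XCOMP_BV_COMPACTED) != 0;
}

/* Offset of extended component `idx` (2..62) inside a compacted area.
 * Components are packed back to back in bit order, starting right after
 * the header. Returns 0 if the component is not present in xcomp_bv. */
static inline uint32_t compacted_offset(uint64_t xcomp_bv, unsigned idx,
                                        const uint32_t* sizes,
                                        const bool* align64) {
    if (idx < 2 || idx > 62 || !(xcomp_bv & (1ull << idx)))
        return 0;
    uint32_t off = XSAVE_LEGACY_SIZE + XSAVE_HEADER_SIZE;
    for (unsigned i = 2; i <= idx; i++) {
        if (!(xcomp_bv & (1ull << i)))
            continue;                    /* absent components take no space */
        if (align64[i])
            off = (off + 63u) & ~63u;    /* round up to a 64-byte boundary */
        if (i == idx)
            return off;
        off += sizes[i];
    }
    return 0; /* not reached: idx is present in xcomp_bv */
}
```

For example, with AVX (component 2, 256 bytes) and the AVX-512 opmask state (component 5, 64 bytes) enabled and MPX absent, the opmask registers land at offset 832 in a compacted area, whereas the standard format places them at the fixed CPUID-enumerated offset (commonly reported as 1088 on Intel parts). A handler that hardcodes standard-format offsets would therefore read the wrong bytes from an area written by XSAVEC.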

  2. This statement is not quite accurate: “XSAVEOPT checks whether the XSAVE state was modified, by consulting the XSAVE area (restored previously with XRSTOR).” When XRSTOR is executed, it clears the processor’s “modified” tracker bits. As soon as some extended state is updated (e.g., an AVX register), the associated “modified” bit is set. Then when the next XSAVEOPT is executed to write to the same XSAVE area used by the most recent XRSTOR, XSAVEOPT consults the tracker bits to determine what state has been modified, and thus what state must be updated in the XSAVE area. This behavior is described in Section 13.9 of the SDM, volume 1.
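The tracking described in point 2 can be illustrated with a toy software model. This is only a sketch of the idea, not the hardware algorithm: it ignores the init optimization and the other conditions the SDM lists, uses three made-up 64-bit "components", and all names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NCOMP 3 /* toy model: three state components */

struct toy_area {
    uint64_t comp[NCOMP];               /* in-memory save area */
};

struct toy_cpu {
    uint64_t regs[NCOMP];               /* live extended state */
    uint32_t modified;                  /* one tracker bit per component */
    const struct toy_area* last_xrstor; /* area used by the most recent XRSTOR */
};

/* XRSTOR: load state, clear the tracker bits, remember the area. */
static inline void toy_xrstor(struct toy_cpu* cpu, const struct toy_area* area) {
    for (unsigned i = 0; i < NCOMP; i++)
        cpu->regs[i] = area->comp[i];
    cpu->modified = 0;
    cpu->last_xrstor = area;
}

/* Any write to extended state sets the matching tracker bit. */
static inline void toy_write_reg(struct toy_cpu* cpu, unsigned i, uint64_t v) {
    cpu->regs[i] = v;
    cpu->modified |= 1u << i;
}

/* XSAVEOPT: when saving to the area of the last XRSTOR, skip components
 * whose tracker bit is clear. Returns how many components were written. */
static inline unsigned toy_xsaveopt(const struct toy_cpu* cpu, struct toy_area* area) {
    bool same_area = (area == cpu->last_xrstor);
    unsigned written = 0;
    for (unsigned i = 0; i < NCOMP; i++) {
        if (same_area && !(cpu->modified & (1u << i)))
            continue;                   /* memory is assumed to be up to date */
        area->comp[i] = cpu->regs[i];
        written++;
    }
    return written;
}
```

In this model, restoring from area A, touching one component, and saving back to A writes only that component, while saving to a different area B writes everything. It also shows why the area must not be overwritten by other code in between: if something clobbers an unmodified component in A after the XRSTOR, the next save to A skips it and the stale bytes stay in memory.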