gramineproject / gramine-tdx

A library OS for Linux multi-process applications, with Intel TDX support (experimental)
GNU Lesser General Public License v3.0

[PAL/vm-common] Jump to ring-3 at the start of the executable #23

Open dimakuv opened 8 months ago

dimakuv commented 8 months ago

Previous Gramine (gramine-direct and gramine-sgx) always executes in ring-3, so it has no hook for adding a ring-0 -> ring-3 transition before jumping from the LibOS init phase to the application executable.

These are the particular places where this jump happens:


Ideally, we want to introduce some hook / callback to PAL so that it can do its own "jump to userspace executable" logic. Or we can add a macro that will be something like:

#ifdef VM/TDX
    ... prepare GPRs ...
    sysretq
#else
    ... current logic ...
    jmp rdi
#endif
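
The hook variant could look roughly like the C sketch below. All names (`pal_jump_fn`, `pal_register_jump_hook`, `pal_vm_init`, `libos_start_app`) are hypothetical and not part of Gramine. The real transition would be assembly (`jmp` or `sysretq`); here it is replaced by stubs that only report which path would be taken.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: which kind of transition was chosen for entering the app. */
enum jump_kind { JUMP_DIRECT, JUMP_SYSRET };

/* Hypothetical callback type: receives the entry point and the initial stack. */
typedef enum jump_kind (*pal_jump_fn)(uint64_t entry, uint64_t stack);

/* NULL means "no PAL override, use the default path". */
static pal_jump_fn g_jump_hook = NULL;

/* Default path (gramine-direct, gramine-sgx): already in ring-3. */
static enum jump_kind default_jump(uint64_t entry, uint64_t stack) {
    (void)entry;
    (void)stack;
    return JUMP_DIRECT; /* stands in for `jmp rdi` */
}

/* VM/TDX path: would prepare GPRs and then leave ring-0. */
static enum jump_kind vm_jump(uint64_t entry, uint64_t stack) {
    (void)entry;
    (void)stack;
    return JUMP_SYSRET; /* stands in for `sysretq` */
}

/* A PAL calls this during its init to override the default jump. */
void pal_register_jump_hook(pal_jump_fn fn) {
    assert(fn != NULL);
    g_jump_hook = fn;
}

/* Hypothetical init routine of a VM-based PAL. */
void pal_vm_init(void) {
    pal_register_jump_hook(vm_jump);
}

/* LibOS side: hand the final jump over to the PAL if it registered a hook. */
enum jump_kind libos_start_app(uint64_t entry, uint64_t stack) {
    pal_jump_fn fn = g_jump_hook ? g_jump_hook : default_jump;
    return fn(entry, stack);
}
```

With a callback, the LibOS needs no `#ifdef` and stays unaware of privilege rings; the cost is one more entry in the LibOS/PAL interface.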

How does it work now? The executable (which is typically ld.so) starts in ring-0 and only switches to ring-3 once its first syscall invocation has finished (in the case of ld.so, the first syscall is `brk()`). That's because our VM/TDX wrapper around syscalls looks like this: https://github.com/gramineproject/gramine-tdx/blob/f4405d38d1a3b5e45146e25d07f589ab31d4e006/pal/src/host/vm-common/kernel_events.S#L145-L148
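
To make the current behavior concrete, here is a toy model in C. It is not Gramine code: `vcpu_model`, `start_executable` and `syscall_roundtrip` are hypothetical names, and the ring is tracked as a plain integer rather than read from the CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of the privilege level the application code runs at. */
struct vcpu_model {
    int ring;               /* 0 or 3 */
    unsigned syscalls_done; /* number of completed syscalls */
};

/* Today the LibOS jumps straight to the executable, so it begins in ring-0. */
void start_executable(struct vcpu_model* v) {
    assert(v != NULL);
    v->ring = 0;
    v->syscalls_done = 0;
}

/* One syscall through the VM/TDX wrapper: the handler runs in ring-0 and
 * the wrapper's return path drops to ring-3. */
void syscall_roundtrip(struct vcpu_model* v) {
    assert(v != NULL);
    v->ring = 0;
    v->syscalls_done++;
    v->ring = 3;
}
```

In this model, everything the executable does before its first syscall runs with `ring == 0`, which is exactly the window this issue wants to close.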

mkow commented 8 months ago

Why can't we start its execution in ring3? ld.so does many things we rather shouldn't be doing in vm-ring0 if it's possible to do them in vm-ring3.

dimakuv commented 8 months ago

> Why can't we start its execution in ring3?

Because our LibOS code doesn't currently have a place where it could give control to PAL to do additional things (like switching from ring-0 to ring-3).

More specifically, I think the LibOS should leave the exact way of jumping into the application to the PAL (i.e., libos_elf_entry.nasm should be PAL-specific code).

But I didn't want to modify the LibOS component at all, because this would be a rather intrusive change. So I left fixing this problem for later, when the code is open-sourced and everyone agrees on the general direction and design of TDX.

dimakuv commented 2 months ago

There is also a problem around LibOS (1) delivering signals to the application and (2) performing rt_sigreturn() to the previously saved app context.

The problem stems from the fact that LibOS simply assumes it always executes in ring-3. So LibOS saves the app context as-is and then restores it as-is, immediately jumping to it. LibOS is not aware of the ring-0/ring-3 wrapper that we introduce in VM-based PALs.

For a hacky partial solution, see https://github.com/gramineproject/gramine-tdx/pull/36.

But ideally we still need some reasonable way to convey to the LibOS that it can't just jmp *app_context_rip. Instead, LibOS needs to invoke PAL's wrappers to exit from the syscall into the app context.
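
As a rough illustration of what such a wrapper would have to do, the C sketch below turns a saved app context into the register image needed for a `sysretq`-based exit. The types and the function `pal_vm_prepare_exit` are hypothetical, not existing Gramine APIs. The only architectural fact relied upon is that `sysretq` takes the return RIP from RCX and RFLAGS from R11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the app context that the LibOS saves and restores. */
struct app_context {
    uint64_t rip;
    uint64_t rsp;
    uint64_t rflags;
};

/* Hypothetical register image the PAL loads right before leaving ring-0. */
struct exit_frame {
    uint64_t rcx; /* consumed by sysretq as the new RIP */
    uint64_t r11; /* consumed by sysretq as the new RFLAGS */
    uint64_t rsp; /* must be switched to the app stack by the wrapper */
    int target_ring;
};

/* VM-based PAL: instead of `jmp *app_context_rip`, build a ring-3 exit. */
struct exit_frame pal_vm_prepare_exit(const struct app_context* ctx) {
    assert(ctx != NULL);
    struct exit_frame frame = {
        .rcx = ctx->rip,
        .r11 = ctx->rflags,
        .rsp = ctx->rsp,
        .target_ring = 3,
    };
    return frame;
}
```

Because `sysretq` consumes RCX and R11, a context that must get its own RCX and R11 values back (as after `rt_sigreturn()`) cannot be fully restored this way. A real wrapper would likely need a different exit instruction such as `iretq` for that case, which this sketch does not cover.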