gramineproject / gramine

A library OS for Linux multi-process applications, with Intel SGX support
GNU Lesser General Public License v3.0

[LibOS] When memory pages or access permissions in VMA bookkeeping are not changed, get rid of unnecessary calls to PAL memory ABI or EDMM operations #1708

Open llly opened 10 months ago

llly commented 10 months ago

Description of the feature

Gramine's LibOS VMA bookkeeping currently records, for each VMA, the begin address, end address, prot, flags, and the backing file with its offset.

However, when a VMA is created or changed successfully, only the address is returned to the caller. The corresponding PAL memory function is then always called, even if the host pages already have the expected state. In the SGX PAL this causes extra OCALLs, and when EDMM is enabled, extra EDMM operations are performed to set EPCM permissions.

My proposal is to add a flag that indicates which kind of page flag or permission change took place (change page flags, change file mapping, initialize permissions, restrict permissions, extend permissions, maybe more). The VMA bookkeeping functions would return this flag, and LibOS would pass it to the PAL memory functions. When all page flags and permissions are unchanged, LibOS doesn't call the PAL memory function. When page flags and/or permissions are changed, LibOS calls the PAL memory function with this flag. The PAL memory function can then call mmap/mprotect/EMODPE/EMODPR accordingly.
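A minimal sketch of what such a change flag could look like, to make the proposal concrete. Every name here (`vma_change`, `vma_state`, `vma_compute_change`, the `SK_PROT_*` constants) is hypothetical and does not exist in Gramine; the real bookkeeping structures are different and richer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical protection bits for this sketch only. */
#define SK_PROT_READ  0x1
#define SK_PROT_WRITE 0x2
#define SK_PROT_EXEC  0x4

/* Hypothetical change flags that bookkeeping could return to its caller. */
enum vma_change {
    VMA_CHANGE_NONE          = 0,
    VMA_CHANGE_FLAGS         = 1 << 0, /* mapping flags differ */
    VMA_CHANGE_FILE          = 1 << 1, /* backing file or offset differs */
    VMA_CHANGE_PROT_INIT     = 1 << 2, /* range was not mapped before */
    VMA_CHANGE_PROT_EXTEND   = 1 << 3, /* at least one permission bit added */
    VMA_CHANGE_PROT_RESTRICT = 1 << 4, /* at least one permission bit removed */
};

/* Hypothetical, simplified view of what bookkeeping knows about a range. */
struct vma_state {
    bool mapped;
    int prot;
    int flags;
    int file_id;     /* -1 for anonymous memory */
    uint64_t offset;
};

/* Compare the recorded state with the requested state and report what changed. */
static unsigned vma_compute_change(const struct vma_state* old_state,
                                   const struct vma_state* want) {
    assert(want->mapped);
    if (!old_state->mapped)
        return VMA_CHANGE_PROT_INIT;

    unsigned change = VMA_CHANGE_NONE;
    if (old_state->flags != want->flags)
        change |= VMA_CHANGE_FLAGS;
    if (old_state->file_id != want->file_id || old_state->offset != want->offset)
        change |= VMA_CHANGE_FILE;
    if (want->prot & ~old_state->prot)
        change |= VMA_CHANGE_PROT_EXTEND;
    if (old_state->prot & ~want->prot)
        change |= VMA_CHANGE_PROT_RESTRICT;
    return change;
}
```

With a result of `VMA_CHANGE_NONE`, LibOS could skip the PAL call entirely. Otherwise a PAL could, for example, react to the extend bit with EMODPE and to the restrict bit with EMODPR. That mapping is one possible design, not something the issue fixes.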

This requires that the LibOS VMA bookkeeping records exactly the same memory pages and access permissions as the host PAL. We need to identify and fix all violations. For example, https://github.com/gramineproject/gramine/blob/1f72aaf8148a7712c5c8bad1da9839e2bcc5c633/libos/src/bookkeep/libos_thread.c#L100 changes permissions only in the PAL, not in the VMA bookkeeping.

Why Gramine should implement it?

Improved performance. We've seen a workload that calls mmap(RW) to allocate some pages and then calls mprotect(RW) on each page. In addition, with SGX EDMM, the PTE permissions and the EPCM permissions can be kept in sync.
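The access pattern described above can be reproduced with a small stand-alone function. This is only a sketch of the pattern using standard POSIX calls, not the actual workload; the function name `run_redundant_mprotect_workload` is made up for illustration.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict C standards */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map `page_count` pages as RW, then call mprotect(RW) on each page.
 * Every mprotect() here requests the permissions the page already has.
 * Returns 0 on success, -1 on failure. */
static int run_redundant_mprotect_workload(size_t page_count) {
    assert(page_count > 0);

    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0)
        return -1;

    size_t length = page_count * (size_t)page_size;
    char* base = mmap(NULL, length, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    int ret = 0;
    for (size_t i = 0; i < page_count; i++) {
        /* Redundant call: the page is already readable and writable. */
        if (mprotect(base + i * (size_t)page_size, (size_t)page_size,
                     PROT_READ | PROT_WRITE) != 0) {
            ret = -1;
            break;
        }
    }

    if (munmap(base, length) != 0)
        ret = -1;
    return ret;
}
```

Each of the per-page `mprotect()` calls currently reaches the PAL, even though none of them changes anything.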

dimakuv commented 10 months ago

@llly Good observation and good idea, thanks.

@kailun-qin implemented a new LibOS upcall get_vma_info(), which PALs can call to learn current VMA information (page protections, and I think currently that's all). See https://github.com/gramineproject/gramine/pull/1513 and the discussion here https://github.com/gramineproject/gramine/discussions/1706

I think this new upcall will be enough to avoid OCALLs/EDMM calls. PALs will call this upcall, learn the current VMA info, compare with the requested info, and if the same, just exit the func.

@kailun-qin @mkow What do you think? Sounds like we have another usage candidate for get_vma_info() from #1513, so I am now highly in favor of implementing this upcall.

@llly Would this upcall be enough for your purposes?

kailun-qin commented 10 months ago

> Sounds like we have another usage candidate for get_vma_info() from https://github.com/gramineproject/gramine/pull/1513, so I am now highly in favor of implementing this upcall.

I recall that we might also have another candidate: https://github.com/gramineproject/gramine/discussions/1527#discussioncomment-6873358.

llly commented 10 months ago

@dimakuv It's different. get_vma_info() is an upcall from PAL to LibOS for PalExceptionHandler, and it handles only one page at a time. I propose a call from LibOS to PAL, so LibOS can handle one whole VMA at a time.

@kailun-qin It's similar, but that one focuses on EDMM. I'd like the LibOS VMA bookkeeping to handle the address, flags and prot of LibOS memory. Currently it only handles the address and passes flags and prot through to PAL.

dimakuv commented 10 months ago

@llly Hm, I may be missing something. Now that I look at your example:

> We've seen a workload that calls mmap(RW) to allocate some pages then calls mprotect(RW) on each page.

Why can't the LibOS check for itself during mprotect() that this mprotect is basically a no-op? LibOS itself knows everything there is to know about the memory ranges, so why add any code to PAL?
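As a rough illustration of such a LibOS-side check, here is a sketch that assumes the bookkeeping can be viewed as a sorted array of non-overlapping ranges. The `sketch_vma` struct and `mprotect_is_noop()` are hypothetical and are not Gramine's actual data structures or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified VMA record: [begin, end) with its protections. */
struct sketch_vma {
    uintptr_t begin;
    uintptr_t end; /* exclusive */
    int prot;
};

/* Returns true if every byte of [start, end) is covered by VMAs that already
 * have exactly `prot`. `vmas` must be sorted by address and non-overlapping.
 * A gap returns false so that the regular path can report the error. */
static bool mprotect_is_noop(const struct sketch_vma* vmas, size_t count,
                             uintptr_t start, uintptr_t end, int prot) {
    assert(start < end);

    uintptr_t cursor = start;
    for (size_t i = 0; i < count && cursor < end; i++) {
        if (vmas[i].end <= cursor)
            continue;         /* VMA lies entirely before the range */
        if (vmas[i].begin > cursor)
            return false;     /* unmapped gap inside the range */
        if (vmas[i].prot != prot)
            return false;     /* permissions would actually change */
        cursor = vmas[i].end; /* this VMA is fine, move past it */
    }
    return cursor >= end;
}
```

If the function returns true, the mprotect() emulation could return success without calling into PAL at all. This is only safe if the bookkeeping matches the host state exactly.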

llly commented 10 months ago

Yes, that's what I mean. I don't need an upcall. LibOS can handle prot in its VMA bookkeeping.

dimakuv commented 10 months ago

Thank you @llly, then I misunderstood your initial request. Yes, now it totally makes sense to me.