dimakuv opened this issue 9 months ago
There are two memory allocators in Gramine: MEMMGR and SLAB.
Both allocators rely on the following shared logic:
- `system_malloc()` and `system_free()` are macros that are defined inside LibOS and in PAL.
- `SYSTEM_LOCK()` and `SYSTEM_UNLOCK()` are macros that are defined inside LibOS and in PAL.
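For illustration, the including component supplies these hooks by defining the macros before including the allocator header. A hedged sketch (the lock variable name `g_allocator_lock` is illustrative, not Gramine's actual one):

```c
/* Hedged sketch of the shared hooks expected by memmgr.h/slabmgr.h; the
 * right-hand sides map onto the including component's own primitives. */
static struct libos_lock g_allocator_lock;  /* illustrative name */

#define system_malloc(size)     __system_malloc(size)      /* page-sized backend alloc */
#define system_free(addr, size) __system_free(addr, size)  /* page-sized backend free  */
#define SYSTEM_LOCK()           lock(&g_allocator_lock)    /* component-global lock    */
#define SYSTEM_UNLOCK()         unlock(&g_allocator_lock)
```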
MEMMGR is used to allocate specific objects in specific subsystems. It is currently used only in LibOS.
Each subsystem of LibOS that uses MEMMGR specifies its own (global to the subsystem) lock. Thus, MEMMGR object allocs/frees in the same subsystem are synchronized on this lock, but object allocs/frees in different subsystems can run in parallel.
The current users of MEMMGR:
- `struct libos_vma`, protected by `struct libos_lock vma_mgr_lock`
- `struct libos_mount`, protected by `struct libos_lock g_mount_mgr_lock`
- `struct libos_dentry`, protected by `struct libos_lock dcache_mgr_lock`
- `struct libos_handle`, protected by `struct libos_lock handle_mgr_lock`
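To make the pattern concrete, here is a hedged sketch of one MEMMGR user (patterned on the VMA subsystem; `alloc_vma()`, the init details, and the `size` value are illustrative, while `get_mem_obj_from_mgr_enlarge()` is the actual function discussed below):

```c
/* Hedged sketch of a MEMMGR user; initialization of the manager and the
 * lock is omitted, and names/sizes are assumptions, not verified code. */
#define OBJ_TYPE struct libos_vma  /* the object type this manager serves */
#include "memmgr.h"                /* SYSTEM_LOCK() maps to vma_mgr_lock here */

static MEM_MGR vma_mgr;

static struct libos_vma* alloc_vma(void) {
    /* returns a free slot; `size` asks to enlarge the backing pool if no
     * free slot is available (see the naming complaint below) */
    return get_mem_obj_from_mgr_enlarge(vma_mgr, /*size=*/4096);
}
```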
Every managed object is wrapped into a `MEM_OBJ_TYPE` struct. This struct doesn't have additional fields (if not built with ASan), so it's the most compact representation possible. When the object is "freed" and its underlying memory is moved to the free list, `MEM_OBJ_TYPE`'s list field is used instead.
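Conceptually, the wrapping looks like the following simplified, self-contained sketch (not the exact definition in `common/include/memmgr.h`):

```c
/* Simplified sketch: a slot's memory is either the live object or a
 * free-list link, never both at once, so the wrapper adds no overhead. */
typedef struct { long fields[4]; } OBJ_TYPE;  /* stand-in for e.g. struct libos_vma */

typedef union mem_obj {
    OBJ_TYPE obj;              /* valid while the slot is in use */
    union mem_obj* next_free;  /* valid while the slot sits on the free list */
} MEM_OBJ_TYPE;
```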
Design and implementation are very simple:
- The MEMMGR memory managers are never "reset", shrunk, or deleted. Thus, if LibOS allocated a lot of MEMMGR objects initially and then freed them all, this MEMMGR memory is leaked. This should be a very rare and unimportant case, though.
- Backend-memory (page-sized) allocation happens via `__system_malloc()`, declared here: https://github.com/gramineproject/gramine/blob/master/libos/src/libos_malloc.c
- Backend-memory (page-sized) deallocation, as mentioned above, doesn't really happen. But if it did, it would go via `__system_free()`, also declared here: https://github.com/gramineproject/gramine/blob/master/libos/src/libos_malloc.c
- Objects in migrated memory (in the child) leak because their "slots" are never re-used: https://github.com/gramineproject/gramine/blob/e740728548ef52615cffdb64f573a998abdfa61f/common/include/memmgr.h#L248-L251
- Names are bad, in particular `get_mem_obj_from_mgr_enlarge(MEM_MGR mgr, size_t size)`: the `size` here actually means "by how many bytes to increase the pool of available memory if there is no free memory in areas and no free slots in the free list".
- Something is fishy with the `size` arguments in functions. These args are in bytes (at least that's what the callers assume), but the function implementations seem to treat the argument as a count. TODO: verify and fix this; we may have a memory leak here. An illustration of the suspected mismatch follows below.
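A hypothetical illustration of how such a bytes-vs-count confusion would play out (none of this is Gramine's code):

```c
/* Hypothetical illustration of a bytes-vs-count confusion. */
#include <stdio.h>

#define OBJ_SIZE 64  /* illustrative per-object slot size */

/* what the callers assume: `size` is in bytes, convert to slots */
static size_t slots_assuming_bytes(size_t size) { return size / OBJ_SIZE; }

/* what the implementation may actually do: use `size` as a slot count */
static size_t slots_assuming_count(size_t size) { return size; }

int main(void) {
    size_t size = 8192;  /* caller passes bytes */
    printf("caller expects %zu slots, implementation creates %zu slots\n",
           slots_assuming_bytes(size), slots_assuming_count(size));
    /* a pool 64x larger than intended is never shrunk by MEMMGR, so the
     * excess would effectively be leaked */
    return 0;
}
```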
SLAB is the generic backend for `malloc` and `free` in all other subsystems. It is used both in LibOS and in PAL.
When any (random-size) object needs to be allocated/freed in LibOS or in PAL, the traditional `malloc()` and `free()` calls are used. They are wrappers around `slab_alloc()` and `slab_free()`.
Backend-memory (page-sized) allocation and deallocation is implemented via:
- `__system_malloc()` and `__system_free()` in https://github.com/gramineproject/gramine/blob/master/libos/src/libos_malloc.c
- `system_mem_alloc()` and `system_mem_free()` in https://github.com/gramineproject/gramine/blob/master/pal/src/slab.c

There is a single global slab manager and corresponding lock for LibOS, and similarly a single global slab manager and corresponding lock for PAL.
NOTE: We have a `struct libos_lock` in LibOS. This lock is implemented via `PalEventWait()` and `PalEventSet()`, which do have a fast path, but the slow path results in `ocall_futex()`, which is super-expensive in SGX. Maybe we could replace it with a spinlock? Technically, most of the time we'll hit the slab allocator's cache, which is a fast operation, so there seems to be no real need for `ocall_futex()` in this case.
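For reference, the suggestion above amounts to something like this minimal test-and-test-and-set spinlock (an illustrative sketch, independent of Gramine's actual lock implementations):

```c
/* Minimal test-and-test-and-set spinlock sketch (illustrative only). */
#include <stdatomic.h>

typedef struct {
    atomic_int locked;  /* 0 = free, 1 = held */
} sketch_spinlock_t;

#define SKETCH_SPINLOCK_INIT { 0 }

static inline void sketch_spin_lock(sketch_spinlock_t* l) {
    for (;;) {
        /* attempt to take the lock; acquire pairs with the release below */
        if (!atomic_exchange_explicit(&l->locked, 1, memory_order_acquire))
            return;
        /* spin on plain loads so we don't bounce the cache line with writes */
        while (atomic_load_explicit(&l->locked, memory_order_relaxed))
            ;  /* a PAUSE/yield hint would go here on x86 */
    }
}

static inline void sketch_spin_unlock(sketch_spinlock_t* l) {
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```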
The design and implementation are based on the MEMMGR allocator for the common case, with a trivial fallback for the large-object case:
Deallocation happens similarly to the allocation description above: the deallocation path consults the object's `level` value. If `level == -1`, it means that the object's size is greater than the max allowed in SLAB, so it was allocated via backend-memory allocation, and thus it must be deallocated via backend-memory free: https://github.com/gramineproject/gramine/blob/f35d8e034c1d9bf7b6e2da9125b64a25b622b1d9/common/include/slabmgr.h#L402-L412
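To make the `level` mechanism concrete, here is a small self-contained sketch (all names and sizes hypothetical; the real logic lives in `common/include/slabmgr.h`):

```c
/* Self-contained sketch of a per-level slab with a large-object fallback. */
#include <stdlib.h>

#define SLAB_LEVELS 6
static const size_t g_level_sizes[SLAB_LEVELS] = {16, 32, 64, 128, 256, 512};

typedef struct {
    int level;     /* slab level of this object, or -1 for a large object */
    char pad[12];  /* keep the payload 16-byte aligned (sketch simplification) */
} obj_header_t;

static void* sketch_malloc(size_t size) {
    int level = -1;
    for (int i = 0; i < SLAB_LEVELS; i++)
        if (size <= g_level_sizes[i]) { level = i; break; }

    /* the real allocator pops a slot from the per-level free list; this
     * sketch forwards everything to the system allocator instead */
    size_t payload = (level == -1) ? size : g_level_sizes[level];
    obj_header_t* hdr = malloc(sizeof(*hdr) + payload);
    if (!hdr)
        return NULL;
    hdr->level = level;
    return hdr + 1;
}

static void sketch_free(void* obj) {
    if (!obj)
        return;
    obj_header_t* hdr = (obj_header_t*)obj - 1;
    if (hdr->level == -1) {
        free(hdr);  /* large object: goes straight back to the backend */
        return;
    }
    free(hdr);  /* real code pushes the slot back onto its level's free list */
}
```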
There is no `realloc()` implementation. If SLAB had `realloc()`, many places in LibOS and PAL could benefit from this function; currently they implement such realloc in ad-hoc ways, combining `malloc`, `memcpy` and `free` (a sketch of this pattern follows below).

`SLAB_CANARY` is defined for LibOS, but not for PAL.
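For reference, the ad-hoc pattern mentioned above typically looks like this (an illustrative sketch; it assumes the caller tracks the old size, since SLAB does not):

```c
/* Illustrative ad-hoc "realloc" built from malloc + memcpy + free. */
#include <stdlib.h>
#include <string.h>

static void* adhoc_realloc(void* old, size_t old_size, size_t new_size) {
    void* new_buf = malloc(new_size);
    if (!new_buf)
        return NULL;  /* old buffer stays valid on failure */
    memcpy(new_buf, old, old_size < new_size ? old_size : new_size);
    free(old);
    return new_buf;
}
```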
Description of the feature
Memory allocators in Gramine were written ~15 years ago. It's time to re-evaluate them and propose alternatives.
I want to start a discussion on this topic. In particular, the next posts will contain information on:
- the MEMMGR allocator, used to allocate specific objects in specific subsystems (currently only in LibOS);
- the SLAB allocator, the generic backend for `malloc` and `free` in all other subsystems (both in LibOS and in PAL).

Why Gramine should implement it?
We see more and more performance bottlenecks in memory/object allocation inside Gramine itself. They all stem from the fact that our memory allocators are old, ad-hoc, not scalable, and not CPU- and cache-friendly; also our allocators do not perform object caching.
Just a couple of recent examples:
For possible alternatives (in Rust), see this comment: https://github.com/gramineproject/gramine/pull/1723#pullrequestreview-1864216238