otterkit / otterkit-cobol

A free and open source Standard COBOL compiler for 64-bit environments
https://otterkit.com
Apache License 2.0

[Update]: Merge our Virtual Heap Memory Allocator (vhmalloc) implementation into main (plus refactoring). #34

Closed. KTSnowy closed this 1 year ago

KTSnowy commented 1 year ago

Some additional changes ended up in the allocator branch and need to be merged as well, so this PR includes a few more files. There are also some big improvements to CASPAL that are needed to support cross-platform virtual memory allocation.

The code below is the CASPAL abstraction for OS-dependent virtual memory allocation, which the allocator design needs in order to work. I didn't want to pollute the allocator's source code with a preprocessor spaghetti of #if defined checks to conditionally compile the appropriate function for each OS, so I abstracted those away into something that works mostly the same on all three (Windows, Linux, and macOS).

#if defined PlatformWindows
    #include <memoryapi.h>

    #define SYS_READWRITE PAGE_READWRITE
    #define SYS_PROTECTED PAGE_NOACCESS

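    // On Windows, attempting to reserve an address that's already reserved will fail.
    // This is contrary to mmap's behavior with MAP_FIXED, which will just overwrite the existing mapping.
    // Parenthesized so the flags stay grouped when the macro is expanded inside a larger expression.
    #define SYS_ALLOCATE (MEM_COMMIT | MEM_RESERVE)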
    #define SYS_RESERVE MEM_RESERVE

    // Wish we had these on Unix, but we don't. Would make the intent clearer.
    #define SYS_COMMIT MEM_COMMIT
    #define SYS_DECOMMIT MEM_DECOMMIT

    // Only really needed on Windows, but we define it anyway for consistency.
    #define SYS_RELEASE MEM_RELEASE

    // Requests more virtual memory from the operating system (Windows Edition).
    #define SystemAlloc(addr, size, prot, flags) VirtualAlloc(addr, size, flags, prot)

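    // This also releases the address space, so it shouldn't be used to decommit virtual memory.
    // Use both SystemCommit and SystemDecommit for that instead.
    // VirtualFree requires a size of 0 with MEM_RELEASE, so the size argument is ignored here,
    // and addr must be the base address returned when the region was reserved.
    #define SystemDealloc(addr, size) VirtualFree(addr, 0, SYS_RELEASE)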

    // Must be used with an address within a reserved address space (returned by SystemAlloc).
    #define SystemCommit(addr, size) VirtualAlloc(addr, size, SYS_COMMIT, SYS_READWRITE)

    // On Windows, we decommit (only release physical memory) by calling VirtualFree with MEM_DECOMMIT.
    // (according to the documentation, this is the correct way to do it)
    #define SystemDecommit(addr, size) VirtualFree(addr, size, SYS_DECOMMIT)

#elif defined PlatformLinux || defined PlatformDarwin
    #include <sys/mman.h>

    #define SYS_READWRITE (PROT_READ | PROT_WRITE)
    #define SYS_PROTECTED PROT_NONE

    // These 2 have duplicate flags, but it's easier to maintain this way.
    // This makes the intent of the code using them clearer, and more portable.
    #define SYS_ALLOCATE (MAP_PRIVATE | MAP_ANONYMOUS)
    #define SYS_RESERVE (MAP_PRIVATE | MAP_ANONYMOUS)

    // These 2 also have duplicate flags, same reason as above.
    #define SYS_COMMIT (MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS)
    #define SYS_DECOMMIT (MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS)

    // Not needed on Linux and macOS, but we define it anyway for consistency.
    #define SYS_RELEASE 0

    // Requests more virtual memory from the operating system (Unix Edition).
    #define SystemAlloc(addr, size, prot, flags) mmap(addr, size, prot, flags, -1, 0)

    // This also releases the address space, so it shouldn't be used to decommit virtual memory.
    // Use both SystemCommit and SystemDecommit for that instead.
    #define SystemDealloc(addr, size) munmap(addr, size)

    // Must be used with an address within a reserved address space (returned by SystemAlloc).
    #define SystemCommit(addr, size) mmap(addr, size, SYS_READWRITE, SYS_COMMIT, -1, 0)

    // On Linux and macOS, we decommit (only release physical memory) by calling mmap with PROT_NONE.
    // This will overwrite the existing mapping, and the pages will be physically released.
    #define SystemDecommit(addr, size) mmap(addr, size, SYS_PROTECTED, SYS_DECOMMIT, -1, 0)
#endif
KTSnowy commented 1 year ago

@GitMensch, if you have time I'd like to request your review of the abstracted API for the OS-specific virtual memory functions. Once we finish the allocator, this PR will close our manual memory allocation issue (#28), and we'll use it for every dynamic memory allocation in the Otterkit runtime.

GitMensch commented 1 year ago

I'll definitely have a look, but this won't happen anytime soon; expect it "in weeks" rather than days.

KTSnowy commented 1 year ago

Hey everyone, GREAT NEWS! In a very basic benchmark, our allocator appears to be 4.5 times faster than the system malloc on Windows, measured over a malloc and free pair (both functions). I ran the benchmark a couple of times and the result seems consistent.

[Image: AllocBench]

This is a very basic benchmark, but it does show promising results. Each iteration calls malloc, attempts to write to the start and end of the block, and frees it right after. The calls are opaque to the C# compiler, so it won't optimize them out.
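
For readers who want to try a comparable measurement, here is a rough C sketch of the loop described above: allocate, write the first and last byte, free, repeat. It is not the benchmark from this thread, which was run from C#. The name `bench_pair` and the function-pointer types are hypothetical, and `clock()` measures CPU time with coarse resolution, so treat the numbers as indicative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

// Function-pointer types so the same loop can time any malloc/free pair.
typedef void *(*alloc_fn)(size_t);
typedef void (*free_fn)(void *);

// Returns the CPU seconds spent on `iterations` rounds of alloc/write/free,
// or -1.0 if the block size is zero or an allocation fails.
static double bench_pair(alloc_fn alloc, free_fn release,
                         size_t block_size, size_t iterations)
{
    assert(alloc != NULL && release != NULL);
    if (block_size == 0) return -1.0;

    clock_t start = clock();
    for (size_t i = 0; i < iterations; i++) {
        // volatile forces the two writes to happen, so the block must really exist.
        volatile unsigned char *block = alloc(block_size);
        if (block == NULL) return -1.0;
        block[0] = 1;
        block[block_size - 1] = 1;
        release((void *)block);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Running `bench_pair(malloc, free, 64, 1000000)` and then the same call with another allocator's pair gives two timings whose ratio is the speedup. A single block size and an immediate free is the friendliest possible pattern for most allocators, so a mixed-size or delayed-free workload may show a different picture.
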

[Image: Capture]

The NativeMemory.Alloc method calls the system malloc directly (from what I've seen in their source code).

@GitMensch It looks like the likely macro ended up helping a bit so I'll keep using it, at least in the allocator code.
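
For context, a `likely` macro is usually a thin wrapper over the GCC/Clang builtin `__builtin_expect`. The sketch below shows that common pattern on a hypothetical bump-pointer fast path. The macro definition used in Otterkit is not shown in this thread and may differ, and `bump_arena` and `bump_alloc` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

// Branch hints on GCC and Clang; plain pass-through elsewhere.
#if defined(__GNUC__) || defined(__clang__)
    #define likely(x)   __builtin_expect(!!(x), 1)
    #define unlikely(x) __builtin_expect(!!(x), 0)
#else
    #define likely(x)   (x)
    #define unlikely(x) (x)
#endif

// Hypothetical arena: hands out bytes between cursor and limit.
typedef struct {
    unsigned char *cursor;
    unsigned char *limit;
} bump_arena;

// Returns a block of `size` bytes, or NULL when the arena is exhausted.
static void *bump_alloc(bump_arena *arena, size_t size)
{
    assert(arena != NULL);

    // The common case is that the request fits, so hint that branch as likely.
    if (likely(size <= (size_t)(arena->limit - arena->cursor))) {
        void *block = arena->cursor;
        arena->cursor += size;
        return block;
    }

    // A real allocator would take a slower path here (for example, mapping more memory).
    return NULL;
}
```

The hint only affects how the compiler lays out the branches; it never changes the result, so a wrong hint costs speed rather than correctness.
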