`GeneralPurposeAllocator` reports `error.OutOfMemory` despite having more than enough free memory.

IntegratedQuantum commented 7 months ago

Zig Version

0.12.0-dev.2150+63de8a598 (linux)

Steps to Reproduce and Observed Behavior

I observed that my application would sometimes crash with an OutOfMemory error. This didn't make any sense to me: my system was reporting only around 40% memory usage overall, and the error would happen during a phase where a lot of memory was being freed. There were also no large allocations happening, and the allocation where it crashed was a mere 24 bytes.

I traced the error back to the mmap call in the page allocator. According to the Linux documentation, one possible cause of ENOMEM is this:

       ENOMEM The process's maximum number of mappings would have been
              exceeded.  This error can also occur for munmap(), when
              unmapping a region in the middle of an existing mapping,
              since this results in two smaller mappings on either side
              of the region being unmapped.

This gave me enough of a clue to write a simple reproduction:

const std = @import("std");

var global_gpa = std.heap.GeneralPurposeAllocator(.{.thread_safe=true}){};
const allocator = global_gpa.allocator();
var allocations: [200000][]u8 = undefined;

pub fn main() !void {
    for(0..allocations.len) |i| {
        allocations[i] = try allocator.alloc(u8, 8192);
    }
    std.log.err("Allocations done", .{});
    for(0..allocations.len) |i| { // Freeing every second allocation, to maximize the number of individual mappings
        if(i % 2 == 0) {
            allocator.free(allocations[i]);
        }
    }
    _ = try allocator.alloc(u8, 1); // Allocating anything causes OutOfMemory
}

Output:

$ zig run test.zig
info: Allocations done
thread 39168 panic: reached unreachable code
Unwind error at address `exe:0x1061799` (error.OutOfMemory), trace may be incomplete

Unable to dump stack trace: OutOfMemory
Aborted (core dumped)
$ zig run test.zig -OReleaseFast
error: Allocations done
error: OutOfMemory

Expected Behavior

I expect a general purpose allocator to be able to fully use the memory the system can provide (minus internal fragmentation, of course). The c_allocator doesn't have this problem, I think because it doesn't unmap pages as aggressively as the GPA.

klkblake commented 5 months ago

This replicates the issue directly on std.heap.page_allocator:

const std = @import("std");
const page_allocator = std.heap.page_allocator;

var pages: [128 * 1024]*[4096]u8 = undefined;

pub fn main() !void {
    for (&pages) |*page| {
        page.* = try page_allocator.create([4096]u8);
    }

    for (pages, 0..) |page, i| {
        if (i & 1 == 0) {
            continue;
        }
        page_allocator.destroy(page);
    }
}

(Exact number of pages needed may need to be tweaked based on your system configuration)
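To pick a page count for a given machine, a rough sketch like the following may help. It is Linux-only and assumes the relevant limit is the vm.max_map_count sysctl exposed under /proc/sys; the helper name `readMaxMapCount` is made up for illustration, and the code is written against 0.12-era std, so file APIs may differ in other Zig versions.

```zig
const std = @import("std");

/// Hypothetical helper: reads the per-process mapping limit from procfs.
/// Linux-only; fails with an error on systems without this file.
fn readMaxMapCount() !usize {
    const file = try std.fs.openFileAbsolute("/proc/sys/vm/max_map_count", .{});
    defer file.close();

    // The file contains a single decimal number followed by a newline.
    var buf: [32]u8 = undefined;
    const n = try file.readAll(&buf);
    const text = std.mem.trim(u8, buf[0..n], " \t\n");
    return std.fmt.parseInt(usize, text, 10);
}

pub fn main() !void {
    const limit = try readMaxMapCount();
    std.debug.print("vm.max_map_count = {d}\n", .{limit});

    // Rough estimate only: freeing every other page of one merged mapping
    // adds about one mapping per free, so about 2 * limit pages are needed.
    // This ignores the mappings the process already has at startup.
    std.debug.print("pages needed for the repro: roughly {d}\n", .{2 * limit});
}
```

If the reported limit is higher than on the original system, the size of the `pages` array in the reproduction would need to grow accordingly.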

The problem is with Zig's posix implementation of munmap():

/// Deletes the mappings for the specified address range, causing
/// further references to addresses within the range to generate invalid memory references.
/// Note that while POSIX allows unmapping a region in the middle of an existing mapping,
/// Zig's munmap function does not, for two reasons:
/// * It violates the Zig principle that resource deallocation must succeed.
/// * The Windows function, VirtualFree, has this restriction.
pub fn munmap(memory: []align(mem.page_size) const u8) void {
    switch (errno(system.munmap(memory.ptr, memory.len))) {
        .SUCCESS => return,
        .INVAL => unreachable, // Invalid parameters.
        .NOMEM => unreachable, // Attempted to unmap a region in the middle of an existing mapping.
        else => unreachable,
    }
}

The doc comment clearly assumes a model where each call to mmap() creates a new mapping, so that as long as you call munmap() with the same bounds as each mmap(), it cannot fail. Unfortunately, this is incorrect, at least on Linux: when allocating anonymous memory with mmap(), the kernel tries to allocate a region of address space adjacent to an existing mapping, and will opportunistically merge with that mapping wherever possible. The result is that most calls to mmap() only extend an existing mapping rather than creating a new one, and thus most paired calls to munmap() are in fact unmapping part of a mapping, which can split it in two and fail with ENOMEM once the mapping limit is reached.

Unfortunately, this means that memory deallocation on Linux can fail in general, and userspace needs to work around it. I believe the usual approach is to keep track of regions that you've failed to unmap so you can coalesce them with new unmap requests until you either reach a big enough region that it can be unmapped without splitting a mapping, or else unrelated unmap requests get the system away from the vm.max_map_count limit. The problem, of course, is that coalescing is only robust as a solution if regions are being coalesced in a single place for the whole process, and not in separate places for Zig, for libc's allocator, and for whatever other libraries are in play that might be directly performing munmap() calls (and probably just permanently leaking their regions if an error occurs).
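To make the coalescing idea concrete, here is a minimal sketch of the bookkeeping only. It is not how Zig or any libc does it; `Region`, `DeferredUnmaps`, the fixed capacity of 64, and the `add`/`remove` methods are all hypothetical, and the actual munmap retry is left to the caller.

```zig
const std = @import("std");

/// A half-open address range [start, end) whose unmap failed.
const Region = struct { start: usize, end: usize };

/// Hypothetical fixed-capacity set of deferred unmaps. Invariant: no two
/// stored regions overlap or touch, so each entry is maximal.
const DeferredUnmaps = struct {
    regions: [64]Region = undefined,
    len: usize = 0,

    /// Records a failed unmap, absorbing every stored region that overlaps
    /// or is adjacent to it. Returns the merged region so the caller can
    /// retry the unmap on the larger range.
    fn add(self: *DeferredUnmaps, new: Region) error{Full}!Region {
        var merged = new;
        var out: usize = 0; // write index while compacting the array
        var i: usize = 0;
        while (i < self.len) : (i += 1) {
            const r = self.regions[i];
            if (r.end < merged.start or r.start > merged.end) {
                self.regions[out] = r; // unrelated region, keep it
                out += 1;
            } else {
                merged.start = @min(merged.start, r.start);
                merged.end = @max(merged.end, r.end);
            }
        }
        self.len = out;
        if (self.len == self.regions.len) return error.Full;
        self.regions[self.len] = merged;
        self.len += 1;
        return merged;
    }

    /// Forgets a region once the caller has successfully unmapped it.
    fn remove(self: *DeferredUnmaps, region: Region) void {
        var i: usize = 0;
        while (i < self.len) : (i += 1) {
            const r = self.regions[i];
            if (r.start == region.start and r.end == region.end) {
                self.len -= 1;
                self.regions[i] = self.regions[self.len];
                return;
            }
        }
    }
};

pub fn main() !void {
    var deferred = DeferredUnmaps{};
    _ = try deferred.add(.{ .start = 0x1000, .end = 0x2000 });
    _ = try deferred.add(.{ .start = 0x3000, .end = 0x4000 });
    // The page in between bridges the two earlier failures into one range.
    const merged = try deferred.add(.{ .start = 0x2000, .end = 0x3000 });
    std.debug.assert(merged.start == 0x1000 and merged.end == 0x4000);
    std.debug.assert(deferred.len == 1);
    deferred.remove(merged); // pretend the retried unmap succeeded
    std.debug.assert(deferred.len == 0);
}
```

A single pass is enough because stored regions never touch each other, so anything that touches the merged range must touch the newly added region directly. As noted above, this only helps if every munmap() caller in the process goes through the same tracker.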

andrewrk commented 1 month ago

@klkblake thank you for this breakdown and analysis. This is maddening, and I will need to go through the 5 stages of grief before suggesting a course of action.
