runtime: reserve address space for large slice backing stores to make slice growth cheaper

snadrus commented 1 year ago

When append() sees the underlying array is used to capacity, it allocates a bigger array, copies values, zeroes the rest, and returns.

This proposal will make this grow() logic cheaper:

If the OS offered an extend() then it would be possible to avoid the copy. However this is unlikely because the following virtual address space is often claimed already.

Q: In supported 64-bit OSes, what if all unbounded requests asked the OS for 1TB of reserved memory? A: Then most all grow() calls could just claim and zero the next chunk of reserved memory. That's because extra reserved memory is just an address until it page-faults into existence.

In practical terms, I see 2 ways to go: 1) Expressing 1TB of capacity to the user would require either zeroing on exposure. 2) Expressing a much smaller capacity to the user would require the allocator to have a separate capacity concept that it would consult to determine if a new allocation is necessary (typically not).

There's likely a refinement sweet-spot around the variables of:

at what growth size should we allocate a giant allocation?
how big is giant allocation (1TB in the example above)?
should we have a smaller "giant allocation" for initial slices, like initially 1MB then 1TB?
Does this upset the Linux OOM killer?

mateusz834 commented 1 year ago

I don't think that this is safe to do, consider:

a := make([]byte, 16, 16)
b := append(a, 1)
if &a[0] == &b[0] {
    panic("sth weird")
}

This is not going to fail with the current go version.

snadrus commented 1 year ago

The difference @mateusz834 found is not guaranteed The spec says:

If the capacity of s is not large enough to fit the additional values, append allocates a new, sufficiently large underlying array that fits both the existing slice elements and the additional values. Otherwise, append re-uses the underlying array.

There's no guarantee of an address change. An advanced version of the compiler could do the same.

mknyszek commented 1 year ago

Slice backing arrays are just allocated from the usual heap (except where they're of course, stack allocated). Something like this would be complex to implement since it would basically have to be a separate mechanism entirely. All the usual GC paths would also need to know about this special address space mapping for each slice backing store.

I think the worst part here might be the overhead of mapping new memory for a new slice backing store to begin with; it's a fairly expensive operation, and small slices are common.

Unless, you mean to apply this optimization to only large slices, which is more feasible. Today, it wouldn't be too hard to implement an "try extend" operation in the page allocator, though I suspect it wouldn't succeed often; within a GC cycle, the runtime is generally good at densely packing memory at the granularity of pages, since most page allocations consist of a single page.

Your memory reservation idea for large slice backing stores only (say, >2 MiB or something) I think has merit, though I suspect at that point a much more effective workaround is to just overallocate the capacity of the slice in user code to begin with. The runtime is generally okay about not zeroing newly-mapped memory, so there should be no zeroing overhead. (And the fact that the capacity is oversized should mean that copies are avoided as well.)

golang / go

runtime: reserve address space for large slice backing stores to make slice growth cheaper #57337