Enabling StaticCompiler.jl-based compilation of (some) Julia code to standalone native binaries by avoiding GC allocations and llvmcall-ing all the things!
This might cause some testing problems on v1.9+ without https://github.com/MasonProtter/Bumper.jl/pull/31 being merged, because of a circular dependency with StaticCompiler.jl, but other than that I think this implementation should be more or less working.
MallocSlabBuffer's implementation is kinda scary looking, but it's actually not that complicated. Here's a rundown of how it works:
It stores a set of memory "slabs" of size slab_size (default 1 megabyte).
The current field is the currently active pointer that a newly @alloc'd object will acquire, if the object fits between current and slab_end.
If the object does not fit between current and slab_end, but is smaller than slab_size, we'll malloc a new slab, and add it to slabs (reallocating the slabs pointer if there's not enough room, as determined by max_slabs_length) and then set that thing as the current pointer, and provide that to the object.
If the object is bigger than slab_size, then we malloc a pointer of the requested size, and add it to the custom_slabs pointer (also reallocating that pointer if necessary), leaving current and slab_end unchanged.
When a @no_escape block ends, we reset current and slab_end to their old values, and if slabs or custom_slabs have grown, we free all the pointers that weren't present before and reset their respective lengths (but not max_sizes).
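To make the rundown above concrete, here's a toy model of the same bookkeeping in plain Julia. It is not the code in this PR: the names ToySlabBuffer, toy_alloc!, toy_checkpoint and toy_reset! are made up for illustration, it uses a mutable struct and Vectors (which the real implementation avoids), and it ignores alignment entirely.

```julia
# Toy model only. All names are hypothetical, not the MallocSlabBuffer API.
mutable struct ToySlabBuffer
    current::Ptr{Cvoid}               # next free byte in the active slab
    slab_end::Ptr{Cvoid}              # one past the last byte of the active slab
    slab_size::Int                    # size of each regular slab, in bytes
    slabs::Vector{Ptr{Cvoid}}         # regular slabs
    custom_slabs::Vector{Ptr{Cvoid}}  # one-off allocations bigger than slab_size
end

function ToySlabBuffer(slab_size::Int = 1_048_576)
    first_slab = Libc.malloc(slab_size)
    return ToySlabBuffer(first_slab, first_slab + slab_size, slab_size,
                         [first_slab], Ptr{Cvoid}[])
end

function toy_alloc!(buf::ToySlabBuffer, nbytes::Int)
    if nbytes > buf.slab_size
        # Oversized request: separate malloc, current and slab_end stay unchanged.
        p = Libc.malloc(nbytes)
        push!(buf.custom_slabs, p)
        return p
    end
    if buf.current + nbytes > buf.slab_end
        # Does not fit in the active slab: malloc a new slab and switch to it.
        new_slab = Libc.malloc(buf.slab_size)
        push!(buf.slabs, new_slab)
        buf.current = new_slab
        buf.slab_end = new_slab + buf.slab_size
    end
    p = buf.current
    buf.current += nbytes             # bump the pointer past the new object
    return p
end

# What a scope would need to remember on entry.
toy_checkpoint(buf::ToySlabBuffer) =
    (buf.current, buf.slab_end, length(buf.slabs), length(buf.custom_slabs))

function toy_reset!(buf::ToySlabBuffer, checkpoint)
    current, slab_end, nslabs, ncustom = checkpoint
    # Free everything that was added after the checkpoint, then shrink the lists.
    foreach(Libc.free, buf.slabs[nslabs+1:end])
    foreach(Libc.free, buf.custom_slabs[ncustom+1:end])
    resize!(buf.slabs, nslabs)
    resize!(buf.custom_slabs, ncustom)
    buf.current = current
    buf.slab_end = slab_end
    return buf
end

buf = ToySlabBuffer(64)        # tiny slabs so the example overflows quickly
cp = toy_checkpoint(buf)
a = toy_alloc!(buf, 48)        # fits in the first slab
b = toy_alloc!(buf, 48)        # does not fit: a second slab gets malloc'd
c = toy_alloc!(buf, 256)       # bigger than slab_size: lands in custom_slabs
toy_reset!(buf, cp)            # frees the second slab and the custom allocation
Libc.free(buf.slabs[1])        # release the initial slab when done with the buffer
```

In this model the checkpoint/reset pair plays the role of entering and leaving a @no_escape block.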
For this PR, I've opted to not rock the boat too much regarding the existing APIs and infrastructure, but I do think that this should be the default way dynamic memory is dealt with in StaticTools because it's very efficient, more user friendly than direct malloc/free, and quite flexible.
In the future then (or in this PR if we want) we'll need to figure out a good path for dealing with the zoo of different types here, migrating the test suite, and modifying the documentation to nudge people more towards MallocSlabBuffer.
Regarding types, one potential design question is that Bumper defaults to returning PtrArrays from StrideArraysCore.jl, whereas in StaticTools.jl stuff is built around MallocArray, which is structurally the same, but has a different name and some different methods like free. There's a couple ways forward we could take:
Leave it. It's kinda messy with similar but different types floating around everywhere, but maybe fine.
Make @alloc produce a MallocArray instead of a PtrArray when used on a MallocSlabBuffer. This could work fine, but I'm not a fan of it being named MallocArray if it's being handled by the MallocSlabBuffer instead of malloc directly.
Replace MallocArray(...) with mallocarray(...)::PtrArray since this change would bring StrideArraysCore.jl into StaticTools.jl anyways, and PtrArray has some nice advantages like really good support from LoopVectorization.jl etc. It also has some potential annoyances though, like a shitload of type parameters:
julia> buf = MallocSlabBuffer();
julia> @no_escape buf begin
           typeof(@alloc(Int, 10))
end
PtrArray{Int64, 1, (1,), Tuple{Int64}, Tuple{Nothing}, Tuple{StaticInt{1}}} (alias for StrideArraysCore.AbstractPtrArray{Int64, 1, (1,), Tuple{Int64}, Tuple{Nothing}, Tuple{Static.StaticInt{1}}, Int64})
MallocSlabBuffer is basically a re-implementation of https://github.com/MasonProtter/Bumper.jl/blob/515a4dd405de71da6621dd7b72841c6c794f2c2c/src/SlabBuffer.jl except it uses no mutable types and no Vectors, implementing everything via pointers directly, including the resizing of the slab queues.
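For a feel of what "no mutable types, no Vectors" means for the slab queues, here's a rough sketch of a growable pointer queue built on malloc/realloc. It's an illustration under my own assumptions, not the PR's code: ToyPtrQueue, toy_push and the field names are hypothetical, and doubling the capacity is just one possible growth policy.

```julia
# Hypothetical immutable handle to a malloc'd array of slab pointers.
struct ToyPtrQueue
    data::Ptr{Ptr{Cvoid}}   # malloc'd storage holding the slab pointers
    len::Int                # number of entries in use
    max_len::Int            # capacity of `data`, in entries
end

ToyPtrQueue(max_len::Int = 4) = ToyPtrQueue(
    convert(Ptr{Ptr{Cvoid}}, Libc.malloc(max_len * sizeof(Ptr{Cvoid}))), 0, max_len)

# Append `p`, reallocating the storage when it is full.
# The struct is immutable, so the updated queue is returned as a new value.
function toy_push(q::ToyPtrQueue, p::Ptr{Cvoid})
    data, max_len = q.data, q.max_len
    if q.len == max_len
        max_len *= 2   # assumed growth policy for this sketch
        data = convert(Ptr{Ptr{Cvoid}},
                       Libc.realloc(data, max_len * sizeof(Ptr{Cvoid})))
    end
    unsafe_store!(data, p, q.len + 1)   # 1-based index into the pointer array
    return ToyPtrQueue(data, q.len + 1, max_len)
end

q = ToyPtrQueue(2)
for _ in 1:5
    global q = toy_push(q, Libc.malloc(16))   # capacity grows 2 -> 4 -> 8
end

# Clean up: free each stored pointer, then the queue storage itself.
for i in 1:q.len
    Libc.free(unsafe_load(q.data, i))
end
Libc.free(q.data)
```

Resetting such a queue after a scope ends only needs the old len: free the entries past it and build a new handle with the smaller len, leaving max_len alone.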