Arena stack, strings and data ownership

Here's an architecture I've been throwing around in my head for a while, and I finally have a real use-case for it... but that use-case causes some other problems.

Arena (Allocator) Stack

There are two API approaches to allocators:

Pass the allocator to the function explicitly. I think it should typically be the first argument i.e. foo(myAllocator, 1, 2, 3), with a self-like argument being the only thing that takes precedence.
Implicitly use a global allocator, i.e. foo(1, 2, 3) (myAllocator is taken from some global variable).

In C, most API's require you to pre-allocate the space required which can be very annoying. For ones that don't require this, they use the global allocator. To my knowledge there isn't really a concept of a system allocator interface -- since there isn't really the concept of interfaces in C. I think this is one of C's biggest shortfalls.

In fngi we have the Arena interface. My intent is that most fngi functions that need to allocate should require you to pass in the Arena to use. Any pointers they return should then be allocated from that Arena (and only that Arena).

However, there are cases where a "global" allocator is extremely useful for purely ergonomic reasons. Such is the case for things that need to be allocated in the course of compilation -- which includes global variables and especially strings.

The case of strings

In fngi you can declare a global or local Str, which is really a SlcU1, like so:

struct Str [ dat:&U1, len:U2 ]
var s: Str = |hello world|

You also might want to log something, either at imm or non-imm time:

log.info(|something at non-imm time|)
$log.info(|something at imm time|)

In the general case:

at non-immediate time, |...| stores the string contents somewhere and then compiles instructions to push Slc(&somewhere, len) literals to the stack
at immediate time it stores the same values but immediately pushes Slc(&somewhere, len) to the stack.

However, we have a few challenging requirements here. First of all, we cannot just put strings in the global code buffer. Obviously it wouldn't make sense for immediates since that won't exist at run time. However, it doesn't even work for local variables, as a function would try to start executing the string unless we inserted a jump or similar (very annoying).

Also, the global case is interesting because it gets into non-declared owned memory. Globals so far have declared their memory in the type, but this global depends on memory (of dynamic size) allocated somewhere else. This could cause problems when we try to compile this to a native binary or do dead-code analysis. I will get into this in Ownership Tracking below.

We need another solution

Arena stack

I propose the use of a global thread-local "Arena stack". An Arena stack, as the name implies, is a stack (really SLL) of arenas.

Common operations include:

pushing a new arena before calling a function or executing some algorithm
allocating memory on the top arena
popping the top arena and moving memory/values to the new top, then dropping the former top.

The arena stack allows for implicit global allocation while still preserving the value of arenas -- you can drop a whole arena when you are done with it (and all values are moved out), thus preventing memory fragmentation.

Basically there would be a base "root Arena" where the data of globals and locals are stored. This applies to more than just strings -- any allocations that need to happen for "static" values at compile time would happen in the root arena.

For syn functions (like log.info that know they are being run at immediate time and only need temporary storage, they would push an arena onto the stack before compiling the next token, then drop that arena.

Ownership Tracking

Now we get into an advanced topic. This isn't necessary for the MVP but is necessary to actually generate native code or do dead-code analysis... something we really want, so the architecture should support it.

To recap: a global variable can depend on dynamically sized memory. To do linking we must be able to move this memory around -- we must know not only who owns what data, but where (aka what field/offset) owns that data.

This is a definite challenge. Until now I have not even considered tracking something like this. I now see that it is essential though.

Arena Tracking

For this to work, we need several things:

The arena must accept the result: &&Any pointer, so that it can answer the question of what "field offset" will own allocated data.
The arena must have the option of keeping a SLL of this information. It would look like struct Ownership [ next: &Ownership, owner: Ref, data: SlcU1 ].

So our arena would look like:

role Arena [
  \ allocate memory, setting it's pointer to out.
  meth alloc(self, out:&&Any, sz:S, alignment:U2) do;

  \ set or clear ownership tracking. Some arenas may NOT support this.
  \ returns the former owner (typically an error if non-null).
  meth track(self, o: &&Ownership) -> &Ownership do;
]

With the owner set, all allocations and their destination can be tracked -- assuming that when they allocate out is appropriately set to the global being built. To ensure this invariant, types that are allowed to be constructed as globals must have a bit set (VAR_GLOBAL_SAFE) to assert that yes, their constructors correctly follow the allocator conventions so that ownership can be determined by the linker.

The Ownership information will be attached to the TyVar for the linker to use.

Type Tracking

Perhaps simpler is for the type to be inspected to find the pointers it owns -- and then recurse. This requires no major changes to the allocator API -- although as we shall see, it does affect the Allocator implementation.

The primary problem here is dynamically sized types. The primary solution is to only allow static memory in arenas that support tracking the size of allocations. This is because we need to move data around so therefore we need to know the size of all pointers, even embedded pointers.

The simplest way to do this is to have Arenas have a track(self, sz: bool) method to enable/disable tracking the size of the pointer. Size would then be tracked in the previous 4 bytes of the pointer as non-aligned BE data. This has many benefits, a big one being that we can force-enable this in the var syn function and it should make most allocations "just work" without any changes by the user.

The other option is to always track size which I have to admit is rather tempting. We could use an aligned size and remove alignment from the API as well. I think investigating this would be very worthwhile. This whole thing would make shortening an allocation more difficult -- we would almost need an API just for that purpose :cry:

Pointers to internal data

Something neither of the above consider is global variables that point to data in other globals. To manage this kind of complexity the linker/deadcode will have to create a giant map of all the static memory pointers in existence and walk them to determine which ones are used. It's very similar to traditional dead-code but we have to walk types instead of functions.

This gets even more complicated for arrays! You might point to an index in an array of data -- but then the value at that index also needs to be walked to determine ownership.

Conclusion

I think it's pretty clear -- a generalized solution for ownership tracking is simply not simple enough and it is becoming more clear to me why C does not allow it.

So instead of a general solution, let's make a specialized solution: types (like Str) can implement the following API

\\ offset represents the (pointer/array) field, len represents the number of elements of that pointer/array
\\ meta is used for more exotic possibilities: nested array pointers and such.
\\ This will likely not be implemented in v1.0
struct Ownership  [ next: &Ownership  meta: U2  offset: U2  len: S ]
meth declareOwnership(&self) -> &Ownership

This can then be followed recursively by the dead-code-analyzer/native-compiler for all sub-types.

civboot / fngi