Non-Escaping Allocations

Some allocations in Roc functions never escape to the host. For example, let's say I run this code in an update (which would use reference counting by default, not bump allocation):

decoded =
    Decode.oneOf [ foo, bar, baz ]
        |> Decode.decodeStr str

Here, we'd be allocating this list on the heap (because it's a list) and then freeing it again almost immediately. This means there's a considerable performance cost in Roc to designing an API which takes a list like this.

This is a case where a tracing GC like V8's could outperform Roc. V8 would bump-allocate this into the nursery, and then later on the collection pass would realize it's dead and not promote it out of the bump-allocated nursery.

We could do even better than V8 here if we could make the list be bump allocated. That way the allocation cost would be the same (just a pointer bump), but we wouldn't have to pay for tracing over it later. In fact, at this point Roc would be relatively close to what people do when optimizing C code: reference count only when sharing is necessary, and arena allocate everything else. (We could still reference count these just for the sake of knowing when it's safe to mutate them in-place, but we wouldn't need to bother with free once the last reference is gone.)

We could achieve this by statically determining that this list doesn't "escape" to the host, and is therefore eligible to be allocated into a "temporary arena" that the host provides for the duration of the Roc function call. That way, all "temporary allocations" like this list could be bump allocated, and then the host could reset the bump arena at the end of the call to the Roc function.

Note: We wouldn't necessarily need to do this analysis on all builds, just during optimized builds to improve performance - so it's okay if it takes longer. But maybe it's easy enough to do it always!

Whether or not a value escapes to the host can vary by call path. For example, suppose I have a function like this:

getColors = \customColor -> [ Red, Yellow, Green, customColor ]

In one code path, this function might get called in a way where the returned list never escapes to the host. In that situation, we should give it a Temporary Allocation!
In another path within the same code base, the same function might get called again - this time in a way where the returned list does escape to the host. In this situation, we must not give it a Temporary Allocation!

Inlining might save the day in a small function like this, but inlining won't always be in effect.

So how can we get both? One way to think of this is that every function's return value is polymorphic in whether it gets returned to the host. (Same goes for each field of a record, for functions that return records.) If we think of it that way, this whole analysis can be done using the same strategies we already use for type inference and monomorphization.

Regardless, this very much seems worth doing - for the API design flexibility alone. With it, there's no substantial performance reason to avoid an API design like oneOf, whereas otherwise that's a very real consideration!

Here's an idea for how this could work:

Introduce a new field in Type: a bool for escapes. By default, it's false.
We introduce a new constraint: Constraint::Escapes(Variable) which sets escapes: true and that's it.
When doing unification, we use || on the two escapes values, such that escapes: true "wins" over escapes: false.
When type-checking a platform, we set this constraint on the return types of all functions that are provided to the host, as well as the return types of all effect functions.
When monomorphizing, we incorporate the concept of whether a type escapes into its Layout, assuming the type allocates. (We don't record it one way or another for layouts of types that are Copy, since they don't get allocated regardless.) This will be used in code gen later, and it also means that two types are considered to have different layouts if they are exactly the same except that they differ in their escapes value. Thus we end up generating two different functions when necessary.
When doing code gen, we choose whether to call the Temporary Allocation function or the regular one based on whether the layout says it escapes.

I think this would work, although it might be overly conservative (that is, miss out on some opportunities to perform Temporary Allocations) specifically around unifying a value that escapes with a function argument. Unless we handle that situation differently, the unification might cause the function argument itself to be considered escapes, which doesn't really make sense.

There might be scenarios or downsides I'm not thinking of here. Also, there might be a better algorithm for doing it!

roc-lang / roc

Non-Escaping Allocations #1244