liballocs could usefully track reachability

stephenrkell commented 3 years ago

Reachability among bigallocs would be very useful to track, e.g. for implementing garbage collectors while tracking what is escaping from their view. It should not be expensive if done carefully.

It may be necessary to track an overapproximation ('may reach') and underapproximation ('must reach') separately from each other, rather than tracking exact truth.

It may be useful to track "immediate reachability", i.e. points-to, and transitive reachability separately.

One relevant parameter is whether a given text segment (or other text-containing bigalloc) is write-barriered by its liballocs-aware allocator, hence sending liballocs pointer-write callbacks when new 'interesting' reachability edges of various kinds. If it isn't, then instead, new overapproximate reachability edges must be induced. E.g. for any bigalloc reachable by that text segment, any writably reachable bigalloc may also now reach that bigalloc (i.e. the code may write a pointer into it). See the Bertholon & Kell VMIL '19 paper for more of this flavour.

For static inter-DSO reachability, ld.so auditing may be useful (try man rtld-audit).

stephenrkell commented 3 weeks ago

What are the general rules at work here? First attempt: bigalloc A may point to bigalloc B when...

a relocation in bigalloc A points to a symbol in/under bigalloc B (i.e. symbols themselves are rarely bigallocs)
bigalloc A is a thread stack and the thread does a call to a text bigalloc that can reach (transitively) bigalloc B
bigalloc A is a thread stack and the thread does a read from a data or text bigalloc that points to (immediately) bigalloc B
any thread that can reach B (transitively) does a write to bigalloc A

I think these are not quite right. These rules are very imprecise, and assume that data passes through thread bigallocs in a way that accumulates within the thread. Each thread bigalloc is permanently tainted with knowledge of, i.e. point to, any bigalloc whose address plausibly passes through it. That feels very imprecise, because in reality a thread will often not retain any knowledge of the addresses that pass through it as on-stack temporaries. And it is ignoring the many abstraction barriers that exist on any stack -- callees are conceptually private from their callers. So maybe our primitive should be frames, not stacks? The problem is that frames are rarely bigallocs. Return to this below.

(It is also assuming no single instruction copies from memory to memory without using a stack-or-register intermediary. Is this accurate? Some ISAs may invalidate this.)

Bigallocs that are not mutable cannot gain additional direct points-to edges. They may acquire some points-to edges from relocations, conceptually at their time of creation. However, they may gain in transitive points-to, i.e. reach, owing to gains anywhere in their transitive closure of points-to.

How does bigalloc containment factor in? Does it follow that if bigalloc B points to bigalloc C, then bigalloc A that is an ancestor of B also points to bigalloc C? I want to say no, but I think it depends on the semantics of the parent relationships. Some children are opaque to their parents and some are transparent. E.g. a symbol is transparent to a segment, but a heap chunk is not transparent to its arena.

Can we avoid the special treatment of relocations and instead model relocation purely as the action of a dynamic linking stack frame? When a segment is created, it is pointed to by the dynamic linking frame that mmaps it. It should not be the case that all segments are reachable from all other segments just because the dynamic linker itself can reach them all. They are stored in link map structures, probably under a single heap arena, but each link map allocation is opaque to the arena itself. It is only because the dynamic linker reaches them all individually that it can reach all the segments.

Perhaps we need special rules for frames. Frames are initially empty, i.e. point to only their calling frame (stack pointer) and active text segment (initial program counter) and return text segment (return address). However, they may receive pointers from callers or callees, or from writes if their own storage is address-taken. What's good about focusing on frames is that there is a fixed association with the text (code) running in a frame. Depending on what the code does, the frame may have a wider or narrower envelope of behaviour in terms of what it does to the reachability relation. We could imagine computing an "information-flow summary" per function. What would this include?

There is an analogy with interprocedural sizeofness analysis, where our idea is to do a flow-insensitive(?) information flow analysis for every function. Let's revisit the reachability question once we've done that. It's likely that a similar approximation can tell us how each frame propagates reach. Then our task is to figure out how to coarsen this to the bigalloc level without losing too much precision or causing significant slowdown.

Remember we are basically doing an online points-to analysis. It is not a traditional points-to analysis because it is not answering questions about pointer expressions in code, i.e. equivalence classes of real run-time pointers, but rather, about real concrete run-time pointer-containing state chunks (bigallocs) themselves. It is still an abstraction though, just in a different way: in that it is trying to keep a running approximation (summary) so as to avoid scanning the whole of memory, not to (say) answer all-paths/all-inputs questions about a particular program.

stephenrkell commented 3 weeks ago

Not sure how useful this is but we can imagine a matrix:

                        |       definitely        |       overapproximatingly
----------------------------------------------------------------------------------
directly points to      |    object scan          |      points-to summary
reaches                 |    trace through heap   |     reachability summary

Why are roots roots? Because they are definitely immediately pointed to by one or more mutator thread contexts. (More or less! We overlook the distinction between text segment and data segment... an explicit pointer to the data segment might not be present in a thread context.) So they are not quite "roots" so much as branches low in the tree.

stephenrkell commented 3 weeks ago

I think the strongest motivation for this thought experiment is to identify the "real" roots of a particular heap, in a process with many stacks and many data and text segments.

stephenrkell / liballocs

liballocs could usefully track reachability #51