stephenrkell / liballocs

Meta-level run-time services for Unix processes... a.k.a. dragging Unix into the 1980s
http://humprog.org/~stephen/research/liballocs

New service: stable allocation identities #86

Open stephenrkell opened 6 months ago

stephenrkell commented 6 months ago

A useful primitive for many tool and diagnostic applications would be "stable allocation identities": instead of a trace or dump full of inscrutable 0xdeadbeef-esque numbers that change from run to run, a stable identity could say something like site-myfunc-2-thread-3-1+0x22. This identifier would be the same in every execution for the allocation created at the same point in the dynamic instruction trace.
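
To make the shape of such an identifier concrete, here is a minimal sketch of formatting one. It is not liballocs code: `struct stable_id`, `format_stable_id` and `same_allocation` are hypothetical names, and the reading of the example's components (allocation site within a function, a deterministic thread number, a per-thread sequence number at that site, then a byte offset) is an assumption about what the fields might mean.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record of where and when an allocation was made.
 * None of these names come from liballocs. */
struct stable_id {
    const char *site_func;   /* function containing the allocation site */
    unsigned site_index;     /* which allocation site within that function */
    unsigned thread_ordinal; /* deterministic thread number (assumed) */
    unsigned long seq;       /* nth allocation at this site by this thread */
};

/* Write "site-<func>-<index>-thread-<ordinal>-<seq>+0x<offset>" into buf.
 * Returns what snprintf returns: the length needed, excluding the NUL. */
static int format_stable_id(char *buf, size_t len,
                            const struct stable_id *id, size_t offset)
{
    assert(buf != NULL && id != NULL);
    return snprintf(buf, len, "site-%s-%u-thread-%u-%lu+0x%zx",
                    id->site_func, id->site_index,
                    id->thread_ordinal, id->seq, offset);
}

/* Two identifiers name the same allocation if they agree up to the '+'
 * that introduces the offset. */
static int same_allocation(const char *a, const char *b)
{
    size_t la = strcspn(a, "+"), lb = strcspn(b, "+");
    return la == lb && strncmp(a, b, la) == 0;
}
```

The hard part, which this sketch does not address, is obtaining a thread number and sequence number that really are deterministic across runs.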

I'm anticipating that

A good test would be an ltrace-like tracer, implemented in-process, but where we print addresses as stable identifiers. It should give the same output for any run, even for multithreaded programs if we discard ordering (e.g. sort the output lines).
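
The "discard ordering" check could be sketched roughly as follows: sort the lines of two traces and compare them pairwise. The helper name `traces_equivalent` is hypothetical, and any trace line format fed to it (for example `malloc(16) = site-main-1-thread-1-1`) is invented for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator over an array of C strings. */
static int cmp_lines(const void *a, const void *b)
{
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/* Returns 1 if the two traces contain the same lines, ignoring order.
 * Sorts both arrays in place. */
static int traces_equivalent(const char **run1, size_t n1,
                             const char **run2, size_t n2)
{
    if (n1 != n2)
        return 0;
    qsort(run1, n1, sizeof *run1, cmp_lines);
    qsort(run2, n2, sizeof *run2, cmp_lines);
    for (size_t i = 0; i < n1; i++)
        if (strcmp(run1[i], run2[i]) != 0)
            return 0;
    return 1;
}
```

With raw addresses in the output, two runs would almost never pass this comparison; with stable identifiers they should, which is what makes it a usable regression test.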

Note that the liballocs API already allows for allocations to have names. However, these names are not the same thing -- they are (1) optional (only for subobjects, symbols, etc.) and (2) non-unique (directly using the symbol name, subobject name, etc.). By contrast, a stable allocation identity should be globally unique across a whole execution, and every allocation should have one (if asked for one).

stephenrkell commented 6 months ago

This is related to the need faced by my editable-assembly-generating example client (see examples/ or #88) to generate symbolic names for any referenced position. That is interesting because what's being named there is a position in the allocation containment tree, not just an address. If we have an address and a pointed-to type, we can identify a particular tree node, not just an address-equal slice through the tree. (This is the insight behind my as-yet-unpublished bounds-checking work.)

So maybe the service could name a tree node, not just an address? Some defaulting logic could break ties if the client doesn't know which node it wants.
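
One possible shape for that tie-breaking, as a sketch only: given the tree nodes that all begin at the queried address, prefer the one whose type matches what the client asked for, and otherwise fall back to a default. `struct tree_node` and `pick_node` are hypothetical and are not part of the liballocs API; choosing the most deeply nested node as the default is an arbitrary assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical view of one containment-tree node starting at the
 * queried address. */
struct tree_node {
    const char *type_name; /* type of the object at this node */
    unsigned depth;        /* 0 = outermost allocation, larger = deeper */
};

/* Among nodes sharing one address, return the node whose type matches
 * wanted_type. If wanted_type is NULL or nothing matches, return the
 * deepest node (an arbitrary default). Returns NULL if n is 0. */
static const struct tree_node *
pick_node(const struct tree_node *nodes, size_t n, const char *wanted_type)
{
    const struct tree_node *deepest = NULL;
    for (size_t i = 0; i < n; i++) {
        if (wanted_type && strcmp(nodes[i].type_name, wanted_type) == 0)
            return &nodes[i];
        if (!deepest || nodes[i].depth > deepest->depth)
            deepest = &nodes[i];
    }
    return deepest;
}
```

For a struct whose first field is itself a struct beginning with an `int`, all three nodes share an address; the pointed-to type is what selects between them.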

Incidentally, a better API for navigating these tree slices would replace find_matching_subobject and first_subobject_spanning and walk_subobjects_spanning (others?).