Open stephenrkell opened 5 years ago
Perhaps another way to think of this: declare the code that realises an allocator, rather than the entry points. Then, when we see that code on the stack, we know that that allocator is doing something. There is still some ambiguity about what it is doing, e.g. obtaining memory for private bookkeeping versus obtaining memory for parcelling out to clients. We can of course observe what memory is parcelled out, but we need some kind of wrapper or other instrumentation.
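As a rough illustration of the "declare the code, not the entry points" idea, here is a minimal sketch in C. It assumes each allocator's code can be described by address ranges and that a stack walker has already produced a list of return addresses, innermost first. The names `allocator_code_range` and `innermost_active_allocator` are hypothetical and not part of liballocs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one contiguous range of text that realises an allocator. */
struct allocator_code_range {
    uintptr_t begin;             /* first address in the range */
    uintptr_t end;               /* one past the last address */
    const char *allocator_name;  /* which allocator this code belongs to */
};

/* Scan return addresses (innermost frame first) and return the name of the
 * first allocator whose declared code appears on the stack, or NULL if no
 * declared allocator code is active. */
const char *innermost_active_allocator(
    const uintptr_t *return_addrs, size_t n_frames,
    const struct allocator_code_range *ranges, size_t n_ranges)
{
    assert(return_addrs != NULL || n_frames == 0);
    assert(ranges != NULL || n_ranges == 0);
    for (size_t i = 0; i < n_frames; ++i) {
        for (size_t j = 0; j < n_ranges; ++j) {
            if (return_addrs[i] >= ranges[j].begin
                    && return_addrs[i] < ranges[j].end) {
                return ranges[j].allocator_name;
            }
        }
    }
    return NULL;
}
```

This only says that an allocator is doing something. It cannot tell private bookkeeping apart from memory destined for clients, which is the ambiguity noted above.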
Also, on the subject of observing the first cast (or the first write), perhaps the default liballocs instrumentation should be instrumenting these: witness Guillaume's approach to the CPython module, which adds a pointer write barrier. Of course, like our other instrumentations, they should be done at the binary level.
Source-level metadata, for writes and cast sites, could be added cheaply in among the other static metadata. We can think of a "cast site" as an assertion point... raising the prospect of a "binary libcrunch".
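To make the "cast site as assertion point" reading concrete, here is a minimal sketch in C. It assumes a per-allocation record whose type starts out undecided and a static record per cast site. All names (`alloc_info`, `cast_site`, `check_cast_site`) are hypothetical. Comparing type names for equality is a deliberate simplification of whatever check a real libcrunch-style tool would perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-allocation metadata; a NULL type means "not yet decided". */
struct alloc_info {
    const char *type_name;
};

/* Hypothetical static metadata describing one cast site in the binary. */
struct cast_site {
    const void *address;      /* where the cast occurs */
    const char *target_type;  /* the type being cast to */
};

/* Treat the cast as an assertion about the allocation's type.
 * The first cast seen decides the type; later casts are checked against it. */
bool check_cast_site(struct alloc_info *alloc, const struct cast_site *site)
{
    assert(alloc != NULL && site != NULL && site->target_type != NULL);
    if (alloc->type_name == NULL) {
        alloc->type_name = site->target_type;  /* first cast decides */
        return true;
    }
    return strcmp(alloc->type_name, site->target_type) == 0;
}
```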
With the ongoing overhaul of static metadata, there's an opportunity to design towards a more dynamic heuristic for identifying allocator functions. Done carefully, this should not impact performance in any measurable way, and should increase usability.
For example, we already classify all indirect call sites as "possibly calling allocators", if their signatures satisfy a certain property. What if we included any call site that receives one or more arguments having a sizeofness? Our allocator call site table will become bigger, but perhaps not unmanageably so.
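The proposed rule could be sketched as a predicate over per-call-site metadata. This is only an illustration in C: the record layout, `MAX_ARGS`, and the function name are assumptions, and the existing signature-based test for indirect calls is not modelled because its exact property is not stated here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ARGS 8  /* hypothetical cap, for illustration only */

/* Hypothetical static metadata for one call site. */
struct call_site_info {
    const void *address;  /* address of the call instruction */
    bool is_indirect;     /* call through a pointer? */
    unsigned n_args;      /* number of arguments passed */
    /* For each argument: the type whose sizeof it derives from, or NULL
     * if the argument has no sizeofness. */
    const char *arg_sizeof_type[MAX_ARGS];
};

/* Proposed rule: a call site goes into the allocator call site table if at
 * least one of its arguments has a sizeofness. */
bool has_sizeofness_arg(const struct call_site_info *site)
{
    assert(site != NULL && site->n_args <= MAX_ARGS);
    for (unsigned i = 0; i < site->n_args; ++i) {
        if (site->arg_sizeof_type[i] != NULL) return true;
    }
    return false;
}
```

Counting how many call sites in a real binary satisfy this predicate would be one way to gauge how much the table grows.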
One use of the current instrumentation is to set `__current_allocsite` to some value. We already want to generalise this so that it is merely `__outermost_allocsite`, and so auxiliary allocations made during a prima facie allocation are still classified/typed separately. In this form, the only value `__outermost_allocsite` brings is telling us we can stop walking up the stack. How often do we make use of this?
This is connected with the desire in #11 to eliminate source-level instrumentation. An entirely online approach would be better even than binary instrumentation. Can a bootstrapping approach work? Whenever `mmap()` is called, any of its callers is considered a possible allocation function. However, this won't catch all allocation events, because most of the time they won't call down to mmap. This is the main difference between suballocators and wrappers.
I think a fully automatic online treatment of malloc wrappers is feasible, provided that malloc itself is dynamically interposable. Other allocators probably can't be handled automatically with much generality... ultimately, an implementation of the `struct allocator` functions needs to come from somewhere. We could perhaps still do more to make a sensible guess, e.g. by considering the subsequence of the call chain between the mmap entry and the nearest enclosing call having a classified call site. Probably one idea is to issue a warning on the console when this happens, and see whether the output looks sensible.
With libcrunch in the mix, we have yet another mechanism for avoiding up-front classification, which is letting the first cast decide the type. We can't directly use this within liballocs... it's for the instrumentation to call `set_type` or whatever, if it wants to. But this might make some of our run-time efforts redundant, so perhaps they should be disable-able somehow.
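The guess based on the call chain under mmap, mentioned above, might look roughly like the following sketch in C. It assumes a stack walker has already produced the frames above the mmap entry, nearest caller first, each flagged by whether its call site is already classified. The names `frame`, `candidate_frames`, and `warn_unclassified_chain` are hypothetical, and excluding the classified frame itself from the candidates is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical view of one frame above the mmap entry. */
struct frame {
    const void *call_site;    /* call site in this frame's function */
    bool site_is_classified;  /* present in the allocator call site table? */
};

/* frames[0] is mmap's immediate caller. Return how many leading frames lie
 * between the mmap entry and the nearest enclosing classified call site,
 * excluding the classified frame. If none is classified, every frame is
 * a candidate. */
size_t candidate_frames(const struct frame *frames, size_t n_frames)
{
    assert(frames != NULL || n_frames == 0);
    size_t i = 0;
    while (i < n_frames && !frames[i].site_is_classified) ++i;
    return i;
}

/* Print the candidates to stderr so a human can judge whether the guess
 * looks sensible. */
void warn_unclassified_chain(const struct frame *frames, size_t n_frames)
{
    size_t n = candidate_frames(frames, n_frames);
    for (size_t i = 0; i < n; ++i) {
        fprintf(stderr, "warning: possible allocation function at call site %p\n",
                frames[i].call_site);
    }
}
```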