stephenrkell / liballocs

Meta-level run-time services for Unix processes... a.k.a. dragging Unix into the 1980s
http://humprog.org/~stephen/research/liballocs

Both allocators and wrappers should have run-time identity #21

Open stephenrkell opened 5 years ago

stephenrkell commented 5 years ago

Currently there is some confusion about what an "allocator" is, owing to the presence of struct allocator instances that actually cover more than one allocator, such as __generic_malloc_allocator. All these struct allocator objects are statically defined; we never generate them. If we could generate these and related structures, we could maintain a richer run-time model of allocators, allocator wrappers and allocation call sites. This would be useful for automating some meta-level policy, such as deleting an allocation when it's no longer needed. Currently the per-allocator free call is too stupid to allow this: when there are many alternative free functions, it can't identify the right one to call (in cases where freeing and finalisation are baked into the same operation, for example).
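To make the "which free function?" problem concrete, here is a minimal sketch of what per-allocator identity could look like. It is not liballocs's actual struct allocator: every name in it (alloc_identity, register_allocator, alloc_via, delete_allocation, the demo_pool_* functions) is hypothetical, and the fixed-size tables are only there to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical run-time record: one per distinct allocator. */
struct alloc_identity {
    const char *name;
    void *(*alloc_fn)(size_t);
    void (*free_fn)(void *);
};

/* Hypothetical bookkeeping: which allocator issued each live chunk. */
struct live_chunk {
    void *addr;
    const struct alloc_identity *owner;
};

#define MAX_ALLOCATORS 16
#define MAX_LIVE 64

static struct alloc_identity registry[MAX_ALLOCATORS];
static size_t n_registered;
static struct live_chunk live[MAX_LIVE];

/* Demo allocator whose free also counts calls, standing in for an
 * allocator where freeing and finalisation are one operation. */
size_t demo_pool_frees;
void *demo_pool_alloc(size_t sz) { return malloc(sz); }
void demo_pool_free(void *p) { ++demo_pool_frees; free(p); }

/* Create an allocator record on demand instead of defining it statically. */
const struct alloc_identity *register_allocator(const char *name,
        void *(*alloc_fn)(size_t), void (*free_fn)(void *))
{
    assert(n_registered < MAX_ALLOCATORS);
    struct alloc_identity *a = &registry[n_registered++];
    a->name = name;
    a->alloc_fn = alloc_fn;
    a->free_fn = free_fn;
    return a;
}

/* Allocate through a given allocator and remember who issued the chunk. */
void *alloc_via(const struct alloc_identity *a, size_t sz)
{
    void *p = a->alloc_fn(sz);
    if (!p) return NULL;
    for (size_t i = 0; i < MAX_LIVE; ++i) {
        if (!live[i].addr) {
            live[i].addr = p;
            live[i].owner = a;
            return p;
        }
    }
    a->free_fn(p); /* table full: hand the chunk straight back */
    return NULL;
}

/* Look up the issuing allocator, or NULL if the chunk is unknown. */
const struct alloc_identity *owner_of(const void *p)
{
    if (!p) return NULL;
    for (size_t i = 0; i < MAX_LIVE; ++i)
        if (live[i].addr == p) return live[i].owner;
    return NULL;
}

/* Meta-level delete: dispatch to the issuing allocator's own free
 * function. Returns 1 if the chunk was known and freed, 0 otherwise. */
int delete_allocation(void *p)
{
    if (!p) return 0;
    for (size_t i = 0; i < MAX_LIVE; ++i) {
        if (live[i].addr == p) {
            const struct alloc_identity *a = live[i].owner;
            live[i].addr = NULL;
            live[i].owner = NULL;
            a->free_fn(p);
            return 1;
        }
    }
    return 0;
}
```

With something like this, a policy layer only needs the chunk's address: delete_allocation finds the owner and calls that allocator's free function rather than guessing among alternatives.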

Currently, link-time code generation (using the macros in tools/stubgen.h) is used to wrap each allocator wrapper (yes, two levels of wrapping) and also to wrap any linked-in definitions of malloc (all in allocscompilerwrapper.py). All this is a big mess that needs rationalising.

stephenrkell commented 5 years ago

It may be possible to eliminate the wrapping of allocator wrappers, as discussed in #20.

stephenrkell commented 5 years ago

Pithy summary of how malloc should be callee-instrumented: a global, preload-interposable malloc should not be callee-wrapped (because we will preload-interpose it to do the same thing). Others should be. Ditto for other malloc-family functions.

What about a malloc that has both a globally interposable alias and another alias? We should callee-wrap the local alias. This will mean that the two are no longer aliased. That is fine.
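The rule in the last two paragraphs is small enough to write down as code. The following is only a sketch of the decision, with hypothetical type and function names (malloc_symbol, choose_instrumentation, plan_aliases); it says nothing about how the wrapping itself gets done.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* How a given malloc-family symbol gets instrumented. */
enum instrumentation { VIA_PRELOAD, VIA_CALLEE_WRAPPER };

/* Hypothetical description of one symbol name (one alias). */
struct malloc_symbol {
    const char *name;
    bool global;               /* visible in the global namespace */
    bool preload_interposable; /* a preloaded definition would win */
};

/* A global, preload-interposable symbol is left to preload
 * interposition; everything else gets a callee wrapper. */
enum instrumentation choose_instrumentation(const struct malloc_symbol *s)
{
    assert(s != NULL);
    return (s->global && s->preload_interposable)
        ? VIA_PRELOAD : VIA_CALLEE_WRAPPER;
}

/* Apply the rule to every alias of one definition. Aliases are decided
 * independently, so they may stop being aliases of each other.
 * Returns the number of aliases that get a callee wrapper. */
size_t plan_aliases(const struct malloc_symbol *aliases, size_t n,
                    enum instrumentation *out)
{
    size_t wrapped = 0;
    for (size_t i = 0; i < n; ++i) {
        out[i] = choose_instrumentation(&aliases[i]);
        if (out[i] == VIA_CALLEE_WRAPPER) ++wrapped;
    }
    return wrapped;
}
```

For a definition with one global interposable name and one local name, this plan leaves the global name alone and wraps only the local one, which is exactly the point at which the two stop being aliases.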

I think all this should be driven from one big config file or a very specialised/hackable script, rather than from hard-coded logic buried somewhere in our Python compile/link-wrapper scripts as at present. Then it can be given a clean semantics, and the addition of custom allocators or allocator wrappers can be explained uniformly in terms of additions to that file or script.

stephenrkell commented 3 years ago

Another way to think about the 'config file' idea is to put the built-in allocators and LIBALLOCS_* environment variables on the same level.

Roughly the level at which this needs to work is the link map... from a pooled description of what needs callee-wrapping, we should be able to explain the difference between the actual link map (including stubs linked in) and the vanilla no-liballocs link map.

We can perhaps turn the env vars into a config file that is merged with the 'built-in' config, then use the config file to produce linker options. The logic for all this belongs in the gold plugin.
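As a rough illustration of the merge step, here is a sketch that pools a 'built-in' table with extra entries (imagined as already parsed from the environment; the parsing and any variable names are deliberately left out) into one list of symbols needing callee wrappers. The wrap_entry type and pool_callee_wrap_list function are hypothetical, the "extra entries override built-in ones" policy is an assumption, and it stops short of producing actual linker options.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical config entry: a symbol and whether it needs a callee wrapper. */
struct wrap_entry {
    const char *symbol;
    int needs_callee_wrap;
};

static int has_symbol(const struct wrap_entry *es, size_t n, const char *sym)
{
    for (size_t i = 0; i < n; ++i)
        if (strcmp(es[i].symbol, sym) == 0) return 1;
    return 0;
}

/* Append sym to the space-separated list in out; -1 if it does not fit. */
static int append_name(char *out, size_t cap, const char *sym)
{
    size_t used = strlen(out);
    int n = snprintf(out + used, cap - used, "%s%s", used ? " " : "", sym);
    if (n < 0 || (size_t)n >= cap - used) {
        out[used] = '\0'; /* undo the partial write */
        return -1;
    }
    return 0;
}

/* Merge built-in and extra entries (assumed policy: an extra entry
 * overrides a built-in one for the same symbol) and write the pooled
 * list of symbols needing callee wrappers into out.
 * Returns the number of symbols listed, or -1 if out is too small. */
int pool_callee_wrap_list(const struct wrap_entry *builtin, size_t n_builtin,
                          const struct wrap_entry *extra, size_t n_extra,
                          char *out, size_t cap)
{
    int count = 0;
    assert(out != NULL);
    if (cap == 0) return -1;
    out[0] = '\0';
    for (size_t i = 0; i < n_builtin; ++i) {
        if (has_symbol(extra, n_extra, builtin[i].symbol)) continue;
        if (!builtin[i].needs_callee_wrap) continue;
        if (append_name(out, cap, builtin[i].symbol) != 0) return -1;
        ++count;
    }
    for (size_t i = 0; i < n_extra; ++i) {
        if (!extra[i].needs_callee_wrap) continue;
        if (append_name(out, cap, extra[i].symbol) != 0) return -1;
        ++count;
    }
    return count;
}
```

The pooled list is the thing that would have to account for every difference between the liballocs link map and the vanilla one; turning each entry into concrete linker options would be the next step.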

stephenrkell commented 3 years ago

What do we do when final-linking a DSO that contains a 'malloc'? It's preemptible but may or may not be the global malloc, so may or may not need the callee wrappers. Probably it's necessary to create a local alias, then ensure any locally-bound reference (maybe from protected visibility, maybe from -Bsymbolic) actually goes to the alias, which does get the callee wrapper. So the global 'malloc' (and friends) are not used internally by the object, and will only be used externally if it does indeed 'win' the contention for that global symbol.
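Here is a hand-written sketch of the shape this would leave inside the DSO, assuming GCC or Clang on an ELF target (it relies on the alias and visibility attributes). The allocator is called dso_malloc rather than malloc so the example does not clash with libc, and all the names are hypothetical; in practice the link-time tooling would generate the equivalent rather than anyone writing it by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Counts calls that arrive through the wrapped, locally bound entry point. */
size_t local_wrapped_calls;

/* Stand-in for the DSO's own 'malloc': global and preemptible, left
 * without a callee wrapper. If it wins the contention for the global
 * symbol, preload interposition is expected to cover it. */
void *dso_malloc(size_t sz) { return malloc(sz); }

/* Hidden alias for the same body: it cannot be preempted, so it always
 * reaches this DSO's definition. */
extern void *dso_malloc_body(size_t)
    __attribute__((alias("dso_malloc"), visibility("hidden")));

/* The local entry point that does get the callee wrapper. It is a
 * distinct function, no longer an alias of the global name. */
__attribute__((visibility("hidden")))
void *dso_malloc_local(size_t sz)
{
    ++local_wrapped_calls; /* the wrapper's bookkeeping would go here */
    return dso_malloc_body(sz);
}

/* A locally bound reference inside the DSO goes to the local entry
 * point, never to the global name. */
void *dso_internal_make_buffer(size_t sz)
{
    assert(sz > 0);
    return dso_malloc_local(sz);
}
```

Internal callers are counted by the wrapper whether or not the global name wins; external callers only ever reach dso_malloc, and only if it does win.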