Modelling functions, or abstraction levels generally

stephenrkell commented 5 months ago

The demo in the README shows that an object of function type shows up as having size zero, even though the function symbol will have a size equal to the number of instruction bytes.

The size of zero is deliberate, because all functions of the same signature should have the same function uniqtype. If each function's uniqtype had to carry its own size, this wouldn't scale (it causes uniqtype explosion), and it would be wrong anyway because it mixes distinct levels of abstraction (if you're using an allocation as a function, you're not viewing it as a bunch of instruction bytes). However, the size of zero can appear a bit odd... at the least, I should update the README to remark on this.
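To make the sharing argument concrete, here is a minimal sketch in C. It is not liballocs code: `toy_fn_type`, `toy_fn_symbol` and `intern_fn_type` are hypothetical names invented for illustration. The point is only that the type record is keyed by signature and reports size zero, while the byte count lives on the per-function symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a function uniqtype: keyed by signature only. */
struct toy_fn_type {
    const char *signature; /* e.g. "int(int,int)" */
    size_t size;           /* always 0: the function view has no byte size */
};

#define MAX_TYPES 16
static struct toy_fn_type type_table[MAX_TYPES];
static size_t n_types;

/* Return the unique type record for a signature, creating it on first use. */
static const struct toy_fn_type *intern_fn_type(const char *signature)
{
    for (size_t i = 0; i < n_types; ++i)
        if (strcmp(type_table[i].signature, signature) == 0)
            return &type_table[i];
    assert(n_types < MAX_TYPES);
    type_table[n_types].signature = signature;
    type_table[n_types].size = 0;
    return &type_table[n_types++];
}

/* Hypothetical stand-in for a function symbol: the size is per-symbol. */
struct toy_fn_symbol {
    const char *name;
    size_t code_bytes;              /* instruction bytes, differs per function */
    const struct toy_fn_type *type; /* shared by all same-signature functions */
};
```

Two functions with the signature `int(int,int)` would get the same `toy_fn_type` pointer back from `intern_fn_type`, however different their `code_bytes` are. Putting the size into the type record would force one record per function.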

Really there is a level of abstraction below the function type, which views the data as instructions. We even have an allocator for modelling such structures: it's a "packed sequence" (src/allocators/packed-seq.c). The allocator maintains a bitmap of element starts (here, instruction starts) and has a function that can decode each variable-length element, returning type and size information for it.
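As a rough illustration of the packed-sequence idea (not the code in src/allocators/packed-seq.c), the sketch below keeps a bitmap of element starts and uses a decode function to find each element's size. Everything here is assumed for the example: the names are hypothetical, and the toy encoding simply stores each element's total length in its first byte, where a real instance would use an instruction decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder: the first byte of each element holds its total
 * length in bytes. Returns 0 if the element cannot be decoded. */
static size_t toy_decode_len(const unsigned char *p, size_t remaining)
{
    size_t len = p[0];
    return (len == 0 || len > remaining) ? 0 : len;
}

/* Walk the buffer, setting one bit per element start. Returns the number
 * of elements found, stopping early at an undecodable element. */
static size_t mark_starts(const unsigned char *buf, size_t len, uint8_t *bitmap)
{
    size_t count = 0;
    for (size_t off = 0; off < len; ) {
        size_t n = toy_decode_len(buf + off, len - off);
        if (n == 0) break;
        bitmap[off / 8] |= (uint8_t)(1u << (off % 8));
        off += n;
        ++count;
    }
    return count;
}

/* Is there an element start at this offset? */
static int is_start(const uint8_t *bitmap, size_t off)
{
    return (bitmap[off / 8] >> (off % 8)) & 1;
}

/* Find the start of the element containing `off` by scanning backwards. */
static size_t containing_start(const uint8_t *bitmap, size_t off)
{
    while (off > 0 && !is_start(bitmap, off)) --off;
    return off;
}
```

With the bitmap in place, a query for an interior address can be answered by scanning back to the nearest start bit and then calling the decoder there to get the element's size.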

This idea of there being uniqtypes (functions) but also a layer of allocation below that (packed sequences of instructions) doesn't fit into the current design, which assumes uniqtypes are attached only at the leaf level. Somehow we need to relax our design or bolt on some new feature for descending a level.

This might relate to #16 about meta-completeness. E.g. if we want our chunk inserts to themselves be queryable, we seem to have some awkward containment relation (a malloc chunk has a certain type but also contains a bunch of bytes from another, higher meta-level, i.e. that of its containing allocator).

stephenrkell commented 5 months ago

Related: a promoted-to-bigalloc chunk might already have a type. E.g. if I allocate an array of ints, then use that array to suballocate from, we can get this situation. Does the promoted bigalloc retain its type in any real sense? It's only in C that we can't do that with int, and that char is special... we may not want to enshrine such a rule. Maybe we can have multiple levels of uniqtypes, if they respect certain rules? It's like C-style containment in a struct is 'transparent', whereas we might actually also want 'opaque' containment, such as how functions consist of a packed sequence of instructions. It's starting to feel module-systemy in an almost ML-ish way, all of a sudden.

stephenrkell commented 5 months ago

Also relevant: #53, and the idea that uniqtypes are really allocators.

stephenrkell commented 5 months ago

One could argue that every "typed" allocation or range thereof has a lower-level view which is its representation bytes. This might be latent in a memcpy checker in libcrunch, for example. The "packed sequence of instructions" view is an intermediate one, below the "default" function view and above the bottom-level bytes one.
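The three views of one function body could be sketched as a query that takes the level as a parameter. This is purely illustrative: the `view_level` enum, `view_answer` struct and `query` function are hypothetical, not an existing or proposed liballocs interface, and the instruction boundaries are hard-coded toy data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical abstraction levels for one range of memory, highest first. */
enum view_level { VIEW_FUNCTION, VIEW_INSTRUCTIONS, VIEW_BYTES };

struct view_answer {
    enum view_level level; /* level at which the answer was produced */
    size_t start;          /* offset of the containing unit in the function */
    size_t size;           /* size of that unit; 0 for the function view */
};

/* Toy function body: instruction start offsets, ending with total length. */
static const size_t insn_starts[] = { 0, 1, 4, 6, 7 };
#define N_INSNS (sizeof insn_starts / sizeof insn_starts[0] - 1)

/* Answer "what contains offset `off`?" at the requested level. */
static struct view_answer query(size_t off, enum view_level level)
{
    assert(off < insn_starts[N_INSNS]);
    switch (level) {
    case VIEW_FUNCTION:
        /* Default view: one function object, reported with size zero. */
        return (struct view_answer){ VIEW_FUNCTION, 0, 0 };
    case VIEW_INSTRUCTIONS:
        /* Intermediate view: the instruction containing the offset. */
        for (size_t i = 0; i < N_INSNS; ++i)
            if (off < insn_starts[i + 1])
                return (struct view_answer){ VIEW_INSTRUCTIONS, insn_starts[i],
                                             insn_starts[i + 1] - insn_starts[i] };
        break;
    case VIEW_BYTES:
        break;
    }
    /* Bottom view: a single representation byte. */
    return (struct view_answer){ VIEW_BYTES, off, 1 };
}
```

The same offset gets three different, equally valid answers depending on the level asked for, which is the sense in which the instruction view sits between the function view and the bytes view.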

stephenrkell commented 1 month ago

This is rapidly becoming a special case of #53.

stephenrkell / liballocs

Modelling functions, or abstraction levels generally #82