stephenrkell opened this issue 3 years ago
Maybe another way to think of the 'terminal layer' problem is that some allocators are "fixed-format". There may be many layers of fixed-format, but they always reside at the bottom of the tree.
Of course one can imagine exceptions to that, e.g. using a char buffer inside a struct as a malloc arena. I find that pretty wacky, so provisionally have no problem not being able to model it.
Idle thought: is there, or should there be, a link between `mincore()` and our sparseness story on read/write validity? For `mincore()` and sparse areas, maybe we need to capture the ideas that (1) access might be 'valid' but nevertheless expensive, and (2) a cheaper yet indirect way to read the value, e.g. "it's all zeroes because we've never written to it", might be available.
The issue of `make_precise()` came up again in the ELF allocator example, because some data types (NUL-term'd strings, x86 instructions, ...) have uncertain length but are self-delimiting. Can/should we make a uniqtype describing these, and if so, how can it expose the length information?
I guess the crux of the issue is that a flexible array need not be delimited by its containing allocation. We have been assuming (e.g. in libcrunch) that that's how it works. But it could instead be self-delimiting, e.g. something as simple as a null-terminated string. If the containing allocator cannot reasonably know the boundary, then the uniqtype itself has to know. We want to support both of these patterns. That means some kind of function for self-delimit logic seems inescapable.
Can we make self-delimitingness an exclusive property of a 'packed sequence' type that is analogous to uniqtypes' `__ARR_`? Then the function could belong to the sequence type, not to the element type. Is that sane?
I am feeling the function actually needs to tell us two things: the distance to the end of this element, and the distance from there to the beginning of the next (default: zero). That's just in case there are alignment constraints or similar.
If we do this, then our subobject-traversal code will gain a new case (alongside struct, union and array).
Let's think about x86 some more. Ideally, a decode function would not only give us the length of the instruction, but also a precise type of the instruction (prefixes, opcodes, operands, SIB byte, immediates, etc). Is this sane? These are all bitfields. I guess we can expand every distinct format of an x86 instruction, with one struct for each. So we are not really getting away from the idea that each element has its own definite type. Still, putting it on the packed seq would make sense. It preserves the idea that only sequences (incl. arrays) are variable-length (er, and structs that embed one of these as their last field).
Another way to think of this: a packed seq has no element type. It only has a decode function, which enumerates the element types as it goes along.
What happens if a `get_type()` query comes in landing in the middle of a packed seq? The decode operations are potentially expensive. Maybe that's not our fault? Maybe we (configurably) cache the results? In a metavector, say? Indeed we can think of a data segment's type info as a packed seq. Elaborating it at run time is probably not a sane move in the case of data segments, but the conceptual uniformity is appealing: a starts bitmap and metavector are constructed to cache a packed sequence. We can construct them statically (eagerly) or dynamically (lazily) as we wish.
So it seems that packed sequences, together with some notions of read-validity and write-operation, are our replacement for `make_precise()`. Does this handle everything we need?
If we are to have `__uniqtype__PACKED_SEQ_xxx`, how do we name them? Really we want to identify them with their decode functions, which should be global symbols. How do we manage that namespace of global functions? It's almost like an `/etc/services`-style registry of names. But that means some kind of real-world authority would be necessary... not clear we want that.
Another wart is that packed sequences can't be written to without messing stuff up, in the case where an element's size is changed. So now I'm thinking it's a kind of allocator. Are all packed sequences big enough to be bigallocs? This also keeps things uniform w.r.t. the use of metavectors and bitmaps. And our allocator-level knowledge of whether we've issued internal pointers, whether it's possible to resize/move stuff, etc.
Radical thought: maybe we only need allocators, and this 'uniqtype' thing is one abstraction too many?
There is now a packed sequence allocator.
Thinking about x86 again, we could imagine expanding x86 instruction format into a series of struct types. But then there'd be an overarching union type. So really, each element of a packed sequence is (morally) an instance of a discriminated (non-simultaneous) union.
A flip side of the previous comment might be "a[ny] discriminated union is really an allocator [arena]".
We can distinguish self-discriminating (the bytes of the union members themselves tell you which case they are), data-discriminated (some field in an adjoining piece of memory has the answer) and context-discriminated (the answer lies elsewhere; may or may not even be explicitly manifest in program state, e.g. temporal discrimination).
Do we want our "uniqtype line" in the allocation tree to be drawn below any (non-simultaneous) union nodes? i.e. unions are really allocators [I mean arenas] and not uniqtypes? That would be wild w.r.t. how C makes us think, but not obviously incorrect as a way to model what these unions are doing.
I think this is really the same problem as I've already anticipated, in the idea of having an alternative "allocator's view" abstraction layer e.g. where the allocator's chunk metadata has type information visible and the user data is just opaque bytes. That is "up one level", whereas for functions we are able to go "down one level". "Typed" views are a layer cake, within which we have a "default layer" which is the programmer's ordinary viewpoint, but both lower- and higher-layer typed views in the tree are possible.
How do we add an awareness of this "default abstraction line" to the bigallocs table and the query API?
Maybe we can use the same mechanism we use to query subobjects?
I suspect that the synthetic stackframe struct types probably shouldn't be exposed in the default abstraction. That's potentially good because we can come up with a more optimal (maybe "compiled"?) representation for frame layouts... the current representation is not compact.
The short version of the problem is how to capture "allocation nesting below the [uppermost] uniqtype level". Probably it should not be in the bigallocs table. Agreed that subobject nesting is also modelling this. Some examples are
Maybe the answer is that from the first "never big" line down, we have to query the allocator and that's that? When we recursively search down through a struct we are arguably doing that, although without dispatching back to "could be any allocator nested under here" but rather assuming only uniqtypes lie within uniqtypes.
For functions, the fact that a function type has size zero suggests that we need more than one notion of nesting or descending... maybe somehow abstraction-breaking descent versus non-.
Perhaps the answer is that underneath any uniqtype-described range there may be inserted an allocator-defined view. So, underneath a function-typed uniqtype there may be a packed sequence; if we descend to a struct member we may find that an allocator is wrapping that member (handy for unions perhaps!? they should have been in the bulleted list above).
In other words, once we've hit a uniqtype we can still call back out to an allocator at some level in the uniqtype nesting. Where does this need to be recorded? Outside the uniqtype, so probably in the allocator that presents the uniqtype'd view, e.g. the static symbol allocator in a text segment -- it would know (say) that a symbol is a function, but also that there's a packed-sequence view of it lurking underneath the function uniqtype view. E.g. could there be an additional bitmap, mostly zeroes, to record where these lurking interpretations exist? Sounds like we need to expand the allocator API to capture this.
Perhaps: whenever we give out a uniqtype, there is conceptually the option of also giving out the "intra-allocator" for that uniqtype'd range? The default intra-allocator is just the uniqtype allocator itself, but it can be overridden.
Yes. In fact, 'uniqtype' embodies multiple kinds of intra-allocator already: array and composite. The latter has two sub-cases: simultaneous and non-simultaneous. These are all allocators! They are the default intra-allocator for things of the corresponding [kind of] uniqtype.
It sounds like we are elaborating the allocator nesting approach for small allocations, analogous with the bigalloc table for big allocations. Remember that the small/big line is approximated by the pageindex, which can be seen as a "short cut" to that line (modulo page granularity).
Some words: "infra-allocator" (beneath) vs "ultra-allocator" (above), or perhaps "hyper" and "hypo" (but these sound too similar). Also: "alloturtle" (all the way down).
Creating a 'precise' struct that ends with a flexible array member is a ballache. You have to create the array type, then create the struct type. The struct type's `make_precise` function should in theory do this. But if the flexible array member is many layers down, the number of created types and `make_precise` functions starts to add up. Currently we don't generate these functions, only the ones for the array type. (Once no array type has a definite length (#34), structs that contain arrays will still have definite length, except for flexible array members, so that does not change things here.)
Given that we are rethinking arrays, we have a chance to rethink `make_precise` in general. We want it for:

- `not_simultaneous` and `may_be_invalid`. But maybe we want read-validity as a separate concept, orthogonal to uniqtype, i.e. at the allocator level? If so it could cover struct padding, inter-allocation dead space, genuinely read-trapping regions, sparseness that should not be disturbed (?), and so on.
- `mcontext_t`. But for these maybe we need to do the write-tracking thing: trap writes and keep last-written shadow state, standing in for a `make_precise` function.

It seems reasonable that a variant record or union would have such a piece of code (again, logically/hypothetically speaking). So if we get rid of `make_precise`, where should it live? Perhaps as a special 'related' entry... composites already have N+1 relateds, where the extra one is the member names. We can decree that `not_simultaneous` composites have a function that returns read-validity... and write-validity? Or maybe a "do write to member N" memcpy-like function that respects invariants (N must be the index of a `may_be_invalid` member).

Can we handle `_DYNAMIC`- or auxv-style arrays this way? We can view them as composites where a given member may be present, absent or even present >1 time. It would be better to view these as an allocator. This may require us to relax our view of uniqtypes as being only at the 'terminal' layer of allocation, i.e. here we can access the arena as an array of `Elf64_Dyn` entries or as an allocator managing a packed pool of discrete / singleton `Elf64_Dyn` entries.

We seem to want