Open stephenrkell opened 4 years ago
With the latest libsystrap, we can now hook `clone()`, if that's a good way to do this. And we can of course hook `arch_prctl()`. I'm a bit fuzzy on when new TLS blocks get allocated... could be before or after the clone. But either way, this seems doable.
One quirk is that a TLS block can be logically extended by `dlopen()`. The block is not reallocated; rather, dynamically loaded modules' regions are discontiguous, and one must indirect through the DTV (a per-thread vector of pointers, one pointer per DSO) to find those thread-local variables. Also remember that the allocation is lazy: a thread's region for a dynamically loaded module is only allocated when that thread first accesses it.
Note that DTVs themselves can be reallocated, and this may happen during dynamic loading.
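The DTV indirection can be observed from user code without touching linker internals. With glibc, `dl_iterate_phdr()` reports for each loaded object its `dlpi_tls_modid` (the DTV index) and `dlpi_tls_data` (the calling thread's block for that module, or NULL if it has not been allocated yet, which is where the laziness shows up). Below is a minimal sketch, assuming Linux with glibc 2.4 or later; `tls_modid_of`, `struct tls_query` and the two example variables are hypothetical names for illustration, not anything in liballocs.

```c
#define _GNU_SOURCE
#include <link.h>
#include <stddef.h>
#include <stdint.h>

/* Example variables: one thread-local, one ordinary global. */
__thread int example_tls_var = 42;
int example_plain_var = 7;

/* Hypothetical query record: which TLS module owns `addr` in this thread? */
struct tls_query {
    const void *addr;
    size_t modid; /* 0 means "not found" (real module IDs start at 1) */
};

static int tls_query_cb(struct dl_phdr_info *info, size_t size, void *arg)
{
    struct tls_query *q = arg;

    /* Older C libraries may pass a shorter struct without the TLS fields. */
    if (size < offsetof(struct dl_phdr_info, dlpi_tls_data)
               + sizeof info->dlpi_tls_data)
        return 0;
    /* No TLS segment, or no block allocated for the calling thread yet. */
    if (info->dlpi_tls_modid == 0 || info->dlpi_tls_data == NULL)
        return 0;

    for (int i = 0; i < info->dlpi_phnum; ++i) {
        const ElfW(Phdr) *ph = &info->dlpi_phdr[i];
        if (ph->p_type != PT_TLS)
            continue;
        uintptr_t base = (uintptr_t) info->dlpi_tls_data;
        uintptr_t a = (uintptr_t) q->addr;
        if (a >= base && a - base < ph->p_memsz) {
            q->modid = info->dlpi_tls_modid;
            return 1; /* non-zero return stops the iteration */
        }
    }
    return 0;
}

/* Return the TLS module ID whose block (for the calling thread)
 * contains `addr`, or 0 if no such block is found. */
size_t tls_modid_of(const void *addr)
{
    struct tls_query q = { addr, 0 };
    dl_iterate_phdr(tls_query_cb, &q);
    return q.modid;
}
```

Called from the main thread, `tls_modid_of(&example_tls_var)` should yield the executable's module ID, while `tls_modid_of(&example_plain_var)` yields 0. This only sees the calling thread's blocks, so it does not replace per-thread bookkeeping; it is mainly useful for checking assumptions about where blocks live.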
We can think of an mmap'd chunk participating in TLS as representing a particular range of one or more DTV entries, for a particular thread. Each DTV entry corresponds to the thread-local segment of a single binary. So suppose we compute static metadata for these segments just as we do for non-thread-locals, except that it is indexed by offset from the DTV block base rather than by vaddr. Our per-bigalloc metadata for the chunk will then be mostly concerned with recording this sequence of DTV entries (and probably the thread ID / its TCB base address).
We can perhaps use a structure similar to the one we use for allocation sites -- in particular how we group together multiple DSOs' info into a single coherent identity space, despite each having a local indexing scheme.
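To make the per-chunk record above concrete, here is a rough sketch of what it might hold and how an address inside the chunk could be turned into a (DTV index, offset) pair for looking up the per-DSO static metadata. Every name here (`tls_module_span`, `tls_chunk_meta`, `tls_chunk_resolve`) is hypothetical, and the layout is an assumption for illustration rather than the design liballocs uses.

```c
#include <stddef.h>
#include <stdint.h>

/* One DTV entry's share of a chunk (hypothetical layout). */
struct tls_module_span {
    size_t modid;        /* DTV index, i.e. which binary's TLS segment */
    size_t chunk_offset; /* where this thread's copy starts in the chunk */
    size_t size;         /* size of that binary's TLS segment in memory */
};

/* Per-bigalloc metadata for one mmap'd chunk participating in TLS. */
struct tls_chunk_meta {
    uintptr_t chunk_base;                /* start of the mmap'd chunk */
    uintptr_t tcb_base;                  /* owning thread's TCB address */
    long tid;                            /* owning thread's ID */
    size_t nspans;                       /* number of DTV entries covered */
    const struct tls_module_span *spans; /* the sequence of DTV entries */
};

/* Map `addr` to the DTV index it belongs to and its offset from that
 * module's block base. Returns 1 on success, 0 if `addr` falls outside
 * every recorded span (for example in padding or in the TCB itself). */
int tls_chunk_resolve(const struct tls_chunk_meta *m, uintptr_t addr,
                      size_t *modid, size_t *seg_offset)
{
    if (addr < m->chunk_base)
        return 0;
    size_t off = addr - m->chunk_base;
    for (size_t i = 0; i < m->nspans; ++i) {
        const struct tls_module_span *s = &m->spans[i];
        if (off >= s->chunk_offset && off - s->chunk_offset < s->size) {
            *modid = s->modid;
            *seg_offset = off - s->chunk_offset;
            return 1;
        }
    }
    return 0;
}
```

The returned offset is relative to the module's own block, so the same per-DSO table can serve every thread; only the small span list differs from chunk to chunk. A linear scan is used for brevity, on the assumption that a chunk covers only a handful of DTV entries.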
I wonder if trapping `set_thread_area()` is a sane way to do this (on Linux, on x86...). It is less hairy than trapping `clone()`.
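One caveat, as far as I understand glibc's behaviour: `set_thread_area()` is the 32-bit x86 call, the x86-64 counterpart is `arch_prctl(ARCH_SET_FS, ...)`, and both are typically used only for the initial thread, since later threads usually receive their thread pointer via the `CLONE_SETTLS` flag to `clone()`. Whichever call is trapped, the value of interest is the thread pointer, i.e. the TCB base address. The sketch below shows how that value can be read back on x86-64 Linux, as a sanity check on what a hook would record. The function names are hypothetical, and other platforms fall back to returning 0.

```c
#include <stdint.h>

#if defined(__linux__) && defined(__x86_64__)
#include <asm/prctl.h>   /* ARCH_GET_FS */
#include <sys/syscall.h> /* SYS_arch_prctl */
#include <unistd.h>      /* syscall() */

/* Ask the kernel for the calling thread's FS base (the thread pointer). */
uintptr_t current_fs_base(void)
{
    unsigned long base = 0;
    if (syscall(SYS_arch_prctl, ARCH_GET_FS, &base) != 0)
        return 0;
    return (uintptr_t) base;
}

/* Read the word at %fs:0. In the x86-64 TLS ABI the first word of the
 * TCB is a pointer to the TCB itself, so this equals the FS base. */
uintptr_t tcb_self_pointer(void)
{
    uintptr_t self;
    __asm__ volatile ("movq %%fs:0, %0" : "=r"(self));
    return self;
}
#else
/* Placeholder on other platforms: this sketch only covers x86-64 Linux. */
uintptr_t current_fs_base(void) { return 0; }
uintptr_t tcb_self_pointer(void) { return 0; }
#endif
```

On x86-64 the two functions should agree, which gives a cheap way to check that the address recorded by a hook really is the TCB base for the thread in question.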
We are missing support for thread-local storage.
This allocator is implemented inside the dynamic linker and is much like a static allocator, but each thread gets its own segment for each library that defines thread-local symbols. We need to create bigallocs for these areas as threads are created, and index them more or less as we do static segments. (There is no point starting this until the deep-static-allocs branch is merged.)