Open mephi42 opened 4 years ago
It is too costly to put PID:TID into each TLV.
Per-thread trace buffers would require accessing TLS from the IR. Each per-thread buffer would start with an MT_THREAD TLV, which contains PID:TID. But this will mess up the use-def analysis, since loads and stores from different threads will no longer appear in temporal order.
In Valgrind, only one thread runs at a time. But let's assume a proper multi-threaded scenario with the upcoming qemu tracer in mind.
For a single buffer, a lock-free approach may be used: atomically advance the write pointer with lock xadd, flush on overflow, otherwise fill in the data at the reserved offset. The other threads will continue writing concurrently.
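The reservation step can be sketched in C11 as follows. This is only an illustration outside of any IR: the names trace_buf, trace_pos, trace_reserve, trace_write and the capacity TRACE_BUF_SIZE are hypothetical, and atomic_fetch_add is used as a portable stand-in for lock xadd (on x86 compilers typically emit lock xadd for it). The flush itself is left to the caller.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define TRACE_BUF_SIZE 4096u            /* hypothetical buffer capacity */

static unsigned char trace_buf[TRACE_BUF_SIZE];
static atomic_size_t trace_pos;         /* next free offset, shared by all threads */

/* Reserve len bytes in the shared buffer.
 * Returns a pointer to the reserved region, or NULL if the reservation
 * ran past the end of the buffer and the caller has to flush. */
static unsigned char *trace_reserve(size_t len)
{
    assert(len <= TRACE_BUF_SIZE);
    /* Atomic fetch-and-add: every caller gets a distinct start offset. */
    size_t start = atomic_fetch_add(&trace_pos, len);
    if (start > TRACE_BUF_SIZE || len > TRACE_BUF_SIZE - start)
        return NULL;                    /* overflow: flush needed */
    return trace_buf + start;
}

/* Copy one record into the buffer. Returns 1 on success, 0 on overflow. */
static int trace_write(const void *data, size_t len)
{
    unsigned char *dst = trace_reserve(len);
    if (dst == NULL)
        return 0;
    memcpy(dst, data, len);             /* other threads may write concurrently */
    return 1;
}
```

Note that a failed reservation still advances trace_pos, so every later reservation fails as well until a flush resets the position.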
Problem: what if the other threads are still writing during a flush? Solution: make flush wait on a semaphore, essentially parking all threads.
Problem: what if some threads are stuck in syscalls (e.g., futex) for a really long time? Solution: make them decrement a semaphore while they are executing a syscall. Note: we can also count active writers, but this is wasteful, since it needs to be done by each IRSB.
Let's use a proper counter: a variable plus a mutex and a condition variable for when it reaches 0 (how to do this in Valgrind? pthread is most likely prohibited). This matters because a new thread may be created while a flush is in progress. Initially the counter equals the number of threads, and it is incremented when a new thread is spawned. On syscall entry it is decremented; on syscall exit it is incremented. How to hook syscall exit in Valgrind? Entry is hooked using bb->jumpkind. We also need a dirty helper for both. On a flush, the counter is decremented. If it becomes zero because of us, we flush and signal the other threads through the condition variable that they may proceed. Otherwise we wait on the condition variable.
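A rough sketch of this counter protocol in C11 is below. Since pthread may not be usable, it swaps the mutex and condition variable for an atomic counter plus a spin-wait on a generation number; that substitution is an assumption of the sketch, not something decided here. All names (active_threads, flush_generation, do_flush, flush_barrier, on_syscall_entry, on_syscall_exit, on_thread_spawn) are hypothetical, and do_flush only counts calls. The sketch does not handle a syscall exit racing with a flush, or the last active thread entering a syscall while others are parked.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int  active_threads = 1;  /* threads that may still be writing */
static atomic_uint flush_generation;    /* bumped after each completed flush */
static unsigned    flush_count;         /* illustration only */

/* Hypothetical flush: a real one would write out and reset the buffer. */
static void do_flush(void) { flush_count++; }

static void on_thread_spawn(void)  { atomic_fetch_add(&active_threads, 1); }
static void on_syscall_entry(void) { atomic_fetch_sub(&active_threads, 1); }
static void on_syscall_exit(void)  { atomic_fetch_add(&active_threads, 1); }

/* Called by a thread that found the buffer full. */
static void flush_barrier(void)
{
    /* Read the generation before parking, so a flush that completes
     * right after our decrement is not missed. */
    unsigned gen = atomic_load(&flush_generation);

    if (atomic_fetch_sub(&active_threads, 1) == 1) {
        /* The counter became zero because of us: nobody else is writing. */
        do_flush();
        atomic_fetch_add(&flush_generation, 1);   /* release parked threads */
    } else {
        /* Someone else will flush: park until the generation changes. */
        while (atomic_load(&flush_generation) == gen)
            ;                                     /* spin-wait stand-in for a cv */
    }
    atomic_fetch_add(&active_threads, 1);         /* we are a writer again */
}
```

Each thread re-registers itself after the flush, so the counter again reflects the number of threads that may write.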
Problem: we need to sync syscall exit with a flush in case one is happening. Create a 2nd global (flush_in_progress)?
Problem: currently we flush after writing an entry (by observing a high watermark). Here we would definitely need to flush before writing instead. Also, flush will need to return the buffer position for use by each thread. Otherwise one thread may fill the buffer right after the flush, and the other threads will need to re-flush. But this will require a loop in the IR, which is hard to organize.
per-thread trace buffers? locks?