Open MalhotraPulak opened 3 years ago
Tcache implementation in malloc.c
typedef struct tcache_perthread_struct
{
  char counts[TCACHE_MAX_BINS];
  tcache_entry *entries[TCACHE_MAX_BINS];
} tcache_perthread_struct;
static __thread bool tcache_shutting_down = false;
static __thread tcache_perthread_struct *tcache = NULL;
The core issue here is to get, for each thread, the base address of its tcache, a struct of type tcache_perthread_struct. Once you have that address, you can easily print the bins using that struct.
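To make the "print the bins" step concrete, here is a minimal sketch that decodes the raw bytes of a tcache_perthread_struct. It assumes a little-endian 64-bit target and TCACHE_MAX_BINS = 64. The names parse_tcache and format_bins are made up for illustration and are not part of rizin or gef. The one-byte count matches the struct quoted above; newer glibc releases use a 16-bit count, so the width is a parameter.

```python
import struct

TCACHE_MAX_BINS = 64  # assumed glibc default; check the target's malloc.c


def parse_tcache(raw, count_size=1, ptr_size=8):
    """Split the raw bytes of a tcache_perthread_struct into (counts, entries).

    count_size=1 matches the `char counts[]` layout quoted above;
    pass 2 for glibc builds where the counts are 16-bit.
    """
    count_fmt = {1: "B", 2: "H"}[count_size]
    ptr_fmt = {4: "I", 8: "Q"}[ptr_size]
    counts_len = TCACHE_MAX_BINS * count_size
    # counts[] comes first, entries[] follows immediately after it.
    counts = struct.unpack_from(f"<{TCACHE_MAX_BINS}{count_fmt}", raw, 0)
    entries = struct.unpack_from(f"<{TCACHE_MAX_BINS}{ptr_fmt}", raw, counts_len)
    return list(counts), list(entries)


def format_bins(counts, entries):
    """Return one printable line per non-empty bin."""
    lines = []
    for idx, (count, head) in enumerate(zip(counts, entries)):
        if count:
            lines.append(f"bin {idx}: count={count} head={head:#x}")
    return lines
```

The raw bytes would come from whatever memory reader the debugger provides, read at the tcache base address of the thread in question.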
How does gef solve this issue?
def find_tcache():
    """Return the location of the current thread's tcache."""
    try:
        # For multithreaded binaries, the tcache symbol (in thread local
        # storage) will give us the correct address.
        tcache_addr = gdb.parse_and_eval("(void *) tcache")
    except gdb.error:
        # In binaries not linked with pthread (and therefore there is only
        # one thread), we can't use the tcache symbol, but we can guess the
        # correct address because the tcache is consistently the first
        # allocation in the main arena.
        heap_base = HeapBaseFunction.heap_base()
        if heap_base is None:
            err("No heap section")
            return 0x0
        tcache_addr = heap_base + 0x10
    return tcache_addr
gef calls find_tcache from each thread to get the base address held in that thread's tcache pointer. The function evaluates the expression (void *) tcache, whose value is that address.
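The per-thread lookup can be sketched roughly as follows. This is not gef's actual code, only an illustration of the idea using GDB's Python API (selected_inferior, threads, switch, parse_and_eval). The function name tcache_by_thread is hypothetical, and the gdb module is passed in as a parameter only so the sketch can be exercised outside a debugger.

```python
def tcache_by_thread(gdb_api):
    """Map each thread number to the value of its thread-local tcache pointer.

    gdb_api is the `gdb` module when this runs inside GDB.
    """
    previous = gdb_api.selected_thread()
    found = {}
    try:
        for thread in gdb_api.selected_inferior().threads():
            # Thread-local symbols are resolved against the selected thread.
            thread.switch()
            try:
                found[thread.num] = int(gdb_api.parse_and_eval("(void *) tcache"))
            except gdb_api.error:
                # For example, the libc debug info is missing.
                found[thread.num] = None
    finally:
        # Restore whichever thread the user had selected.
        if previous is not None:
            previous.switch()
    return found
```

Inside a GDB session this would be called as tcache_by_thread(gdb) after `import gdb`.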
A few points to note about the tcache symbol:
- The tcache symbol is not part of the executable.
- The tcache symbol is loaded from the shared library libc at run time.
- Libc is usually stripped, but the tcache symbol still loads. This is because gdb loads the debug symbols of libc from /usr/lib/debug/lib/x86_64-linux-gnu/libc-xxx.so. If this debug info file is not present, this method fails.
- The value of the tcache symbol is generally a small number like 0x40. This is not the address of the variable but just the offset of the variable from the start of the TLS block for the Glibc shared library. Let's call this offset tcache_offset.
- The address of the variable for a given thread is dtv[module_id] + tcache_offset.
- dtv is the dynamic thread vector that is part of the thread control block (TCB) of a given thread. The TCB is pointed to by the value of the fs register.
- module_id is the id assigned to a shared library by the linker at run time. module_id is always 1 for the main executable and increments(?) for other shared libraries. We are concerned with getting the module_id for libc.
We need three main things to solve this issue:
- dtv: This can be found by dereferencing the pointer in the fs register as a pthread struct.
- tcache_offset: Load this from the debug info file of libc, as gdb does.
- module_id: This one I am not sure about. There are a few ways which might work. First, traverse the link-map structs; maybe they are in the same order as the module ids. Second, find the known tcache_offset in the GOT and assume the integer before it is the module_id. Third, there is a way to detect library loads from ptrace, so we can get the load order from there, and the module ids might follow the same order.
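Once those three values are known, the lookup dtv[module_id] + tcache_offset could be sketched as below. The layout constants are assumptions for x86-64 glibc (dtv pointer at offset 8 in the TCB, 16-byte dtv slots whose first field is the TLS block address) and should be checked against the target's glibc. read_mem is a hypothetical callback standing in for the debugger's memory reader, and tcache_struct_addr is an illustrative name.

```python
import struct

# Assumed x86-64 glibc layout; verify against the target before relying on it.
TCB_DTV_OFFSET = 8    # tcbhead_t: { void *tcb; dtv_t *dtv; ... }
DTV_ENTRY_SIZE = 16   # each dtv slot holds two pointer-sized fields
PTR_SIZE = 8
TLS_UNALLOCATED = 0xFFFFFFFFFFFFFFFF  # (void *) -1, marks a block not yet allocated


def read_ptr(read_mem, addr):
    """Read one little-endian pointer using the caller-supplied reader."""
    return struct.unpack("<Q", read_mem(addr, PTR_SIZE))[0]


def tcache_struct_addr(read_mem, fs_base, module_id, tcache_offset):
    """Return the address stored in a thread's `tcache` variable, or None."""
    dtv = read_ptr(read_mem, fs_base + TCB_DTV_OFFSET)
    tls_block = read_ptr(read_mem, dtv + module_id * DTV_ENTRY_SIZE)
    if tls_block == TLS_UNALLOCATED:
        return None
    # The TLS variable `tcache` is itself a pointer, so dereference once more
    # to reach the tcache_perthread_struct.
    return read_ptr(read_mem, tls_block + tcache_offset)
```

The sketch leaves open how module_id is obtained, which is the unresolved part discussed above.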
This issue has been automatically marked as stale because it has not had recent activity. Considering a lot has probably changed since its creation, we kindly ask you to check again if the issue you reported is still relevant in the current version of rizin. If it is, update this issue with a comment, otherwise it will be automatically closed if no further activity occurs. Thank you for your contributions.
Is your feature request related to a problem? Please describe.
In the Glibc heap, a separate tcache is created for each thread. Rizin uses arenas to find and parse the tcaches. Consequently, the dmht command does not display all the tcaches when the number of threads is greater than the number of arenas, i.e. when multiple threads share an arena.
Here is an example binary:
The binary above spawns 100 threads (101 if you include the main thread) and populates the tcache in each thread. The output for Rizin:
Rizin reports a total of 48 arenas, which is accurate (6 cores * 8). Now the output for the dmht command:
Rizin finds 141 chunks across 47 tcache bins (3 chunks per bin and 1 bin per thread). This is incorrect, as a total of 100 threads were created and each thread has its own tcache. We can verify this using the dev build of GEF, which recently fixed an issue like this.
Describe the solution you'd like
A GDB-like output is expected, where 100 populated tcache bins are found. Printing the thread ID instead of the arena address also seems a better way to convey to the user that a tcache belongs to a thread, not an arena. As I am working on the Cutter Heap Viewer at the moment, I will give this issue a try right now and resolve it before I refactor the tcache part and implement tcache in the Cutter heap viewer.