GJDuck / e9patch

A powerful static binary rewriting tool
GNU General Public License v3.0

Can you add stdlib support for malloc_usable_size() #45

Closed restarre closed 2 years ago

restarre commented 2 years ago

Hi, I am using e9patch/e9tool/e9afl to instrument my binary in order to detect heap allocations, but I found that the e9patch stdlib does not support malloc_usable_size(). I therefore turned to dlopen and dlsym to load a shared library that contains a function which calls malloc_usable_size(). However, I encountered a segmentation fault when executing that function after loading it with dlsym. So far, I do not know how to solve this problem. Could you add stdlib support for malloc_usable_size()? Or could you give me some advice on this question?

GJDuck commented 2 years ago

Firstly, the stdlib implementation of malloc is independent of the glibc version, and they cannot be mixed. So do you want to:

  1. call glibc's malloc_usable_size() on a glibc malloc'ed pointer; or
  2. implement a stdlib version of malloc_usable_size() that can be used on stdlib malloc'ed pointers?

If it is the first one then using the stdlib dlopen is the correct approach. However, as noted in the stdlib implementation:

    BE AWARE OF ABI ISSUES.  External library code probably uses the SYSV
    ABI meaning that the program may crash if you try and call it from a
    clean call.  To avoid this, the dlcall() helper function may be used
    to safely switch to the SYSV ABI.

Basically, you cannot call functions returned by the stdlib dlsym directly. Instead, you must wrap the call using dlcall, e.g.:

    result = dlcall(malloc_usable_size_ptr, p);

Be aware that dlcall is a relatively expensive operation as it must save/restore the x64 extended register state.

restarre commented 2 years ago

Actually, I tried using dlcall, and also the non-recommended way of calling the function directly. Both resulted in a segmentation fault, so I do not know what to do. Below is the command I used: ./e9tool -M 'call and target == &malloc' -P 'inc(&rax)@test' ./binary

The code used to create the shared library

    #include <malloc.h>
    #include <stdio.h>

    void get_allocated_size(void *ptr, int used)
    {
        used = malloc_usable_size(ptr);
    }

Information about the segmentation fault

    Program received signal SIGSEGV, Segmentation fault.
    musable (mem=0x555555400736 <main>) at malloc.c:4855
    4855    malloc.c: No such file or directory.
    (gdb) bt
    #0  musable (mem=0x555555400736 <main>) at malloc.c:4855
    #1  __malloc_usable_size (m=0x555555400736 <main>) at malloc.c:4867
    #2  0x00007ffff7fc5138 in get_allocated_size ()
       from /home/kalo/e9afl-master/dynamic/libmalloc.so
    #3  0x00005555c54024ba in ?? ()
    #4  0x0000000000000000 in ?? ()

GJDuck commented 2 years ago

OK, I think it is a different problem:

Below is the command I used: ./e9tool -M 'call and target == &malloc' -P 'inc(&rax)@test' ./binary

This will insert the inc(&rax) call before the call instruction, so the %rax register will just contain a junk value at that point rather than malloc's return value, and this likely explains the crash.

Fixing it is a bit tricky. E9Tool supports an after annotation that inserts the instrumentation after the matching instruction. However, this does not work for control-flow-transfer instructions such as call, since there is no "after" in this case.

One way to solve it would be to instrument the next instruction after the call, e.g.:

    ./e9tool -M 'I[-1].call and I[-1].target == &malloc' -P 'inc(&rax)@test' ./binary

An alternative would be to use the replace annotation, which replaces the matching instruction entirely:

    ./e9tool -M 'call and target == &malloc' -P 'replace inc(&rax)@test' ./binary

With this approach, inc() must call glibc's malloc itself and store the resulting value in %rax, in addition to performing any instrumentation.
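
As a rough illustration of that last step, the logic the replacement has to perform can be sketched in ordinary C, outside of E9Patch. The names traced_malloc and last_usable_size are hypothetical, and this sketch calls glibc directly; inside real instrumentation code the glibc functions would have to be reached through the stdlib dlsym and dlcall, as described earlier in the thread.

```c
#include <malloc.h>   /* malloc_usable_size (glibc extension) */
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: usable size of the most recent allocation. */
static size_t last_usable_size = 0;

/*
 * Hypothetical stand-in for the replaced call: perform the allocation
 * first, then inspect the pointer that malloc actually returned.
 */
static void *traced_malloc(size_t size)
{
    void *ptr = malloc(size);
    if (ptr != NULL)
        last_usable_size = malloc_usable_size(ptr);
    return ptr;   /* the original caller expects this value in %rax */
}
```

The point of the ordering is that malloc_usable_size() only ever sees a pointer that malloc has already returned, which is exactly what the original before-the-call instrumentation could not guarantee.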

restarre commented 2 years ago

Yes, you are right: I passed a wrong address to the dynamic library. I think I now know how to fix this problem. I have some questions about how to use this tool. If <malloc@plt> is included in the .plt.sec section, it seems that calls to malloc are not matched (using -M 'call and target == &malloc'). Am I using the tool in the wrong way? And could you explain how e9patch/e9tool matches a call instruction according to the name of the function? AFAIK, it can match the target of a call instruction with information in the PLTInfo. Is there another way to match a call instruction by function name?

GJDuck commented 2 years ago

I think the original issue is solved.

If <malloc@plt> is included in the .plt.sec section, it seems that calls to malloc are not matched

I wasn't aware of .plt.sec, but it seems that support would need to be added else it will not match.

And could you explain how e9patch/e9tool matches a call instruction according to the name of the function?

Basically, it parses the name malloc and tries to match it against some ELF "object", which can be a PLT entry, symbol, section name, etc., in some order.

restarre commented 2 years ago

Thanks for your answer. I have two last questions. First, how can I call a function included in the binary from an E9Tool plugin? (The binary is compiled by e9compile.sh.) Second, since I am a novice in this area, I do not know how to get the right machine code to use in the plugin. Is there a tool that converts assembly code to machine code?

GJDuck commented 2 years ago

First, how can I call a function included in the binary from an E9Tool plugin?

It is a bit complicated. You can load the binary via parseELF() and sendELFFileMessage(), then you need to find the corresponding address of the function (you can use getSymbol() to find it). Then you need to emit instructions to set up the args and actually call the function. This requires x64 and ABI knowledge.

I do not know how to get the right machine code to use in the plugin.

For simple applications, you can just write the instructions you want (in x64 assembly) into a file (e.g., file.s) then compile:

    gcc -c file.s
    objdump -d file.o

Then copy the corresponding bytes. It gets a lot more complicated if you want to vary/specialize the instructions per trampoline. For example, see https://github.com/GJDuck/RedFat/blob/master/RedFatPlugin.cpp for an advanced example.
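
To make the "copy the corresponding bytes" step concrete, here is a minimal sketch in C. It assumes file.s contained just `inc %rax` followed by `nop`, for which the disassembly shows the bytes 48 ff c0 and 90. The array name and the format_hex helper are hypothetical and not part of the E9Tool plugin API, and the two instructions are chosen only to show the copying step, not as a meaningful trampoline.

```c
#include <stddef.h>
#include <stdio.h>

/* Bytes copied by hand from the disassembly of: inc %rax ; nop */
static const unsigned char inc_rax_code[] = {
    0x48, 0xff, 0xc0,   /* inc %rax */
    0x90                /* nop      */
};

/*
 * Hypothetical helper: render the bytes as space-separated hex so they
 * can be compared against the objdump listing. Returns 0 on success,
 * -1 if the output buffer is too small.
 */
static int format_hex(const unsigned char *code, size_t len,
                      char *out, size_t out_size)
{
    size_t used = 0;
    if (out_size == 0)
        return -1;
    out[0] = '\0';
    for (size_t i = 0; i < len; i++) {
        int n = snprintf(out + used, out_size - used,
                         i == 0 ? "%02x" : " %02x", code[i]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

Checking the embedded array against the disassembly in this way is a cheap guard against transcription mistakes when bytes are copied manually.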

restarre commented 2 years ago

Sorry to bother you again. I want to use more than one trampoline template in the plugin, but I find that it is hard to do this with one plugin. Can a plugin only insert one trampoline template? Do you have any suggestions for this?

GJDuck commented 2 years ago

Can a plugin only insert one trampoline template? Do you have any suggestions for this?

It should be possible. During initialization, you can emit multiple trampoline templates (e.g., using sendTrampolineMessage()) with different names (e.g., $name1, $name2, etc.).

Then during the patching:

  1. In e9_plugin_code_v1(), emit a single macro name (e.g., "$xxx").
  2. In e9_plugin_patch_v1(), emit a key-value pair that expands $xxx to the corresponding $names that should be applied to the matching instruction, e.g.:

    "$xxx":["$name1", "$name4"]

I've not tested this idea, however.