radareorg / radare2

UNIX-like reverse engineering framework and command-line toolset
https://www.radare.org/
GNU Lesser General Public License v3.0

Make C++ STL library function names simpler #14794

Open ikey4u opened 5 years ago

ikey4u commented 5 years ago

When reverse engineering a C++ program, STL functions have very long names in the disassembly, which makes analysis difficult. Could we make these names shorter and clearer?

Take the following simple c++ program as an example:

// main.cpp
#include <iostream>
#include <vector>
#include <string>

using namespace std;

int main(int argc, char *argv[]) {
    vector<string> v;
    v.push_back("hello");
    string& x = v[0];
    v.push_back("world");
    cout << x << endl;
    return 0;
}

Compile it with g++ -Wall main.cpp -o main.

When you open the main executable in radare2, you get the following disassembly:


[0x1000006ad]> s entry0
[0x100000640]> pd 20
            ;-- main:
            ;-- section.0.__TEXT.__text:
            ;-- _main:
            ;-- func.100000640:
┌ (fcn) entry0 251
│   entry0 (int32_t arg1, int32_t arg2);
│ bp: 10 (vars 10, args 0)
│ sp: 0 (vars 0, args 0)
│ rg: 2 (vars 0, args 2)
│           0x100000640      55             push rbp                   ; [00] -r-x section size 9476 named 0.__TEXT.__text
│           0x100000641      4889e5         mov rbp, rsp
│           0x100000644      4881ec900000.  sub rsp, 0x90
│           0x10000064b      c745fc000000.  mov dword [var_4h], 0
│           0x100000652      897df8         mov dword [var_8h], edi    ; arg1
│           0x100000655      488975f0       mov qword [var_10h], rsi   ; arg2
│           0x100000659      488d7dd8       lea rdi, [var_28h]
│           0x10000065d      e84e010000     call method.std::__1::vector_std::__1::basic_string_char__std::__1::char_traits_char___std::__1::allocator_char_____std::__1::allocator_std::__1::basic_string_char__std::__1::char_traits_char___std::__1::allocator_char_______.vector
│           0x100000662      488d352b2800.  lea rsi, str.hello         ; section.4.__TEXT.__cstring
│                                                                      ; 0x100002e94 ; "hello"
│           0x100000669      488d7dc0       lea rdi, [var_40h]
│           0x10000066d      e80e020000     call method.std::__1::basic_string_char__std::__1::char_traits_char___std::__1::allocator_char___::basic_string_std::__1.nullptr_t__char_const
│       ┌─< 0x100000672      e900000000     jmp 0x100000677
│       │   ; CODE XREF from entry0 @ 0x100000672
│       └─> 0x100000677      488d7dd8       lea rdi, [var_28h]
│           0x10000067b      488d75c0       lea rsi, [var_40h]
│           0x10000067f      e84c010000     call method std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >::push_back(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&) ; method.std::__1::vector_std::__1::basic_string_char__std::__1::char_traits_char___std::__1::allocator_char_____std::__1::allocator_std::__1::basic_string_char__std::__1::char_traits_char___std::__1::allocator_char_______.push_back_std::__1::basic_string
│       ┌─< 0x100000684      e900000000     jmp 0x100000689
│       │   ; CODE XREF from entry0 @ 0x100000684
│       └─> 0x100000689      488d7dc0       lea rdi, [var_40h]
│           0x10000068d      e8fa240000     call sym std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >::~basic_string() ; sym.std::__1::basic_string_char__std::__1::char_traits_char___std::__1::allocator_char___::_basic_string
│       ┌─< 0x100000692      e900000000     jmp 0x100000697
│       │   ; CODE XREF from entry0 @ 0x100000692
│

In the terminal, these lines are so long that they wrap onto the next line, which makes the disassembly inconvenient to read.

Another big issue I found is in visual mode: the line calling an STL function is not shown in full, which makes it hard to figure out what the function is, as shown in the image below:

image

My environment information:

OS
    Mac Mojave 10.14.4
r2
    radare2 3.7.0 22641 @ darwin-x86-64 git.3.7.0-55-g40f4db6ea
    commit: 40f4db6eaaa709261db119ed3d791cd71930736c build: 2019-08-12__14:54:25

The compiled binary is here:

democ++.zip

XVilka commented 5 years ago

Please attach the binaries.

ikey4u commented 5 years ago

Please attach the binaries.

The attachment has been uploaded.

meme commented 5 years ago

This is just a result of template expansion in the STL; it is the reverser's job to discern the exact meaning. I do agree, though, that reading C++ symbols in radare2 is a complete pain in the ass, since the characters used in demangled C++ symbols (<, >, etc.) are not used in radare2's symbol listing. For perspective, IDA simply leaves the name in its mangled form and adds a comment with the demangled form next to it in the linear disassembly view. In Binary Ninja, the "proper" demangled name is displayed in all occurrences, and it also "shortens" names like

call method.std::__1::vector_std::__1::basic_string_char__std::__1::char_traits_char___std::__1::allocator_char_____std::__1::allocator_std::__1::basic_string_char__std::__1::char_traits_char___std::__1::allocator_char_______.vector

to

call    std::__1::vector<std::__...d::__1::allocator<char> > > >::vector

Perhaps an option for eliding "unimportant" template internals in the disassembly listing is a feature worth the work...
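
The eliding idea above could be prototyped outside radare2 as a small string pass. The sketch below is only an illustration under stated assumptions, not radare2 or Binary Ninja code: `elide_templates` and `max_depth` are hypothetical names, and the pass naively treats every `<` and `>` as a template bracket, so names containing `operator<` or `operator->` would need extra handling.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of radare2): replaces the contents of
// template argument lists nested deeper than max_depth with "...".
// Assumption: every '<' opens and every '>' closes a template list.
std::string elide_templates(const std::string& name, int max_depth) {
    assert(max_depth >= 0);
    std::string out;
    int depth = 0;  // number of currently open '<' brackets
    for (char c : name) {
        if (c == '<') {
            if (depth < max_depth) {
                out += '<';      // shallow enough: keep the bracket as-is
            } else if (depth == max_depth) {
                out += "<...";   // first elided level: leave a marker
            }
            ++depth;
        } else if (c == '>' && depth > 0) {
            --depth;
            if (depth <= max_depth) {
                out += '>';      // close a bracket that was emitted
            }
        } else if (depth <= max_depth) {
            out += c;            // ordinary character at a visible level
        }
    }
    return out;
}
```

With `max_depth` set to 0, the demangled `push_back` symbol from the listing in this issue would come out as `std::__1::vector<...>::push_back(std::__1::basic_string<...> const&)`. A depth of 1 keeps the outermost template arguments and elides only what is nested inside them.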

radare commented 5 years ago

e bin.demangle=false or e asm.demangle=false should be good for you, I guess, but it's indeed a pain.
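
For reference, the mangled and demangled forms of a symbol that these options choose between can be compared with a few lines of C++. This is a minimal sketch assuming a GCC or Clang toolchain, where `abi::__cxa_demangle` from `<cxxabi.h>` is available; the `demangle` wrapper name is hypothetical, and this is not radare2's own demangling code.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <memory>
#include <string>

// Hypothetical wrapper: returns the demangled form of an Itanium-ABI
// mangled name, or the input unchanged if demangling fails.
std::string demangle(const char* mangled) {
    assert(mangled != nullptr);
    int status = 0;
    // __cxa_demangle allocates the result with malloc, so free() releases it.
    std::unique_ptr<char, decltype(&std::free)> out(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), &std::free);
    if (status == 0 && out) {
        return std::string(out.get());
    }
    return std::string(mangled);
}
```

For example, `demangle("_Z3fooi")` gives `foo(int)`. For STL symbols the mangled form is usually the more compact of the two, which is the trade-off behind turning demangling off.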
