radareorg / radare2

UNIX-like reverse engineering framework and command-line toolset
https://www.radare.org/
GNU Lesser General Public License v3.0
20.71k stars 3.01k forks source link

Use mangled symbol names for zignatures #21546

Open trufae opened 1 year ago

trufae commented 1 year ago

As mentioned in https://github.com/radareorg/radare2/issues/21536#issuecomment-1489587530 by @yuzhichang

The signature name is converted from the mangled C++ function name. However it's not possible to infer the mangled C++ function name from the signature name. This cause a minor difficulty to identify which function of a signature is for. I would prefer use the mangled one as the signature name, use the demangled one as signature types.

For example,

zhichyu@ck98:~/grpc_whl$ nm notstripped/1.41.1-cygrpc.cpython-310-x86_64-linux-gnu.so | grep grpc_chttp2_list_pop_writing_stream
000000000017ff10 t _Z35grpc_chttp2_list_pop_writing_streamP21grpc_chttp2_transportPP18grpc_chttp2_stream
zhichyu@ck98:~/grpc_whl$ c++filt _Z35grpc_chttp2_list_pop_writing_streamP21grpc_chttp2_transportPP18grpc_chttp2_stream
grpc_chttp2_list_pop_writing_stream(grpc_chttp2_transport*, grpc_chttp2_stream**)
The signature for this function is:

    {
        "addr": 1572624,
        "name": "dbg.grpc_chttp2_list_pop_writing_stream_grpc_chttp2_transport__grpc_chttp2_stream_",
        "next": "dbg.grpc_chttp2_list_add_waiting_for_concurrency(grpc_chttp2_transport*, grpc_chttp2_stream*)",
        "types": "void dbg.grpc_chttp2_list_pop_writing_stream_grpc_chttp2_transport__grpc_chttp2_stream_ ()",
...
    }
radare commented 1 year ago

I have added zign.mangled eval var (disabled by default) and added a test for this. the symbol names are not demanlged when applied later. so this must be done later after some more testing. can you please test it and confirm the behaviour is correct for you?

thank you!

yuzhichang commented 1 year ago

@radare With e zign.mangled=true, zg hangs, cost 100% CPU, and the RES becomes larger and larger.

Mon Apr 10 10:49:46 AM CST 2023
radare2 5.8.5 30228 @ linux-x86-64 git.5.8.4-111-g4968d69f18
commit: 4968d69f1848c2f9f2333d649a845009b59ba2a7 build: 2023-04-10__09:53:17
Linux x86_64

The so is notstripped/1.41.1-cygrpc.cpython-310-x86_64-linux-gnu.so.

perf record output:

-   41.02%     0.00%  radare2  [.] __libc_start_call_main                          ▒
     __libc_start_call_main                                                        ▒
     main                                                                          ◆
     r_main_radare2                                                                ▒
     r_core_prompt_loop                                                            ▒
     r_core_prompt_exec                                                            ▒
     r_core_cmd                                                                    ▒
     run_cmd_depth                                                                 ▒
     r_core_cmd_subst                                                              ▒
     r_core_cmd_subst_i                                                            ▒
     r_cmd_call                                                                    ▒
     cmd_zign                                                                      ▒
     cmdAdd                                                                        ▒
   + r_sign_all_functions                                                          ▒
+   24.66%    18.08%  radare2  [.] _int_malloc                                     ▒
+   15.43%    14.91%  radare2  [.] malloc_consolidate                              ▒
trufae commented 1 year ago

I dont think it hangs but it takes much more time because resolving symbols via rbin is o(n) and that bin have a lot of aymbols. To make it o(1) i need to break the abi or use globals. Can u try with a smaller reproducer? I plan to implement it propeely flr 5.9 where i can break the abi

yuzhichang commented 1 year ago

@trufae I confirm that e zign.mangled=true, zg use the mangled one as the signature name. Thanks for your help!

yuzhichang commented 1 year ago

Update: zg for notstripped/1.41.1-cygrpc.cpython-310-x86_64-linux-gnu.so cost ~116 minutes.

trufae commented 1 year ago

Yeah that's quite a lot of time because of the O(n) problem of finding symbols in a list. i'll have some time this week to cache that into a hashtable, but i would prefer to break abi and store the mangled name inside the Function struct instead

trufae commented 9 months ago

All the mangling symbol naming logic has been wrapped into rbinname. This shoyld be enough to get this issue resolved. Can you give it a try again and see if the current behaviour works best for u? We have only one test about zignatures and mangling but as long as the code logic is stable now it should be good to add more tests and review the way it works

trufae commented 9 months ago

@as0ler did a bunch of fixes in this logic, we should be adding more tests and verify everything works as intended now. can anyone do these tests?

yuzhichang commented 9 months ago

@trufae I tested the revision 7552613e0959029f4bad5a3bd2e74e3f8932b337, e zign.mangled=true;e anal.hasnext=true;afr;aac;zg; zos 1.json doesn't work as expected! A piece of 1.json is:

"name": "sym.grpc_chttp2_list_pop_waiting_for_concurrency_grpc_chttp2_transport__grpc_chttp2_stream_",
"next": "sym.grpc_chttp2_list_remove_waiting_for_concurrency_grpc_chttp2_transport__grpc_chttp2_stream_",

What I expect is:

"name": "sym._Z44grpc_chttp2_list_pop_waiting_for_concurrencyP21grpc_chttp2_transportPP18grpc_chttp2_stream",
"next": "sym._Z47grpc_chttp2_list_remove_waiting_for_concurrencyP21grpc_chttp2_transportP18grpc_chttp2_stream",
trufae commented 8 months ago

enotime to check this issue right now, i have many other urgent tickets and in theory it shuold not be needed to break ABI to fix it after the RBinName refactor. So i'm calling @as0ler in case he can take a look to fix it meanwhile. but im moving the ticket forward to not block the already delayed release.

sorry for the slow timings