radareorg / radare2

UNIX-like reverse engineering framework and command-line toolset
https://www.radare.org/
GNU Lesser General Public License v3.0

Question about imports #14843

Open laminenoureddine opened 5 years ago

laminenoureddine commented 5 years ago

Hello,

A very simple question: in the sequence below, how can I, with ESIL, avoid stepping into the code of sym.imp.func?

    address1  call dword [sym.imp.function]
    address2  mov ...

Ideally, when I meet a call [sym.import...], I just want to replace the import's address with the address of the next instruction in the sequence (i.e. the mov instruction in this case). I don't want to step over, because I need to see the code of the other calls. My only concern is imports.

Could you please help with some commands that automate this for all the sym.imports that I can find in the binary?

Thanks a lot for your help in advance

radare commented 5 years ago

With ahe you can replace the ESIL expression at any address. ESIL expressions can contain r2 commands or run r2pipe scripts that change any register or memory value. You can also use aep to set an ESIL pin. This is basically the same as an ESIL hint, but it only runs r2 commands instead of replacing ESIL expressions.

The help message of aep is not clear at all, but here's a simple example:

    $ r2 -
    aep ?e hello world @ 0
    aes
    Hello world

The ESIL pin was an idea from before ESIL expressions were able to run r2 commands, and it got stuck waiting for feedback 2 years ago. So I would be happy to hear your comments on this.

How would you expect to solve this without knowing any of this? I would be happy to deprecate ESIL pins and just use ESIL hints. But we should provide an easy way to reimplement most libc imports, either with one command or with a script that ships with r2, maybe with some custom implementations like the ones we have for syscalls.

Another option I can think of is to treat calls specially and be able to reimplement or ignore any call that falls in a memory range (the PLT), and just return 0. (We have wao to write portable assembly.) So another solution is to use io.cache and reimplement those instructions by patching them in memory.
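As a rough illustration of the "ignore any call that falls in a memory range" idea, here is a toy model in Python. It is not radare2 code: the `Cpu` class, `emulate_call`, and the address ranges are hypothetical names made up for this sketch, and the "return register" simply stands in for rax/eax.

```python
from dataclasses import dataclass, field


@dataclass
class Cpu:
    """Minimal stand-in for emulator state (hypothetical, not an r2 structure)."""
    pc: int = 0
    sp: int = 0x8000
    ret: int = 0  # stand-in for the return-value register (rax/eax)
    stack: dict = field(default_factory=dict)


def in_ranges(addr, ranges):
    """True if addr lies inside any half-open [lo, hi) range."""
    return any(lo <= addr < hi for lo, hi in ranges)


def emulate_call(cpu, target, insn_size, import_ranges, word=8):
    """Emulate one call instruction located at cpu.pc.

    Calls whose target falls in an import range are not entered:
    the return register is set to 0 and execution resumes right
    after the call. Every other call is entered normally.
    """
    return_addr = cpu.pc + insn_size
    if in_ranges(target, import_ranges):
        cpu.ret = 0
        cpu.pc = return_addr
        return "skipped"
    cpu.sp -= word                    # push the return address
    cpu.stack[cpu.sp] = return_addr
    cpu.pc = target
    return "entered"
```

With an assumed import range of 0x402000-0x402100, a call to 0x402000 leaves the stack untouched and continues at the next instruction, while a call to a local function is followed as usual.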

As I said, I understand this is a very common problem when using ESIL in many real cases, and I want/need feedback to get this right/tight so it makes sense for everyone and not just me :) But there are already a bunch of possible solutions right now, like the ESIL hint way:

ahe rsp,[8],rip,=,8,rsp,+= @@ sym.imp*

Obviously this is not portable, and I think we should have a bunch of cross-platform ESIL expressions or a way to ignore calls.
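To see what a ret-style hint does, it can help to evaluate it by hand. The sketch below is a tiny Python model of a postfix ESIL subset, not r2's real ESIL VM; `run_esil` is a hypothetical helper, and it only understands numbers, register names, `[8]`, `=`, `+=` and `-=`. It uses the x86-64 form `rsp,[8],rip,=,8,rsp,+=`: load the return address from the top of the stack into rip, then move rsp up by 8, since popping grows rsp on x86-64.

```python
MASK64 = (1 << 64) - 1


def run_esil(expr, regs, mem):
    """Evaluate a tiny ESIL subset against regs (name -> int) and mem (addr -> int)."""
    stack = []

    def value(token):
        # A register name resolves to its current value; anything else is a number.
        return regs[token] if token in regs else int(token, 0)

    for token in expr.split(","):
        if token == "[8]":
            # Pop an address and push the 8-byte value stored there.
            addr = value(stack.pop())
            stack.append(str(mem[addr]))
        elif token in ("=", "+=", "-="):
            # ESIL is postfix: "src,dst,op" pops dst first, then src.
            dst = stack.pop()
            src = value(stack.pop())
            if token == "+=":
                src = regs[dst] + src
            elif token == "-=":
                src = regs[dst] - src
            regs[dst] = src & MASK64
        else:
            stack.append(token)
    return regs


# Made-up state: the return address 0x400123 sits on top of the stack.
regs = {"rsp": 0x7FF0, "rip": 0x401000}
mem = {0x7FF0: 0x400123}
run_esil("rsp,[8],rip,=,8,rsp,+=", regs, mem)
print(hex(regs["rip"]), hex(regs["rsp"]))
```

The register names are what make the expression non-portable: a 32-bit x86 target would need eip, esp and `[4]` instead.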

Feedback is welcome. If you search for “aep” and “ahe” in the issues, you will find other attempts to simplify this.


laminenoureddine commented 5 years ago

@radare: Thanks a lot for your very detailed answer. It was definitely helpful. And sorry for the super late feedback.

Going back to the context of my issue: I am trying to extract the first 30-50 mnemonics of Windows binaries. Since I'm on Linux, I use ESIL for this tracing. I want traces that look like [push, mov, pop, call, ..., 50th_mnemonic] for each binary. The problem was with Windows API calls, whose code is not present for emulation, so whenever a Windows API call is met, the flow goes to the fffffff sections. That is why I opened this issue.

I tested multiple solutions, focusing on the ones you suggested, i.e. "ahe" and "aep". In the end it worked with "aep". Now, when a Windows API call is met, the hook puts eip (or rather the aepc program counter, since we are talking about ESIL) at the return address.

I list below the commands used for tracing:

    aeim                        (to initialize the registers ....)
    e dbg.trace=true            (enable logging traces)
    aep aepc=[esp] @@ reloc*    (here is the solution that worked for me)
    50aes                       (extract 50 mnemonics)
    dtd                         (output the traces)

I have also attached two screenshots: the first just shows where the problem was, and the second shows the commands I used above. I have attached the tested binary as well.

Armadillo_aitagent.exe.zip Screenshot from 2019-08-27 19-46-15 Screenshot from 2019-08-27 19-38-30

Again, thanks a lot for your very helpful and guiding answer. And please, if you see that the solution may not work in other cases, or simply does not match the issue, give me feedback on that.

radare commented 5 years ago

Yeah, that works, but it's a bit tedious to go through each function. Maybe it would be good to define some boundaries by default, or provide implementations for the most common functions, like for libc. Do you have any proposal to simplify this and do it with commands?

Good to hear that it worked! And thanks for the screenshots and the explanation! Issues are usually a good place to look for answers.
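For tracing many binaries, the command sequence from this thread (aeim, e dbg.trace=true, aep aepc=[esp] @@ reloc*, 50aes, dtd) could be driven from a script. The following is only a sketch under assumptions: `build_trace_commands` and `trace_binary` are hypothetical helper names, the third-party r2pipe package is assumed to be installed for the second function, and the `reloc*` glob and 32-bit `esp` come from the example above and may need adjusting for other targets.

```python
def build_trace_commands(steps=50, pin_glob="reloc*"):
    """Return the r2 commands from the thread, parameterized by step count and flag glob."""
    return [
        "aeim",                           # initialize the ESIL VM stack/registers
        "e dbg.trace=true",               # enable trace logging
        f"aep aepc=[esp] @@ {pin_glob}",  # pin: jump back to the return address
        f"{steps}aes",                    # emulate `steps` instructions
        "dtd",                            # print the collected trace
    ]


def trace_binary(path, steps=50, pin_glob="reloc*"):
    """Run the sequence on one binary and return the output of the last command (dtd)."""
    import r2pipe  # third-party; imported lazily so the builder works without it

    r2 = r2pipe.open(path)
    try:
        output = ""
        for command in build_trace_commands(steps, pin_glob):
            output = r2.cmd(command)
        return output
    finally:
        r2.quit()
```

Calling `trace_binary("sample.exe", steps=30)` on a hypothetical file would then return the trace text for the first 30 emulated instructions, which can be post-processed into a mnemonic list.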