Open j123123 opened 7 years ago
This GCC extension (Labels as Values) can be used: https://gcc.gnu.org/onlinedocs/gcc/Labels-as-Values.html
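A minimal sketch of how that extension might serve a translator, assuming GCC or Clang. The function `run_from` and the three dummy "instructions" are hypothetical and only illustrate an indirect jump into any instruction start, including one in the middle of the sequence:

```c
#include <stdint.h>

/* Hypothetical demo: three translated "instructions" at emulated
   addresses 0, 1 and 2. Returns how many of them ran when entering at
   `entry`, or -1 if `entry` is not a known instruction start. */
int run_from(uint64_t entry)
{
    /* GCC/Clang extension: &&label yields the label's address as void *. */
    static void *const targets[] = { &&insn0, &&insn1, &&insn2 };
    int executed = 0;

    if (entry >= sizeof targets / sizeof targets[0])
        return -1;

    goto *targets[entry];   /* indirect jump, comparable to `jmp rax` */

insn0:
    executed++;             /* falls through to the next instruction */
insn1:
    executed++;
insn2:
    executed++;
    return executed;
}
```

A table indexed by emulated address is only one possible design; a sparse map or a plain `switch` would also work and stays within standard C.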
Sounds like a good idea, but I would build this as a transformation of ESIL.
Look at what this translated source looks like (huge file): https://raw.githubusercontent.com/Javanaise/mrboom-libretro/master/mrboom.c
Hmm, maybe translate ESIL to GCC's GIMPLE or GENERIC internal representation https://gcc.gnu.org/onlinedocs/gccint/GIMPLE.html https://gcc.gnu.org/onlinedocs/gccint/GENERIC.html (i.e. add an ESIL frontend to GCC)?
or LLVM?
@j123123 check radeco project https://github.com/radare/radeco-lib
@XVilka Decompilers are generally not designed to produce compilable source; they are designed to produce human-readable source. If you need to port a binary from one ISA to another, there is no need for the translated source to be human-readable.
And compilers can use dirty tricks that cannot be decompiled easily. For example, when you have two very similar functions, the compiler may decide to split out a common piece of assembly and jump to it (the IAR compiler can do such things).
And I don't like the Rust memory model (just try to implement a doubly linked list in Rust: https://www.reddit.com/r/rust/comments/2u53le/this_is_a_doubly_linked_list_in_safe_rust/ ). As for me, it's a bad approach to making things safe and fast. You just need to prove everything using Frama-C-like tools and SMT solvers, and disallow compiling unproved code (unless you use the "unsafe" keyword or add runtime bounds checking to every unproved place).
btw https://www.reddit.com/r/linux/comments/200jd0/super_genius_notaz_ports_starcraft_to_armwine/ https://github.com/notaz/ia32rtools
You're talking about binary reassembly, which is an even more challenging task than writing a decompiler. Moreover, it's a rare need, unlike a decompiler, so judging by the complexity/demand ratio, the decompiler has higher priority. Anyway, full-featured data flow analysis is required for complete binary reassembly, so it more or less includes the decompilation task.
I don't think what he proposes is harder than decompilation. I've done that by hand in the past, when copy-pasting blocks of disasm into C didn't work because of arch or memory layout restrictions, etc.
I think that's pretty useful and should be easy to do as a transformation of ESIL, like it was done for REIL (generated from ESIL).
Can this be closed in favor of https://github.com/radareorg/radeco, https://github.com/wargio/r2dec, and others?
Well, if it can produce correct and compilable code, then yes. I mean, what about the case where a jump into the middle of an instruction is performed (like in the example in the image)? Does it work?
What I propose is not like radeco or r2dec.
https://github.com/wargio/r2dec-js#r2dec-pseudo-c-code - there I see a while() loop; that is not what I mean.
My idea: every instruction is converted into a code chunk with gotos. Every chunk modifies global variables that emulate the registers. For example, a CMP instruction updates the flag variables, and a JNE instruction reads those variables and either jumps to its target or falls through to the next instruction (code chunk).
For example: '4839d875fe'
$ rasm2 -d -b 64 '4839d875fe'
cmp rax, rbx
jne 3
$ rasm2 -d -b 64 '39d875fe'
cmp eax, ebx
jne 2
$ rasm2 -d -b 64 'd875fe'
fdiv dword [rbp - 2]
$ rasm2 -d -b 64 '75fe'
jne 0
translates to:
label0x00:
{ // cmp rax, rbx : "4839d8"
// set CF, OF, SF, ZF, AF, and PF flags according to the result.
goto label0x03;
}
label0x01:
{ // cmp eax, ebx : "39d8"
// set CF, OF, SF, ZF, AF, and PF flags according to the result.
goto label0x03;
}
label0x02:
{ // fdiv dword [rbp - 2] : "d875fe"
// C code that manipulates an array emulating the FPU stack:
// fetch the operand from rbp - 2 and perform the division
float tmp;
memcpy(&tmp, (void *)((uintptr_t)rbp-2), sizeof(float));
// also check for division by zero and jump to the exception handler;
// the FPU control register has to be consulted too, see https://wiki.osdev.org/FPU#FPU_control
// etc, etc
....
}
label0x03:
{ // jne 3 : "75fe"
if (eflags.zf == 0)
{
goto label0x03;
}
goto label0x05;
}
...
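The scheme above can be made runnable in a reduced form. This is a sketch under stated assumptions, not a complete translation: only ZF is modelled, the fdiv chunk at 0x02 is left out, and the driver `run_chunks` with its `max_steps` budget is a hypothetical addition so that the self-jumping `jne 3` terminates.

```c
#include <stdint.h>

/* Emulated machine state as globals, as proposed; only ZF is modelled. */
static uint64_t rax, rbx;
static struct { unsigned zf : 1; } eflags;

/* Hypothetical driver: enter the translated bytes 48 39 d8 75 fe at
   `entry` (0, 1 or 3), run at most `max_steps` chunks, and return the
   emulated address where execution stopped. */
uint64_t run_chunks(uint64_t entry, uint64_t a, uint64_t b, int max_steps)
{
    rax = a;
    rbx = b;
    eflags.zf = 0;

    switch (entry) {
    case 0x00: goto label0x00;
    case 0x01: goto label0x01;
    case 0x03: goto label0x03;
    default:   return entry;   /* not a translated instruction start */
    }

label0x00:
    { /* cmp rax, rbx : "4839d8" */
        if (max_steps-- <= 0) return 0x00;
        eflags.zf = (rax == rbx);
        goto label0x03;
    }
label0x01:
    { /* cmp eax, ebx : "39d8" compares the low 32 bits only */
        if (max_steps-- <= 0) return 0x01;
        eflags.zf = ((uint32_t)rax == (uint32_t)rbx);
        goto label0x03;
    }
label0x03:
    { /* jne 3 : "75fe" jumps to itself while ZF is clear */
        if (max_steps-- <= 0) return 0x03;
        if (eflags.zf == 0)
            goto label0x03;
        goto label0x05;
    }
label0x05:
    return 0x05;   /* first address past the translated bytes */
}
```

With equal operands, `run_chunks(0, 5, 5, 10)` leaves at address 5; with different operands the step budget runs out inside the `jne` chunk and the function returns 3.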
Yes, r2's code analysis handles this jump-into-the-middle case.
@radare it's nice, but my idea is very different from radeco or r2dec, and even from Hex-Rays in IDA.
These tools were created to help understand what is actually happening by generating pseudo-C output, which is not meant to be 100% correct and compilable. Hex-Rays, for example, tries to detect local variables on the stack and has logic to detect the calling convention: how arguments are passed and who cleans the stack (caller or callee), etc. If you simply convert every instruction (call, ret) into a chunk of code that pushes the return address onto the stack and jumps to the pointer (call), or pops a pointer from the stack and jumps to it (ret), you don't have to care about calling conventions, function prologue/epilogue detection and the like.
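A rough sketch of the call/ret idea, with every name invented for illustration. The tiny program (call at 0x00, hlt at 0x05, add and ret at 0x10 and 0x14), the `stack_mem` array and the `dispatch` switch are assumptions, not taken from any real binary or tool:

```c
#include <stdint.h>

#define STACK_SLOTS 16

/* Hypothetical machine state for the sketch. */
static uint64_t reg_rax;
static uint64_t stack_mem[STACK_SLOTS];
static unsigned sp_index;            /* emulated rsp as an index, grows down */

/* Runs the illustrative program and returns the emulated rax at hlt. */
uint64_t run_call_ret(uint64_t initial_rax)
{
    uint64_t target = 0;

    reg_rax = initial_rax;
    sp_index = STACK_SLOTS;          /* empty stack */
    goto label0x00;

dispatch:                            /* indirect jump to an emulated address */
    switch (target) {
    case 0x05: goto label0x05;
    case 0x10: goto label0x10;
    default:   return UINT64_MAX;    /* unknown target: give up */
    }

label0x00:                           /* call 0x10 */
    stack_mem[--sp_index] = 0x05;    /* push the return address */
    goto label0x10;

label0x05:                           /* hlt: stop and report rax */
    return reg_rax;

label0x10:                           /* add rax, 1 */
    reg_rax += 1;
    goto label0x14;

label0x14:                           /* ret */
    target = stack_mem[sp_index++];  /* pop the return address */
    goto dispatch;
}
```

No calling convention is ever inferred here: `ret` just jumps to whatever value is on top of the emulated stack. The `switch` could be replaced by a labels-as-values table where that extension is available.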
This can be done with ESIL. Replace all the operations with your own callbacks, add callbacks for when regs are accessed, etc., then demangle the expression into C-like code.
This is what the REIL conversion command aetr does.
Great, but I don't see an asm-to-ESIL implementation for the x86 FPU: https://github.com/radare/radare2/blob/dd84bfe3dee230feb542908870b3a731481eae63/libr/anal/p/anal_x86_cs.c#L432-L438 When will it be available? Are there any plans to implement it?
And what about REIL (OpenREIL)? Maybe it's better to do OpenREIL -> C instead of ESIL -> C. I need to think about how best to implement this.
No practical, non-abandoned tool uses REIL in 2018; the language is old. If you want something more high-level, you can use RadecoIL instead.
REIL is pretty limited, badly designed and abandoned.
This is not such a bad idea, because it will make ESIL expressions more "readable", which is a common complaint from some people, and it would also give us the ability to have a JIT in ESIL, like Ruby does.
cc @condret
For example, here is a byte sequence:
48 B8 01 48 31 C0 48 8D 04 18 EB F7
and we need to make some C code from it (to port it to the ARM arch, for example), so it is possible to make something like this:
see also: https://github.com/frranck/asm2c
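One way such a translation might look, as a hedged sketch rather than the output of any existing tool. Decoded from offset 0, the bytes are a 10-byte `movabs rax, 0x18048D48C0314801` followed by `jmp 0x03`, which lands inside the immediate, where `xor rax, rax` and `lea rax, [rax + rbx]` are hidden. The function `run_overlapped` and its `max_steps` budget are hypothetical (the original code loops forever), and flag updates are not modelled:

```c
#include <stdint.h>

/* Hypothetical driver for bytes 48 B8 01 48 31 C0 48 8D 04 18 EB F7.
   Runs at most `max_steps` instruction chunks starting at offset 0 and
   returns the emulated rax at that point. */
uint64_t run_overlapped(uint64_t rbx_in, int max_steps)
{
    uint64_t r_rax = 0, r_rbx = rbx_in;

    goto label0x00;                       /* entry point */

label0x00:  /* movabs rax, 0x18048D48C0314801 : "48b8014831c0488d0418" */
    if (max_steps-- <= 0) return r_rax;
    r_rax = UINT64_C(0x18048D48C0314801);
    goto label0x0a;

label0x03:  /* xor rax, rax : "4831c0" (inside the immediate above) */
    if (max_steps-- <= 0) return r_rax;
    r_rax ^= r_rax;
    goto label0x06;

label0x06:  /* lea rax, [rax + rbx] : "488d0418" */
    if (max_steps-- <= 0) return r_rax;
    r_rax = r_rax + r_rbx;
    goto label0x0a;

label0x0a:  /* jmp 0x03 : "ebf7" */
    if (max_steps-- <= 0) return r_rax;
    goto label0x03;
}
```

Because each reachable byte offset gets its own label, the overlap between the `movabs` immediate and the hidden instructions needs no special handling. The result is plain C that should compile for ARM or any other target.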