avast / retdec

RetDec is a retargetable machine-code decompiler based on LLVM.
https://retdec.com/
MIT License

Function arguments wrong #583

Open jpmorrison opened 5 years ago

jpmorrison commented 5 years ago

Am I doing something wrong here? I get a file.exe.c, but the function arguments look wrong: they are all the same. I've tried retdec-v3.3-windows-64b.zip, retdec-v3.3-windows-32b.zip and retdec-v3.3-ubuntu-64b.zip, all with the same results.

I run the following, which completes with a couple of warnings about a break outside of a loop or a switch statement.

py -3 retdec-decompiler.py file.exe

Here's some output: all function arguments are &g16

    int32_t v36 = function_402bb0((int32_t)&g16, (int32_t)&g16, (int32_t)&g16); // 0x4032e9
    if (v36 == 0) {
        // 0x403317
        *(int32_t *)(g9 - 4) = *(int32_t *)(g9 + 56);
        *(int32_t *)(g9 - 8) = *(int32_t *)(g9 + 52);
        *(int32_t *)(g9 - 12) = *(int32_t *)(g9 + 48);
        *(int32_t *)(g9 - 16) = *(int32_t *)(g9 + 44);
        *(int32_t *)(g9 - 20) = *(int32_t *)(g9 + 40);
        *(int32_t *)(g9 - 24) = (int32_t)"activation-key 0x%08x 0x%08x 0x%08x 0x%08x 0x%08x";
        *(int32_t *)(g9 - 28) = g9 + 60;
        wsprintfA((char *)&g16, (char *)&g16);
        *(int32_t *)(g9 + 24) = g9 + 88;
        *(int32_t *)(g9 + 20) = 0;
        *(int32_t *)(g9 + 16) = g6;
        SetDlgItemTextA(&g16, (int32_t)&g16, (char *)&g16);

Here's some of the disassembly after function_402bb0() and before wsprintfA():

    call    MD5_sub_402BB0          ; Get ready to print key
    test    eax, eax
    jz      short loc_403317
    push    30h                     ; uType
    push    offset Caption          ; "ERROR"
    push    offset Text             ; "Invalid combination of paramaters"
    push    edi                     ; hWnd
    call    MessageBoxA
    pop     edi
    pop     esi
    pop     ebp
    mov     eax, 1
    pop     ebx
    add     esp, 0D0h
    retn    10h
    ; ---------------------------------------------------------------------------

    loc_403317:                     ; CODE XREF: DialogFunc+5B0↑j
    mov     edx, [esp+0E0h+var_A8]
    mov     eax, [esp+0E0h+var_AC]
    mov     ecx, [esp+0E0h+var_B0]
    push    edx                     ; part 5?
    mov     edx, [esp+0E4h+var_B4]
    push    eax                     ; part 4?
    mov     eax, [esp+0E8h+var_B8]
    push    ecx                     ; part 3
    push    edx                     ; part 2
    push    eax                     ; part 1
    lea     ecx, [esp+0F4h+String]
    push    offset aActivationKey0  ; "activation-key 0x%08x 0x%08x 0x%08x 0x%"...
    push    ecx                     ; LPSTR
    call    wsprintfA               ; Print the key

s3rvac commented 5 years ago

@PeterMatula Can you please take a look at this?

PeterMatula commented 5 years ago

@xkubov will take a look.

xkubov commented 5 years ago

Hi, from the disassembly you have provided I cannot deduce why RetDec would output such C code.

Are you able to provide the binary file which you tried to decompile?

jpmorrison commented 5 years ago

Can I send the file privately? Also, the file is flagged 34/66 on VirusTotal, but I think that's a false positive.

xkubov commented 5 years ago

Yes, of course. You can send it to my email address xkubov@gmail.com.

xkubov commented 5 years ago

I received your email and was able to reproduce your issue.

The invalid calls to wsprintfA and function_402bb0 were generated because the function optimization was unable to find the proper arguments of these calls. By analyzing its definition, the optimization found that function_402bb0 has 4 parameters. However, when analyzing each call of this function, it was unable to find the stack variables that should be used for passing the arguments. This is why it used a dummy register, later named g16, to signal that the call should have 4 arguments and that these arguments were not properly found.
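The fallback described above can be illustrated with a small sketch. This is not RetDec's code: `resolve_call_args`, `DUMMY`, and the list-of-stores representation are hypothetical names made up for illustration. It assumes only that the analysis knows the expected parameter count and has a list of the stack stores it recognized before the call.

```python
DUMMY = "&g16"  # hypothetical placeholder, mirroring the name seen in the output


def resolve_call_args(param_count, recognized_stores):
    """Pick one argument per parameter from stack stores seen before a call.

    recognized_stores holds the stored values in program order (oldest first).
    With stack-based passing the last value pushed is the first argument,
    so the stores are consumed newest first. Any parameter without a
    matching store gets the placeholder instead.
    """
    newest_first = list(reversed(recognized_stores))
    args = []
    for i in range(param_count):
        if i < len(newest_first):
            args.append(newest_first[i])
        else:
            args.append(DUMMY)
    return args


# Every push was recognized: the arguments are recovered.
print(resolve_call_args(3, ["n", "src", "dest"]))
# No push was recognized: every argument becomes the placeholder.
print(resolve_call_args(3, []))
```

In this toy model, an output where every argument is the same placeholder corresponds to the second case: the parameter count is known, but no stack stores were available to fill it.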

Why weren't these arguments found?

I analyzed the LLVM IR input for this analysis, and it turned out that stack analysis did not recognize a lot of stack variables in this input file. Before the call of function_402bb0 there seems to be an attempt to put the value of register ECX on the stack, but this attempt was not detected by stack analysis.

The following sequence should generate a new stack variable:

  %2248 = load i32, i32* @ecx
  %2249 = load i32, i32* @esp
  %2250 = sub i32 %2249, 4
  %2251 = inttoptr i32 %2250 to i32*
  store i32 %2248, i32* %2251
  store i32 %2250, i32* @esp
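
Read instruction by instruction, this sequence has the semantics of an x86 `push ecx`: decrement ESP by 4 and store the value at the new top of the stack. A minimal sketch that replays it on a toy register file follows; the function name, the dictionary-based memory, and the concrete register values are assumptions chosen only for illustration.

```python
def run_push_sequence(regs, memory):
    """Replay the IR sequence above on a toy register file and memory."""
    v2248 = regs["ecx"]                 # %2248 = load i32, i32* @ecx
    v2249 = regs["esp"]                 # %2249 = load i32, i32* @esp
    v2250 = (v2249 - 4) & 0xFFFFFFFF    # %2250 = sub i32 %2249, 4
    memory[v2250] = v2248               # store i32 %2248, i32* %2251
    regs["esp"] = v2250                 # store i32 %2250, i32* @esp
    return v2250                        # address of the new stack slot


# Arbitrary example values, not taken from the binary.
regs = {"ecx": 0x1234, "esp": 0x0012FF00}
memory = {}
slot = run_push_sequence(regs, memory)
print(hex(slot), hex(memory[slot]), hex(regs["esp"]))
```

The slot written at the old ESP minus 4 is the stack variable that the analysis is expected to create for this store.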

twitzelbos commented 5 years ago

I have a similar issue with decompiling code that was compiled with MSVC++ 6.0. The decompiled code looks something like this:

                        int32_t v35 = *(int32_t *)(v6 + 60) / 2; // 0x40123d
                        g2 = v35;
                        *(int32_t *)(v14 - 4) = v35;
                        g7 = v14 + 3668;
                        *(int32_t *)(v14 - 8) = v6 + 64;
                        *(int32_t *)(v14 - 12) = g7;
                        int16_t * v36 = wcsncpy((int16_t *)&g80, (int16_t *)&g80, (int32_t)&g80); // 0x401249

The function arguments look wrong to me. The corresponding disassembly section is:

0x40121e:   8b 6c 24 20                         mov ebp, dword ptr [esp + 0x20]
0x401222:   8b 44 24 10                         mov eax, dword ptr [esp + 0x10]
0x401226:   89 6c 24 18                         mov dword ptr [esp + 0x18], ebp
0x40122a:   8d 14 80                            lea edx, [eax + eax*4]
0x40122d:   c1 e2 04                            shl edx, 4
0x401230:   8d 9c 14 54 10 00 00                lea ebx, [esp + edx + 0x1054]
0x401237:   8b 45 3c                            mov eax, dword ptr [ebp + 0x3c]
0x40123a:   8d 4d 40                            lea ecx, [ebp + 0x40]
0x40123d:   d1 e8                               shr eax, 1
0x40123f:   50                                  push eax
0x401240:   8d 94 24 58 0e 00 00                lea edx, [esp + 0xe58]
0x401247:   51                                  push ecx
0x401248:   52                                  push edx
0x401249:   ff 15 30 61 40 00                   call dword ptr [0x406130] <wcsncpy>

Is this the expected output, or is there something amiss?

Thank you for your help.
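
For reference, here is how the three pushes would map onto the wcsncpy parameters when read by hand, assuming the usual right-to-left stack passing (the last value pushed is the first argument). The helper name and the textual operand descriptions are made up for this sketch.

```python
def args_from_pushes(pushes, param_names):
    """Map pushed operands (in push order) to parameters (in declaration order)."""
    # Right-to-left passing: the last push is the first parameter.
    return dict(zip(param_names, reversed(pushes)))


# Operands in the order they are pushed at 0x40123f, 0x401247 and 0x401248.
pushes = [
    "eax = [ebp + 0x3c] >> 1",
    "ecx = ebp + 0x40",
    "edx = esp + 0xe58",
]

args = args_from_pushes(pushes, ["dest", "src", "n"])
for name, value in args.items():
    print(f"{name}: {value}")
```

Under that assumption, the values stored to v14 - 12, v14 - 8 and v14 - 4 in the decompiled code are the dest, src and n arguments respectively, which the generated call does not use.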

xkubov commented 5 years ago

Judging by the code snippets you have provided, this looks like the same issue, but I cannot be sure without trying to decompile the file.

It seems that the optimization responsible for analyzing function arguments was given the types and names for the standard function wcsncpy, but when it analyzed the code before the call, it was unable to find the stack variables used to pass the arguments, so it used a dummy register instead. If this issue is related, then the error is on the side of stack analysis, which was unable to create the required stack variables.

If you can provide the binary file you have tried to decompile then it might help with detection of the true source of the problem.

twitzelbos commented 5 years ago

I would like to dig into this myself a little more as well; it might be a good exercise for learning how the toolchain works. I haven't found detailed documentation for each of the steps yet. I have an older development system that can still compile code the same way this example was compiled. I'm curious: you say it cannot find the correct stack variables, but there are literally three push instructions right before the call. Are you saying those are optimized away somewhere? Where does that happen? Also, when I look into the JSON file, should it really say that the callingConvention is unknown? It's pretty standard, no?

        {
            "callingConvention" : "unknown",
            "declarationStr" : "wchar_t * wcsncpy(wchar_t * restrict dest, const wchar_t * restrict src, size_t n);",
            "endAddr" : "0x406131",
            "fncType" : "dynamicallyLinked",
            "name" : "wcsncpy",
            "parameters" : 
            [
                {
                    "isFromDebug" : true,
                    "name" : "dest",
                    "realName" : "dest",
                    "type" : 
                    {
                        "llvmIr" : "i16*"
                    }
                },
                {
                    "isFromDebug" : true,
                    "name" : "src",
                    "realName" : "src",
                    "type" : 
                    {
                        "llvmIr" : "i16*"
                    }
                },
                {
                    "isFromDebug" : true,
                    "name" : "n",
                    "realName" : "n",
                    "type" : 
                    {
                        "llvmIr" : "i32"
                    }
                }
            ],
            "returnType" : 
            {
                "llvmIr" : "i32"
            },
            "startAddr" : "0x406130",
            "usedCryptoConstants" : []
        },
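
To check how many entries are affected, a short script can list every function whose callingConvention is "unknown". The sketch below runs on a trimmed-down sample rather than the real file: the top-level "functions" key is an assumption about the surrounding layout, and the second entry (function_401000, cdecl) is invented purely so the filter has something to exclude.

```python
import json

# Trimmed sample shaped like the entry above. The "functions" wrapper and the
# second entry are assumptions for illustration, not copied from real output.
SAMPLE = """
{
    "functions": [
        {"callingConvention": "unknown", "fncType": "dynamicallyLinked", "name": "wcsncpy"},
        {"callingConvention": "cdecl", "fncType": "userDefined", "name": "function_401000"}
    ]
}
"""


def unknown_convention(config):
    """Return the names of functions whose calling convention is 'unknown'."""
    return [
        fnc["name"]
        for fnc in config.get("functions", [])
        if fnc.get("callingConvention") == "unknown"
    ]


config = json.loads(SAMPLE)
print(unknown_convention(config))
```

For a real file, replace `json.loads(SAMPLE)` with `json.load` on the opened config file and adjust the key names to whatever the file actually uses.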

twitzelbos commented 5 years ago

Okay, I have a simple example here that has a ton of these errors on decompile ...

bios_serial_number.zip