avast / retdec

RetDec is a retargetable machine-code decompiler based on LLVM.
https://retdec.com/
MIT License

Decompilation takes too much memory #13

Open invokr opened 6 years ago

invokr commented 6 years ago

The file I'm decompiling is kind of publicly available, but I'm not sure how to share it with the devs, because the people who wrote the code probably don't want it to be decompiled in the first place.

I'd be happy to send the file to a dev via mail or some other means.

I'm running:

./decompiler.sh <path/to/dll>

which at some point runs:

llvmir2hll -target-hll=c -var-renamer=readable -var-name-gen=fruit -var-name-gen-prefix= -call-info-obtainer=optim -arithm-expr-evaluator=c -validate-module -llvmir2bir-converter=orig -o [...].dll.c [...].dll.c.backend.bc -enable-debug -emit-debug-comments -config-path=[...].dll.c.json

Once it reaches the step below, it slowly starts eating memory until at one point it quickly goes from 8 GB to 32 GB and then gets killed by Docker.

How can I help debug this? My guess is that it's running into some infinitely deep loop.

Running phase: optimizations [normal] ( 52.09s )
 -> running RemoveUselessCastsOptimizer ( 52.09s )
 -> running UnusedGlobalVarOptimizer ( 52.63s )
 -> running DeadLocalAssignOptimizer ( 53.80s )
 -> running SimpleCopyPropagationOptimizer ( 65.16s )
Warning: [NonRecursiveCFGBuilder] there is no node for an edge to `v1_1002017b = (v3_10020166 + 1)` -> skipping this edge
Warning: [NonRecursiveCFGBuilder] there is no node for an edge to `v1_1002031f = (v3_1002030a + 1)` -> skipping this edge
Warning: [NonRecursiveCFGBuilder] there is no node for an edge to `v1_100200c9 = (v3_100200b4 + 1)` -> skipping this edge
Warning: [NonRecursiveCFGBuilder] there is no node for an edge to `v1_10020f22 = (v3_10020f0d + 1)` -> skipping this edge
Warning: [NonRecursiveCFGBuilder] there is no node for an edge to `goto 0x10020eca` -> skipping this edge
Warning: [NonRecursiveCFGBuilder] there is no node for an edge to `eax_global_2d_to_2d_local = v3_10021abe` -> skipping this edge
Warning: [NonRecursiveCFGBuilder] there is no node for an edge to `eax_global_2d_to_2d_local = v3_100224d9` -> skipping this edge
Warning: [NonRecursiveCFGBuilder] there is no node for an edge to `v10_1002391d = v2_1002386d` -> skipping this edge
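
For reference, one way to keep a run like this from taking the whole host down is to cap the container's memory up front, so the process is killed at a predictable limit. A minimal sketch, assuming the decompiler runs inside Docker; the image name and mount paths are placeholders:

docker run --rm -m 8g --memory-swap 8g \
  -v "$PWD":/work my-retdec-image \
  /work/decompiler.sh /work/target.dll
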
tpruzina commented 6 years ago

Can you upload your DLL somewhere? Looks interesting. As a side note, perhaps decompiler.sh could add memory-usage limiting (controlled by an environment variable) somewhere; it should be a three-liner on Linux, though I have no idea about Windows.
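
Something like the following near the top of decompiler.sh would do it on Linux (a sketch; RETDEC_MAX_VMEM_KB is a hypothetical variable name, given in kilobytes because that is what ulimit -v expects):

# Hypothetical: cap this shell's virtual memory if RETDEC_MAX_VMEM_KB is set.
if [ -n "$RETDEC_MAX_VMEM_KB" ]; then
    ulimit -Sv "$RETDEC_MAX_VMEM_KB" || echo "warning: could not set memory limit" >&2
fi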

invokr commented 6 years ago

https://www.file-upload.net/download-12866451/2mbTurnsInto32Gigs.dll.html

Definitely worth looking into in terms of denial of service. I've renamed the original DLL; like I said, this isn't from an open-source project or anything, so I'm not sure about the legality of decompiling it.

s3rvac commented 6 years ago

Thank you for the report. We will investigate it. To limit virtual memory on Linux, you can run e.g. ulimit -Sv 8388608 (8 GB limit) prior to running decompile.sh.
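
For example, in the same shell (the input path is a placeholder):

ulimit -Sv 8388608   # 8 GB virtual-memory cap, inherited by child processes
./decompile.sh path/to/input.dll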

ExtReMLapin commented 6 years ago

fokin hell

humblepride commented 6 years ago

I have the same problem. Is there any fix for this for now?

Manouchehri commented 6 years ago

@s3rvac Do your servers at work have 128 GB of RAM? I'm wondering if this is a larger design issue; formerly an analyst would use RetDec on a remote server via the API, whereas with the open-source release most analysts are running it directly on their own low-powered desktops/laptops.

That being said, somebody (or me, eventually) should probably generate a couple of flame graphs to check whether there are any easy problems to squash.
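
On Linux, one rough way to get a CPU flame graph of the back-end phase (a sketch, assuming perf and Brendan Gregg's FlameGraph scripts are available; the llvmir2hll arguments are the ones decompile.sh prints):

perf record -g -o llvmir2hll.perf.data -- llvmir2hll <args copied from decompile.sh>
perf script -i llvmir2hll.perf.data > out.stacks
./stackcollapse-perf.pl out.stacks | ./flamegraph.pl > llvmir2hll.svg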

invokr commented 6 years ago

From the testing I did yesterday, I can already say that turning off optimizations (-no-opts) prevents this bug from occurring. Since the names of the running optimizers are printed, the bug is most likely in the SimpleCopyPropagationOptimizer.
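
If it really is that pass, a less drastic workaround than disabling all optimizations might be to turn off just the suspect one, e.g. (assuming your version accepts the --backend-disabled-opts option mentioned later in this thread):

decompile.sh path/to/file.dll --backend-disabled-opts=SimpleCopyPropagation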

s3rvac commented 6 years ago

We are aware of the memory/speed issues when decompiling certain binaries. In most cases, they are caused either by invalid instruction decoding (when e.g. data are decompiled as code) or simply by the size of the code that is being decompiled (the larger the binary is, the slower and more memory-intensive its decompilation may be). Many cases are already reported in our original bug tracker, which we want to migrate into GitHub.

We plan to provide more information concerning these performance issues. Please bear with us. There are a lot of things that still need to be done (fixing build issues on different systems, migrating our old bug tracker, wiki pages, documentation, tests, continuous integration, dependency management, a roadmap, etc.). We are doing our best.

Until then, I would just like to point out that in most cases, these issues are NOT caused by a single, particular bug somewhere in the decompiler. Rather, improvements in multiple parts of the decompiler will be needed.

Convery commented 6 years ago

After 20 hours it crashed, having gone through 110 GB of RAM, so yeah, memory requirements =P

It might help narrow it down, though, as the control-flow optimization (~10 GB RAM) and the global-to-local optimization (~40 GB -> crash) seem to take all the time and memory.

Convery commented 6 years ago

Yep, removing those from decompile.sh makes it finish in 90 minutes and 10 GB of RAM.

Aaaaand three and a half hours later...

ghost commented 6 years ago

It breaks at 2 MB? I've got a binary that's 45 megabytes... what are my options? I've tried using --backend-no-opts and it just gets stuck at Running phase: Input binary to LLVM IR decoding ( 1.48s )

It's using about 2.9 GB of RAM, but the timer isn't changing. I'm guessing I don't even get to the part that supposedly uses a lot of RAM. The server has 110 GB of RAM and 12 cores.

s3rvac commented 6 years ago

@mhsjlw:

It breaks at 2 MB?

It depends on the input binary. Many times, even larger binaries can be decompiled without a problem.

I've got a binary that's 45 megabytes... what are my options?

Generally, when a decompilation takes too much time or consumes too much memory, you can perform a selective decompilation, which will decompile only a part of the input file. For example, by running

decompile.sh FILE --select-functions func1,func2 --select-decode-only

the decompiler will decode and decompile only functions named func1 and func2. However, note that this works only when the input executable has symbolic information in it (= function names). If it does not, you will have to specify address ranges to be decompiled via e.g.

decompile.sh FILE --select-ranges 0x404000-0x405000 --select-decode-only

This will decompile everything between addresses 0x404000 and 0x405000.

I've tried using --backend-no-opts and it just gets stuck at Running phase: Input binary to LLVM IR decoding ( 1.48s )

This parameter only applies to the back-end phase (llvmir2hll). The phase you have mentioned is from the front-end part (bin2llvmir), and the decoding phase is mandatory: without decoding, the decompilation cannot go any further.

ghost commented 6 years ago

Ok, I'll give the selective decompilation a try, thanks.

kiritowch commented 6 years ago

I think you can try some selective decompilation... I've had the same problem. I'll report back when I try it.

Redict commented 6 years ago

Same problem with another DLL. On Linux it eats 16 GB of RAM.

MerovingianByte commented 6 years ago

Maybe the code could be refactored to use binary trees or some other memory-efficient data structure.

waldi commented 6 years ago

Same for me (see the attached screenshot).

My executable file is 16.4 MB.

PeterMatula commented 6 years ago

A few big files that we could test, and eventually handle.

MerovingianByte commented 6 years ago

@PeterMatula, you mean before this issue was opened or after?

PeterMatula commented 6 years ago

@MerovingianByte These are files from our internal issue-tracking system that I'm currently migrating here to GitHub. So they are from before this issue. The cause is mostly the same. We should eventually handle all the files from this issue -- both those I just added, and those from other users.

MerovingianByte commented 6 years ago

@PeterMatula I see. How much memory do those big files need?

PeterMatula commented 6 years ago

@MerovingianByte No idea :D At the time of reporting the original issue, enough to take down the decompilation process -- or even the whole system, if per-process limits are not set. Thanks to changes in the decompiler, some of them might be better now, but all of them should be checked anyway.

MerovingianByte commented 6 years ago

@PeterMatula I'm really looking forward to the fixing of all the memory issues. I can't congratulate RetDec enough. This project is just too good. The decompilation is more refined than Hex-Rays'. There's finally a viable, good, free alternative. Of course, there were other free alternatives before, like Snowman, but they were terrible. Is there an easy way to donate? If I were rich I'd be pouring money into this project. Like, shut up and take my money lol. But I think I can at least buy you a coffee.

eMVe commented 6 years ago

Same situation here. Decompiling a 9 MB exe ate all of my 64 GB of RAM... First DeadLocalAssignOptimizer said warning: out of memory: trying to recover, then SimpleCopyPropagationOptimizer crashed Windows... I don't have a swap file, as my configuration is a 60 GB SSD and 64 GB of RAM. Update: --backend-no-opts helped. I think I need to disable DeadLocalAssignOptimizer, as this is the "hungry guy" :), at least in my case... Update: there are many more hungry guys than this one...

phrohdoh commented 6 years ago

While this is indeed unfortunate, and I'm sure the RetDec team is looking into it, we users must deal with what we have available.

While analyzing the binary you feed it, RetDec creates a JSON document that describes the assembly (and the functions, which are what we are mostly after). We can get function offsets and names from this.

To that end I have written a Python 3 script that makes this somewhat easier (albeit still a bit tedious; I'm open to improvement suggestions).

#!/usr/bin/python3

import sys
import json

try:
    path_to_decompiler_sh = sys.argv[1]
    path_to_file_to_decompile = sys.argv[2]
    path_to_retdec_json = sys.argv[3]
except IndexError:
    print("Usage: the-script.py path/to/decompiler.sh path/to/file.exe path/to/retdec.json")
    sys.exit(1)

with open(path_to_retdec_json, 'r') as f:
    obj = json.load(f)
    for fn in obj['functions']:
        n = fn['name']
        # Only handle auto-named functions; skip functions with real symbol names.
        if not n.startswith("function_"):
            continue

        # Print a selective-decompilation command covering this function's address range.
        s = hex(fn['startAddr'])
        e = hex(fn['endAddr'])
        cmd = "{} {} --select-ranges {}-{} --select-decode-only -o {}.fn" \
            .format(path_to_decompiler_sh,
                    path_to_file_to_decompile,
                    s,
                    e,
                    n)

        print(cmd)

Running the script:

$ ./the-script.py ~/bin/decompiler.sh path/to/my.exe /path/to/retdec_output.json

will produce something similar to the following:

/home/thill/bin/decompiler.sh path/to/my.exe --select-ranges 0x5350e9-0x5350e9 --select-decode-only -o function_5350e9.fn
/home/thill/bin/decompiler.sh path/to/my.exe --select-ranges 0x53fbd8-0x53fbd8 --select-decode-only -o function_53fbd8.fn

I can then run these commands directly.

You can of course modify the script to use subprocess/popen.
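
For instance, with import subprocess added at the top of the script, the two statements inside the loop that build and print cmd could be replaced with a direct call (a sketch reusing the variables from the script above):

# Inside the loop: run each selective decompilation instead of printing the command.
subprocess.run([path_to_decompiler_sh,
                path_to_file_to_decompile,
                "--select-ranges", "{}-{}".format(s, e),
                "--select-decode-only",
                "-o", n + ".fn"],
               check=True)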

Hopefully this helps someone.

0xBEEEF commented 6 years ago

I would also like to add a little more. When optimizing in the back end, I mainly run into problems with the SimpleCopyPropagationOptimizer. If I exclude it, I have no problems with most examples and tests, so there seems to be a bug there.

In my case, the memory problems occur primarily in the back end. I can hardly use the optimizations there, because they crash again and again. What I see as a problem is that if one optimization fails, the whole process is aborted immediately, even though only the current optimization failed. One could continue with the other optimizations, or at least output the existing intermediate result, but instead there is always a hard abort. This is sometimes really frustrating. It is about time the error handling was improved here.
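
The idea, roughly (a hypothetical Python sketch of the suggested error handling, not RetDec's actual C++ optimizer API): run each back-end optimizer in isolation and keep the last good intermediate result instead of aborting the whole pipeline.

def run_pipeline(module, optimizers):
    # Hypothetical illustration only; llvmir2hll's real pipeline is C++.
    last_good = module
    for opt in optimizers:
        try:
            candidate = opt.run(last_good)  # may blow up (e.g. out of memory)
        except Exception as err:
            print("warning: {} failed ({}); skipping it".format(opt.name, err))
            continue  # carry on with the remaining optimizers
        last_good = candidate  # accept the result only if the pass succeeded
    return last_good  # always emit the best intermediate result we have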

rkalz commented 6 years ago

@0xBEEEF What flag did you use/remove to disable that step?

@Phrohdoh Is there any way to combine those function files?

PeterMatula commented 6 years ago

The sample from #97 is also taking too much time and memory in llvmir2hll.

MPeti1 commented 6 years ago

If I have a new file that is also using too much memory, should I open a new issue for it?

PeterMatula commented 6 years ago

No, just zip it and upload it here if you want to share it for testing.

MPeti1 commented 6 years ago

OK, thanks, here is the link. Please note that this file is somewhat like the OP's file in terms of availability, and that it's an Android app's library working as a native Activity.

twifty commented 6 years ago

I had a similar problem with a 1.2 MB (not packed) file. DeadLocalAssignOptimizer issued "warning: out of memory: trying to recover" in a seemingly endless loop, though only 17 GB of a total 32 GB of RAM was actually used.

After adding the --backend-no-opts arg, the whole process completed in seconds.

denji commented 5 years ago

retdec-decompiler.py \
 --backend-disabled-opts=DeadCode,DeadLocalAssign,CopyPropagation,SimpleCopyPropagation \
 ...

Mortrus commented 5 years ago

I have the same problem. I'm trying to decompile some.dll (11 MB), and now I've been stuck for 20 hours at this step with enormous memory consumption (see the attached screenshot).

Decompilation settings are all default; I just set no memory limit. -_-

sooxt98 commented 4 years ago

Same here (see the attached screenshot): it just stops somewhere, using half of my RAM. I'm using 4.0.

rmast commented 3 years ago

I finally decided to rent an Azure server to be able to allow as much optimization as possible using virtual memory. I selected the default batch-runner Windows 2016 server image on a machine with 2 cores, 64 GB RAM, and a 128 GB SSD + 128 GB temp disk. I added another 400 GB SSD, partitioned, formatted, and assigned it a drive letter, but that appeared not to be necessary, as the decompilation so far hasn't taken more than about 100 GB of committed memory in the task overview.

I set the swap file on those disks outside C: to maximum size; however, enabling system-managed swap on enough disks would already allow a swap of 3 × RAM = 192 GB.

I allowed IE usage, installed Python 3.7.3 on the server, ran retdec-decompiler.py with the --no-memory-limit option, disabled the automatic shutdown of the server at 7 PM UTC, turned off the Windows Update that rebooted my system around 6 AM, and even installed Application Insights to follow the committed memory usage from outside the server without having to RDP into it.

At the moment, the decompile process on the server is finally surviving the night.

SamDecrock commented 1 year ago

For reference, I once tried to decompile a 30 MB binary. It ended up using 800 GB of RAM 😮
