cmu-sei / pharos

Automated static analysis tools for binary programs

Partition stuck at 94%, seems to not be using available memory #243

Closed: jsrolon closed this issue 1 year ago

jsrolon commented 2 years ago

Hello! I am currently following the step-by-step guide, starting with the partitioning step. This is what I ran:

partition --serialize=v2g.ser --maximum-memory=700000 --per-function-maximum-memory=350000 --threads=36 --no-semantics v2g.exe

I am running on a pretty decent server, and the memory and CPU parameters match the machine. The v2g.exe is 12 MB as reported by ls. I'm currently seeing this:

[screenshot: partitioning progress output, stuck at 94%]

It's been stuck at 94% for around 1 hour. The time it took to output the Partitioned 7410971 bytes... line was about 3 minutes, so I was expecting this to finish faster. This is what I see with docker ps:

[screenshot: container stats showing memory usage capped at around 16 GiB]

It seems as if there is a memory limit of 16 GiB set somewhere. You can see that there is 700+ GiB of memory available, and I passed a much larger limit on the partition call as well. Is this expected?
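(For reference, one way to check whether a per-container memory cap is actually in effect; the container name below is just a placeholder:)

```sh
# Hard memory limit configured on the container, in bytes (0 means no limit).
docker inspect --format '{{.HostConfig.Memory}}' <container-name>

# Live usage versus the limit Docker is enforcing (the MEM USAGE / LIMIT column).
docker stats --no-stream <container-name>
```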

sei-eschwartz commented 2 years ago

I do not believe that partitioning is usually that memory intensive, so you are probably seeing slow code rather than memory constrained code. But @sei-ccohen can weigh in more authoritatively there.

If your file is not malware, you can try --partitioner=rose. Based on your output, something in Pharos' custom partitioning is very slow for some reason, but ROSE partitioning is relatively fast.
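For reference, a sketch of what that invocation could look like, just reusing the options from your original command with the ROSE partitioner selected:

```sh
partition --partitioner=rose --serialize=v2g.ser --no-semantics \
          --maximum-memory=700000 --threads=36 v2g.exe
```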

sei-ccohen commented 2 years ago

When the partitioner gets "stuck" near the end of the partitioning like this, it's an unfortunate Big-O performance problem. The details are complicated, but in short the code is CPU bound and is still making progress, so start by being very patient. It's not a truly endless loop, but it can run for a very long time. In my experience, the problem is caused by executables that have the pattern: data, code, data, code, data, code. Each time our algorithm shifts from data to code and back again, there's a very expensive call to a ROSE partitioning function that is proportional to the size of the entire program, which is what produces the unacceptable performance, and there's no easy solution to it without some changes to ROSE.

This problem is 100% caused by the Pharos partitioning algorithm calling that ROSE function in an unintended way, so switching to the ROSE partitioner (--partitioner=rose) will absolutely make the performance problem go away. However, it will also fail to detect all of the functions in the code, data, code, data section of the program. If it's the case that this is not really code (or at least not important code) then using the ROSE partitioner is fine, but if the code is real and important you really have no choice but to wait for the algorithm to complete.

I've seen this 90-100% step take as long as a couple of hours. :-(

sei-ccohen commented 2 years ago

I should add that it's normal for this step to consume no additional memory. At some abstract level, the ROSE function that is consuming all of the CPU is just re-analyzing the entire program (and confirming that nothing has changed) each time we call it. So it uses lots of CPU and consumes no additional memory on each call.

jsrolon commented 2 years ago

Thank you! This is not malware, so I went ahead with rose:

[screenshot: partitioning completed with the ROSE partitioner]

This is good. I continued with step 2, running ooanalyzer, and had a few errors, but I don't think they were anything to be alarmed about. I verified the facts output, and it has a decent number of facts of the kinds listed in the step-by-step guide.

I ran step 3 afterwards, but I decided to kill it after it had been running for 24 hours. It was still writing things to the log, but I didn't see anything indicating that it was progressing. Is this normal as well?
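(For anyone following along, steps 2 and 3 of the guide are roughly of this shape; the exact flag names below are assumptions from my reading of the step-by-step guide and may differ between Pharos versions, so check the guide for the exact spelling:)

```sh
# Step 2: generate Prolog facts with ooanalyzer (flag names assumed from the guide).
ooanalyzer --serialize=v2g.ser --prolog-facts=v2g-facts.pl v2g.exe

# Step 3: run the Prolog reasoning separately on those facts (ooprolog flags also assumed).
ooprolog --facts v2g-facts.pl --results v2g-results.pl --json v2g.json
```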

sei-eschwartz commented 2 years ago

Was it still writing to the log? If so, it was probably still making progress. 24 hours is a long time, but 12 MiB is a very large executable. I've seen one or two executables where the prolog stage takes >24 hours. If you share the facts file, or the last ~2000 lines of the prolog log file, I can comment further.
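(Something like this is enough to capture that; the log filename is just whatever you redirected the step-3 output to:)

```sh
# Grab the last ~2000 lines of the Prolog log for sharing (filename is illustrative).
tail -n 2000 v2g-prolog.log > v2g-prolog-tail.log
```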

sei-ccohen commented 2 years ago

It might be fine that you reran this with --partitioner=rose. It certainly allowed you to move on to seeing whether there were other OOAnalyzer problems or not. Completing in slightly longer than it took to "get stuck" was expected. Unfortunately, the --partitioner=rose issue isn't really about whether it's malware or not. That question really determines whether you can use --no-semantics or not (which also speeds up partitioning). The default OOAnalyzer options include --semantics, which does a better job of recovering obfuscated control flow, and that's very important for some malware. If it's not malware, using --no-semantics can make partitioning quite a bit faster, and will probably have no effect on the result.

As for --partitioner=rose, fixing the horrible performance is not without consequences. :-( The most recent run made code and data from 7,719,350 bytes. The previous run (when you took your screenshot) had made code and data from 8,411,022 bytes, and was presumably still advancing occasionally. What was in the additional ~700K? I don't know, but it might be worth investigating. If you have your serialization file, it should be reasonably quick to run the dumpmasm tool and see which functions were found (and which were not). That should help clear up which memory region the partitioner was struggling with (which region still has no functions). It might be that Pharos was doing something stupid, and you're happy to be rid of those incorrect functions. Or it might be that it was slogging through making additional OO functions that weren't obviously found by analyzing the control flow. It's difficult to know without some more investigation.
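(Roughly like this; the output filename is just an example, and it assumes dumpmasm accepts the shared --serialize option to reuse the saved partitioning:)

```sh
# Reuse the saved partitioning rather than repartitioning from scratch
# (assumes dumpmasm takes the shared --serialize option; output name is an example).
dumpmasm --serialize=v2g.ser v2g.exe > v2g-disassembly.csv
```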

Maxxxel commented 2 years ago

I have an .exe of 26 MB size... is it impossible for me to analyze? I have 32 GB of memory and 2 TB of storage. When I run it, it goes to 16 GB of RAM and then crashes with "Killed". Can't I use some of my HDD for the tool, even though it's slower than RAM?

sei-eschwartz commented 2 years ago

@Maxxxel You can try to use virtual memory. "Killed" means that your kernel is killing the process because it is running out of memory. If you are running in Docker, make sure that you don't limit the amount of RAM available to the containers.
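(A quick sanity check on how much memory the daemon can actually hand to containers:)

```sh
# Total memory the Docker daemon sees; under WSL2 this is capped by the WSL VM size.
docker info | grep -i 'total memory'

# Memory visible inside the WSL distro itself.
free -h
```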

Maxxxel commented 2 years ago

I'm using a .wslconfig file. How do I use virtual memory?

Edit: I've set the memory to a value bigger than my available RAM now, and the tool uses 99.99% of my RAM, but at least it runs! Edit #2: Now it got killed again, out of memory, at 49%. ROSE took 120 seconds.

sei-eschwartz commented 2 years ago

This sounds like a WSL configuration problem. Maybe open a new issue and include the command line you are running.

YamamotoKaderate commented 1 year ago

What worked for me in the end for .wslconfig:

- swap=245GB
- swapFile=H:\temp\wsl-swap.vhdx (sets the swap file location; the default is %USERPROFILE%\AppData\Local\Temp\swap.vhdx). Put the swap on an SSD if you want decent performance.
- kernelCommandLine = cgroup_no_v1=all (disables cgroups v1, see https://stackoverflow.com/questions/73021599/how-to-enable-cgroup-v2-in-wsl2)
- There is also a suggestion in https://github.com/moby/moby/issues/4250 to add cgroup_enable=memory swapaccount=1 (or cgroup_disable=memory) to the kernel command line.

You must restart the WSL service or your computer afterwards. A consolidated sketch is below.
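Pulling those settings together, a minimal .wslconfig sketch (it lives at %USERPROFILE%\.wslconfig on the Windows side; the sizes and the swap-file path are just the values from this thread, so adjust them to your machine):

```ini
[wsl2]
# Optional: raise the RAM cap for the WSL VM (value here is illustrative).
memory=28GB
# Large swap file so analysis can spill to disk instead of being killed.
swap=245GB
# Keep the swap file on an SSD; note the escaped backslashes.
swapFile=H:\\temp\\wsl-swap.vhdx
# Disable cgroups v1 (per the StackOverflow link above).
kernelCommandLine = cgroup_no_v1=all
```

After editing it, run wsl --shutdown from a Windows prompt (or reboot) so the VM picks up the new settings.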

Maxxxel commented 1 year ago

Thanks, I'll try that. I've only had SSDs for years already 😁

sei-eschwartz commented 1 year ago

I believe this is resolved.

Maxxxel commented 1 year ago

Not resolved, but discontinued. I found out that there have been issues with this file before (Battleforge.exe), and I found a file made by you guys in those issues, so I took that 😁