glmcdona / Process-Dump

Windows tool for dumping malware PE files from memory back to disk for analysis.
http://split-code.com/processdump.html
MIT License

Walkthrough for restoring OEP and IAT for dumped executables? #15

Open TAbdiukov opened 5 years ago

TAbdiukov commented 5 years ago

Hello,

I'm trying to dump a packed executable and, among other things, I end up with the OEP set to 0x00000000 and a messed-up IAT. I currently do the following:

  1. Close all apps
  2. pd -db genquick
  3. Run my target
  4. pd -pid <pid>

The dumper does the best it can, sure, but is there a way to restore the OEP (so I can run the executable) and the IAT (so it runs anywhere other than the VM)? Thanks heaps <3

One suggestion I came up with, inspired by https://reverseengineering.stackexchange.com/a/11272: since the dump stores the IAT as it was at runtime, I can either find the string representation of the imports in the dump (if present, which is always true in my case) or listen to the program's API calls. Either way, I do not get how I can translate the API call names to their static addresses. Any help will be appreciated.

biggestsonicfan commented 4 years ago

Any update to this issue?

TAbdiukov commented 4 years ago

@cglmrfreeman also interested

glmcdona commented 4 years ago

The current behavior varies depending on the type of resource being dumped.

There are two main memory dump scenarios supported by ProcessDump:

  1. In-memory PE files
  2. Loose executable regions in memory that have no PE header

Given that you observed an OEP of 0x00000000, I suspect ProcessDump may be dumping an in-memory PE file where the packer has deliberately wiped the entry point.

It handles these two cases a bit differently:

OEP and dumping of in-memory PE files

If it's an in-memory PE module being dumped, ProcessDump keeps the entry point specified in the in-memory PE header. This is usually right, but with packed files it may no longer be the correct entry point, and in some cases the packer may deliberately wipe it.

Reconstructing this can be challenging guess-work, depending on what the executable code looks like. Unwinding the threads might work sometimes, but only if the entry thread is still active. Otherwise, I can't think of a great way to determine it without manual research work per file, but I am open to suggestions.

OEP and dumping of loose executable regions in memory

With a loose executable region in memory, ProcessDump creates its own PE header and import table so that the region can be analyzed. Unfortunately there isn't a great way to reconstruct where the original entry point into it might have been, so it just sets the entrypoint to the very start of the region, which is sometimes right.

IAT reconstruction

The IAT reconstruction (or construction, in the case of loose executable regions) method in ProcessDump is a stronger, more aggressive approach than the referenced Stack Exchange one. I've found it to be quite successful. If you have an example malware file, could you share its SHA-256 so I can have a look at why it didn't build the IAT correctly?

Here's how it works:

  1. At dump time it looks at all loaded modules in the process, enumerating all the export addresses in the process. Call these ExportAddresses[].
  2. When dumping a code region or PE file from memory, it:
     a. Finds all possible references everywhere in the dumped module to these ExportAddresses[] by a raw search for any dword/qword matches. All of these matches throughout the whole process are then used to create the new IAT.
     b. Increases the size of the last section in the PE file being dumped.
     c. Creates a new IAT that links all the scattered dwords/qwords to their respective Library+ExportNames.
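
The raw search step described above can be sketched roughly as follows. This is an illustration only, not ProcessDump's actual code (the tool is written in C++): the function name `find_export_references`, the sample addresses, and the choice to test every byte offset rather than only aligned ones are all assumptions made for the example.

```python
import struct

def find_export_references(image, export_addresses, pointer_size=4):
    """Return (offset, "Library!ExportName") for every pointer-sized value
    in `image` that equals a known export address.

    `export_addresses` maps absolute address -> name and plays the role of
    ExportAddresses[]. Hypothetical helper, not part of ProcessDump.
    """
    fmt = "<I" if pointer_size == 4 else "<Q"  # dword or qword, little-endian
    matches = []
    # Raw search: test every byte offset (assumption: unaligned hits count too).
    for offset in range(len(image) - pointer_size + 1):
        (value,) = struct.unpack_from(fmt, image, offset)
        name = export_addresses.get(value)
        if name is not None:
            matches.append((offset, name))
    return matches

# Made-up addresses standing in for exports enumerated at dump time.
exports = {
    0x76A41230: "kernel32.dll!CreateFileW",
    0x76A45670: "kernel32.dll!ReadFile",
}
# A fake dumped region holding two stored pointers among filler bytes.
image = (b"\x90" * 3 + struct.pack("<I", 0x76A41230)
         + b"\xCC" * 5 + struct.pack("<I", 0x76A45670))
refs = find_export_references(image, exports)
```

Each `(offset, name)` pair is a location that a rebuilt import table would need to point at; actually appending the table to the last section is not shown here.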

The advantage of this is that it not only corrects the IAT, but also fixes up any reference to a library export anywhere in the code. For example, custom code that resolves library addresses itself and saves them to global variables gets those references fixed up as well.

RE: is there a way to restore the OEP (so I can run the executable) and IAT (run anywhere else aside from the VM)

Generally, a memory dump from a running process will often not run successfully. Consider an example like this:

```c
#include <stddef.h>

void *myFileHandle = NULL;  /* global; its runtime value is saved in the dump */

void myFunction(void) {
    if (myFileHandle == NULL) {
        myFileHandle = /* ... create a real handle to a file mapping or something ... */;
    }
    /* ... now use myFileHandle ... */
}
```

In the memory dump, the value of myFileHandle is saved from the running process. When you try to run the executable again, the process will treat it as a valid handle when in fact it no longer is. There are a lot of challenges like this, which mean that most of the time you can't cleanly re-execute a component dumped from memory. It is great for static analysis though :)

Hope this helps! I'm open to suggestions, and especially pull requests as well.

biggestsonicfan commented 4 years ago

Here's how I see it, and feel free to correct me where I'm wrong, because I most certainly might be:

The program takes a packed executable, and correctly creates an unpacked executable. Alright, so we have both variables. The question is, at what instruction did we first enter unpacked territory? Is there no way for the program to take what it had, what it now has, and run a secondary scan indicating where the first instructions run from an unpacked executable in memory?

There are methods to reconstruct the IAT based on having the knowledge of where the OEP is, however I have been very unsuccessful in finding the OEP of unpacked territory for a particular EXE, where Process-Dump seems to have no issue creating an unpacked EXE. If Process-Dump could somehow give a hint about the OEP for OllyDbg users (or debuggers of that type), then manual reconstruction could and would become much easier.

Based on the current version of Process-Dump, I can't seem to gather any information of how my particular EXE is unpacked or what instruction Process-Dump decided to dump the process at. I could be very wrong in assuming how this works here, but again feel free to explain why I'm wrong.

glmcdona commented 4 years ago

So one idea that will work for some binaries is creating a set of known entry point pattern signatures - sort of like a FLIRT signature.

Idea:

This might be able to reconstruct the OEP in some cases like this, and would be fairly easy to implement.

glmcdona commented 4 years ago

Added in change f4de0591dd84fe5a5c4ef8cc2b4266da9ddb5e48.

Would you be able to have a look? As a note, while building this feature I found that most modules with an entry point of 0 are actually DLLs that don't have an entry point specified (just exported functions). It is likely that the malware you are looking at is a DLL and may never have had a defined entry point.

How to try it out:

  1. Download and build the latest binary.
  2. Build the clean hash database. It will now also create two new databases: "entrypoints.hashes" and "shortentrypoints.hashes". Two hashes are involved: the first 8 bytes at the entrypoint, and a hash of between 30 and 100 opcodes disassembled from the entrypoint.
  3. Now when you dump from memory, if the OEP is 0 or invalid, it will attempt to reconstruct the entry-point based on these databases of known formats for entrypoints.

To debug it, you can pass the "-v" flag on the command line to get it to log detailed information on whether it tried to reconstruct the entrypoint and what it found. It should show something like:

    INFO: Re-building entrypoint. Original entrypoint invalid: 0
    INFO: Possible entrypoint found (weak): 1040
    INFO: Possible entrypoint found (weak): 1045
    INFO: Possible entrypoint found (weak): 1998
    ...
    INFO: Possible entrypoint found (weak): 1ac10
    INFO: Possible entrypoint found (strong): 1ac10
    INFO: Possible entrypoint found (weak): 1afe0
    INFO: Updated entrypoint to: 1ac10

A weak entrypoint means that the first 8 bytes matched a known entrypoint; this triggers a full disassembly to check the strong hash. It will use the first strong hash match as the entrypoint, and if no strong match is found it will use the first weak match.
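
The weak/strong selection rule can be sketched as below. This is a simplified illustration under stated assumptions, not the tool's implementation: the strong hash here is a SHA-256 over a fixed 64-byte window (`STRONG_LEN`, a stand-in for hashing 30 to 100 disassembled opcodes), and `pick_entry_point`, `weak_db`, and `strong_db` are hypothetical names. The sketch also stops at the first strong match, whereas the log above suggests the tool keeps scanning; the chosen entrypoint is the same either way.

```python
import hashlib

WEAK_LEN = 8     # from the thread: first 8 bytes at the entrypoint
STRONG_LEN = 64  # assumption: byte window standing in for disassembled opcodes

def strong_hash(code, offset):
    """Stand-in strong hash; the real tool hashes disassembled opcodes."""
    return hashlib.sha256(code[offset:offset + STRONG_LEN]).hexdigest()

def pick_entry_point(code, weak_db, strong_db):
    """Return the first strong match, else the first weak match, else None."""
    first_weak = None
    for offset in range(len(code) - WEAK_LEN + 1):
        if code[offset:offset + WEAK_LEN] not in weak_db:
            continue  # cheap 8-byte check failed, skip the expensive one
        if strong_hash(code, offset) in strong_db:
            return offset  # first strong match wins
        if first_weak is None:
            first_weak = offset  # remember as fallback
    return first_weak

# Toy databases built from one "known" 64-byte entrypoint sequence.
known = bytes(range(1, 65))
weak_db = {known[:WEAK_LEN]}
strong_db = {hashlib.sha256(known).hexdigest()}

# Offset 16 matches only the first 8 bytes (weak); offset 48 matches fully.
code = b"\x00" * 16 + known[:8] + b"\xCC" * 24 + known + b"\x00" * 16
chosen = pick_entry_point(code, weak_db, strong_db)
fallback = pick_entry_point(code, weak_db, set())
```

With the toy data, `chosen` is the strong match and `fallback` shows what happens when only weak matches exist, which is the situation described later in the thread where all candidates were weak and the first one was used.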

You can use '-eprec' to get it to force reconstruct the entrypoint of every module it dumps. Really helpful for testing!

biggestsonicfan commented 4 years ago

I can't seem to get the '-eprec' flag to work? It keeps saying "Failed to parse argument"

biggestsonicfan commented 4 years ago

Got the flag working, but unfortunately Windows does not like the EP that was assigned, and I get different results each time I run the unpacked executable with the restored EP:

[Screenshots: Screenshot_20201004_161351, Screenshot_20201004_161427, Screenshot_20201004_161503]

It may just be that this method will be incapable of restoring the EP for this particular application.

EDIT: Exactly what determines how many times the EP should be attempted? It seemed like only 8 entry points were attempted, all weak, and the program decided to use the first one from the list?

biggestsonicfan commented 3 years ago

Anything at all? Has the '-eprec' ever successfully worked on any samples?

biggestsonicfan commented 1 year ago

Coming back after quite a while to see if anything has happened in the last year or so.

biggestsonicfan commented 1 year ago

I realize it's been a very very long time, but I finally restored the OEP and IAT for my unpacked executable.

If anyone in the future finds this, "Process-Dump" is currently not compatible with ASProtect.