Prolog analysis on 64bit pe executables results in no data

jubjub727 commented 2 years ago

I've been running through the large binary guide with a 95MB binary for a few days now. I'm at the prolog analysis step and it runs using 478GB of RAM but then it returns no results. The facts generated by ooanalyzer seem good. I ran a 64bit test executable that I made with references to a static library the target executable uses through the steps and it resulted in the same output. However an older 32bit executable that I ran through worked perfectly and had the desired results. What am I missing here? Is there some way to get this to work or is it not possible?

chrome_jl7w3WAxAJ

Prolog Analysis Log:

Guessing is enabled.
Profiling is disabled.
RTTI is enabled.
Reasoning about object oriented constructs based on known facts ...
RTTI was not present.
Entering stage 'Initial reasoning'.
reasonForwardAsManyTimesAsPossible
Starting reasonForward.
Entering stage concludeMethod.
Entering stage concludeVFTableOverwrite.
Entering stage concludeConstructor.
Entering stage concludeNOTConstructor.
Entering stage concludeVFTable.
Entering stage concludeNOTVFTable.
Entering stage concludeVFTableWrite.
Entering stage concludeVFTableEntry.
Entering stage concludeNOTVFTableEntry.
Entering stage concludeVFTableSizeGTE.
Entering stage concludeVFTableSizeLTE.
Entering stage concludeVBTable.
Entering stage concludeVBTableWrite.
Entering stage concludeVBTableEntry.
Entering stage concludeObjectInObject.
Entering stage concludeDerivedClass.
Entering stage concludeNOTDerivedClass.
Entering stage concludeEmbeddedObject.
Entering stage concludeNOTEmbeddedObject.
Entering stage concludeDeletingDestructor.
Entering stage concludeRealDestructor.
Entering stage concludeNOTRealDestructor.
Entering stage concludeClassSizeGTE.
Entering stage concludeClassSizeLTE.
Entering stage concludeClassHasNoBase.
Entering stage concludeClassHasUnknownBase.
Entering stage concludeReusedImplementation.
Entering stage concludeNOTMergeClasses.
Entering stage concludeTrigger.
Entering stage concludeMergeVFTables.
Entering stage concludeMergeClasses.
Entering stage concludeClassCallsMethod.
Entering stage concludeClassRelatedMethod.
reasonForwardAsManyTimesAsPossible complete.
Entering stage 'Initial reasoning complete'.
Making new hypothetical guesses ...
reasoningLoop: guess
Starting guess. There are currently 0 guesses.
Entering stage guessVirtualFunctionCall.
Entering stage guessVFTable.
Entering stage guessVBTable.
Entering stage guessDerivedClass.
Entering stage guessMethod.
Entering stage guessVFTableEntry.
Entering stage guessDeletingDestructor.
Entering stage guessRealDestructor.
Entering stage guessNOTConstructor.
Entering stage guessConstructor.
Entering stage guessNOTMergeClasses.
Entering stage guessClassHasNoBase.
Entering stage guessUnlikelyConstructor.
Entering stage guessMergeClassesB.
Entering stage guessMergeClassesC1.
Entering stage guessMergeClassesC2.
Entering stage guessMergeClassesC3.
Entering stage guessMergeClassesC4.
Entering stage guessMergeClassesD.
Entering stage guessMergeClassesG.
Entering stage guessFinalDeletingDestructor.
Entering stage guessLateMergeClassesF1.
Entering stage guessLateMergeClassesF2.
Entering stage guessLateMergeClasses_G1.
Entering stage guessLateMergeClassesG.
Entering stage guessCommitClassHasNoBase.
reasoningLoop: There are no possible guesses remaining
No plausible guesses remain, finalizing answer.
Entering stage reportResults.
Guessed 0 methods of 0, guessed contrary conclusions: 0 of 0
Guessed 0 constructors of 0, guessed contrary conclusions: 0 of 0
Guessed 0 destructors of 0, guessed contrary conclusions: 0 of 0
Guessed 0 deleting destructors of 0, guessed contrary conclusions: 0 of 0
Guessed 0 virtual function calls of 0, guessed contrary conclusions: 0 of 0
Guessed 0 virtual function tables of 0, guessed contrary conclusions: 0 of 0
Guessed 0 virtual base tables of 0, guessed contrary conclusions: 0 of 0
Guessed 0 virtual function table entries of 0, guessed contrary conclusions: 0 of 0
Guessed 0 derived classes of 0, guessed contrary conclusions: 0 of 0
Guessed 0 embedded objects of 0, guessed contrary conclusions: 0 of 0
Guessed 0 has a base class of 0, guessed contrary conclusions: 0 of 0
Object oriented analysis and reporting complete, exiting.

sei-eschwartz commented 2 years ago

64-bit executables aren't supported, and you shouldn't expect to get even decent results on them. You should have had to specify an --allow-64-bit option at some point to acknowledge this.

That being said, I have no idea why you would get zero conclusions like you are. Are you sure you are running ooprolog on the right fact file? If you can share your fact file, I can take a look at it. But again, I would not expect much for 64-bit executables.

cfcohen commented 2 years ago

I thought I would provide a little more detail on why 64-bit executables are not supported, while 32-bit executables work much better. The primary problem is related to calling conventions. In the 32-bit ABI, the thiscall calling convention is fairly easily distinguished from the more common cdecl calling convention, based on passing the object-pointer in the ECX register. This allows us to easily distinguish OO functions from non-OO functions based on the usage of the ECX register. There's minor confusion with the fastcall convention, but there are some tricks we use to distinguish those from thiscall.

In the 64-bit ABI, the standard calling convention passes most parameters in registers and there's no real distinction between OO and non-OO functions. This leads to quite a bit of confusion in trying to "find OO features" in functions that aren't actually OO functions in the first place, as well as the creation of incorrect OO facts from functions that coincidentally look like OO functions, but in fact, are not. This problem is fairly significant and would require a more sophisticated way of deciding which functions are OO, a decision that gets made before the Prolog facts are exported in our current system.

We discussed various approaches to fixing this problem, and I really do believe that if we could get over this problem, the rest of the system probably isn't too far from working correctly, but in practice, the inability to correctly identify OO functions causes a complete failure to produce anything useful. :-(

jubjub727 commented 2 years ago

I wasn't expecting full results, more just hoping that enough info would be found to drastically decrease the amount of reverse engineering of classes required to achieve my goals. Just a few luckily placed bits of OO info would significantly lower the amount of time needed for reversing engineering.

Here's a link to the facts (615MB): https://mega.nz/file/SNVFgJCa#jqeAzb3Uj4PNPAZvugzy0KrKbnKKSzWlg_vakubwA-c

I understand about the calling conventions, I'd need to find a way to mark functions as OO despite them sharing calling convention with non OO functions. I see in the email that you mentioned tracking data flow from known OO functions. What specifically do you mean about the data flow? I'm currently dealing with carpal tunnel in both hands so I'm limited in how much I can do until I get surgery but I'm willing to create the changes needed to achieve partial 64bit results. Be the change you want to see in the world and all and I want a world with 64bit OOAnalyzer :P

Also about any performance concerns with running such a large binary, my solution currently is to just throw more cloud resources at it and wait days lol.

cfcohen commented 2 years ago

By tracking the data flow, what I mean was that the value returned by new() is clearly an object pointer. Therefore when that value is passed in RCX to a function, that function is probably truly an OO function. When that same pointer is passed to another function in RCX from that first function, it too is probably truly an OO function. And so on. The new() part of the problem that may need to be done in C code.

Some of those same conclusions could probably be reached by connecting the callParameter, callReturn, callTarget, callingConvention, funcParameter and funcReturn facts. You appear to have quite a few of these facts, although you might not have all of the ones that you should if some functions got excluded because they looked more like 32-bit fastcall than thiscall.

Documentation for these facts can be found here: https://github.com/cmu-sei/pharos/blob/master/share/prolog/oorules/facts.pl

I'm not going to try downloading the facts file because it's quite large and frankly unlikely to be a good starting test case for solving the 64-bit problem in general (because it's so big and complicated). But you might make some progress by using some of the facts selectively to update an IDA or Ghidra database. For example, there's a chance that the 53,575 possibleVFTableWrite facts are correct, and so you might be able to find many of the VFTables in an automated way by processing that file. The possibleVirtualFunctionCall facts also have a reasonable chance of being correct.

The fact that there are only 8356 returnsSelf facts may be part of why the current Prolog code is failing so badly. In 32-bit code, constructors are expected to load the ECX register into EAX before returning. I'm not sure if that's true in 64-bit code. If the Prolog phase doesn't get started at finding some things that it is confident about, it will fail to find anything more after that.

jubjub727 commented 2 years ago

Ah yep that makes sense.

So the method for determining if a function most likely is OO would be to cache the object pointer from a known OO function then because "this" is passed through RCX it's possible to assume that any functions taking that object pointer in RCX are OO functions apart of the class for that object pointer. Which of course will result in any non OO functions that take a class instance as a first argument to falsely be classified as OO but that shouldn't be incredibly common with modern C++ code.

I'm not misunderstanding anything here?

I've been having a look through the source but I'm not too familiar with it yet, what sort of time investment do you think a rough implementation would take to get to the prolog rules? It doesn't sound too daunting to implement logic achieving that classification, I guess solving it in a way that's performant and testing it is the difficult part.

I have a 1.4MB binary that implements a static library that's used in my target binary. Do you think that would be a good test binary or should I create something smaller? It also resulted in the same 0 results as my target binary.

cfcohen commented 2 years ago

Your understanding of what I was saying about RCX is correct (including the false classifications).

There are already some Prolog rules for determining whether a method is confirmed to be an OO method. That code is here:

https://github.com/cmu-sei/pharos/blob/master/share/prolog/oorules/rules.pl#L6

The problem is probably that most of those conclusions require other facts to already be true, and something is apparently blocking us from reaching any conclusions about 64-bit code. So most of those probably aren't getting used (yet). Rules _E and _F might fire on 64-bit code because they're derived from exported symbols. Rules _O and _P are particularly suspect for maybe not being correct in 64-bit code.

1.4MB is probably still pretty large. There's a set of ootest programs that ship with OOAnalyzer. If I were working on this, I'd probably take one of those tiny programs, look at what it concluded for the equivalent 32-bit program, look at what it concluded (or didn't) for the 64-bit program, and try to start tracking down what went wrong. Again, this is not a small task, but I can continue to point you in the right direction if you're making progress.

jubjub727 commented 2 years ago

Since the project I'm using this for is already requiring me to make a decompiler which I've never done (for a rather simple instruction set luckily) I think I might try some quick and dirty approaches first in case something works. If that fails then I'll try and get a proper 64 bit solution working. Or I'll reverse just enough to allow my project to continue then come back to this problem to get a more complete and future proof solution.

cfcohen commented 2 years ago

That's a reasonable plan. For a "quick and dirty" approach, I'd probably try writing a Python/Java script to analyze the VFTables from the possibleVFTableWrite facts, and see what you can learn from there. Good luck!

sei-eschwartz commented 1 year ago

I'm going to close this but feel free to reach out if you need any assistance from us.

cmu-sei / pharos

Prolog analysis on 64bit pe executables results in no data #220