Vector35 / binaryninja-api

Public API, examples, documentation and issues for Binary Ninja
https://binary.ninja/
MIT License

BN not creating data variables for MSVC vtables on a particular binary #4375

Closed alexrp closed 1 year ago

alexrp commented 1 year ago

Version and Platform (required):

Bug Description: See the attached screenshots. The highlighted memory is obviously a pointer to an RTTI Common Object Locator, yet BN is not even creating a data variable here. Same for the function pointers of the vtable immediately following the COL pointer. This is odd, because BN has no trouble doing so in any other binary I've tried.

Steps To Reproduce:

  1. Open this binary.
  2. Go to any vtable (see screenshots).

Expected Behavior: Data variables should be created in these cases.

Screenshots: Screenshot from the attached binary: image Screenshot from another arbitrary binary: image

CouleeApps commented 1 year ago

RTTI parsing and VTable structure creation are limited to binaries with PDBs at this point. Otherwise, data variables are created based on the exe symbol list and xrefs. This might be a feature request dupe of #3930.

alexrp commented 1 year ago

This doesn't actually have anything to do with PDBs or RTTI per se, as far as I know. This is an issue with basic analysis.

The binary I took the second screenshot from has no PDB, for example.

alexrp commented 1 year ago

Some more context: I ran into this issue because ClassyPP was completely unable to locate any RTTI in this binary, even though it has tons of it. The reason is that BN isn't creating the basic void*-typed data variables for the COL pointers and vtable entries that ClassyPP expects to exist.

psifertex commented 1 year ago

If no uses are observed, function pointers are not created -- the good news however is that the issue tracking adding support for this is scheduled for the current release: https://github.com/Vector35/binaryninja-api/issues/1189

If I understand you correctly, once that feature is implemented and these pointers are created, then the ClassyPP plugin should be successful.
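
To illustrate what a pointer heuristic of this kind could look like, here is a minimal sketch. It is an assumption-laden illustration, not Binary Ninja's implementation and not the design tracked in #1189: it sweeps a data section for aligned, pointer-sized words whose value falls inside the image's address range. The function name `sweep_pointers` and all addresses are hypothetical.

```python
import struct

def sweep_pointers(data, section_va, image_start, image_end, ptr_size=8):
    """Return (address, target) pairs for aligned words that point into the image.

    Hypothetical helper: a real analysis would also check that targets land in
    mapped, sensibly typed memory (e.g. code for vtable entries).
    """
    fmt = "<Q" if ptr_size == 8 else "<I"
    hits = []
    # Skip ahead to the first pointer-aligned address in the section.
    start = (-section_va) % ptr_size
    for off in range(start, len(data) - ptr_size + 1, ptr_size):
        (value,) = struct.unpack_from(fmt, data, off)
        if image_start <= value < image_end:
            hits.append((section_va + off, value))
    return hits

# Made-up .rdata fragment: a COL pointer, two vtable entries, then unrelated data.
rdata = struct.pack("<4Q", 0x140003000, 0x140001000, 0x140001040, 0x1122334455667788)
hits = sweep_pointers(rdata, 0x140002000, 0x140000000, 0x140010000)
for address, target in hits:
    print(hex(address), "->", hex(target))
```

A sweep like this finds the COL pointer and the vtable entries without any xrefs, but it can also produce false positives on data that merely looks like an address, which is one reason such a heuristic needs care.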

alexrp commented 1 year ago

> If no uses are observed

That's the thing, though. In the binary in the 2nd screenshot, there are no xrefs to the vast majority of those function pointer data variables (only to the first one in each vtable). Yet BN almost immediately creates these data variables when loading the binary.

psifertex commented 1 year ago

There are several other things that can create them. BinaryViews can, as can DebugInfo (when debug information is in the file), and probably others. Without that other file to compare to it's hard to say what caused those to be created in that particular case, but it's not necessarily a lack of that same thing that is failing you in this file.

If I understand it correctly, then the temporary workaround is to simply select the list of pointers and hit o on them, which, going by your previous description, should allow ClassyPP to work on those offsets. Actually yeah -- just tested it and that appears to work:

[Screenshot 2023-06-01 at 1 52 16 PM]

The near-term issue here is that the pointers aren't being created, which is tracked in #1189, so I don't think there's any unique issue we're tracking here, and I plan to close this as a duplicate unless I've missed something.

alexrp commented 1 year ago

> Without that other file to compare to it's hard to say what caused those to be created in that particular case, but it's not necessarily a lack of that same thing that is failing you in this file.

It would be great if we could at least figure out why this particular binary is not getting data variables created. It's worth noting that the binary has been unpacked (Themida) and partially fixed up, so it might be somewhat 'different' from a normal PE. It could be that some aspect of that trips up BN's analysis.

The binary in the 2nd screenshot probably can't be shared here for copyright reasons, but I tried ConEmu64.exe instead, which also gets data variables created as expected:

image

psifertex commented 1 year ago

Sorry for the delay, I had to get a working dev build setup again to run this down. The short answer is that ConEmu causes those pointers to be created there because relocations exist for them which was basically exactly what I mentioned above as one of the reasons: specifically BinaryView creation or some other hint that there exists a pointer there.

Given that, I'm going to go ahead and close this, as it is indeed expected behavior (given the other issue), but hopefully we'll get that new pointer heuristic on dev soon and it will resolve it so you don't have to manually create the pointers!

If it turns out there's some relocations that do exist on your other binary that are being missed that would be a distinct issue from the pointer heuristic but I don't have any reason to believe that's the case right now.
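
For context on why relocations act as such a strong hint, here is a rough sketch of how the addresses of relocated pointers can be read out of a raw PE base relocation table. It follows the documented block layout (a page RVA, a block size, then 16-bit entries with a 4-bit type and a 12-bit offset); it is not Binary Ninja's loader code, and the name `relocated_rvas` and the sample bytes are made up for illustration.

```python
import struct

IMAGE_REL_BASED_HIGHLOW = 3   # fixup applied to a 32-bit address
IMAGE_REL_BASED_DIR64 = 10    # fixup applied to a 64-bit address

def relocated_rvas(reloc):
    """Yield the RVA of every pointer fixup in a raw base relocation table."""
    pos = 0
    while pos + 8 <= len(reloc):
        page_rva, block_size = struct.unpack_from("<II", reloc, pos)
        if block_size < 8 or pos + block_size > len(reloc):
            break  # malformed block or trailing padding; stop rather than guess
        # Entries are 16 bits each: high 4 bits = type, low 12 bits = page offset.
        for entry_pos in range(pos + 8, pos + block_size - 1, 2):
            (entry,) = struct.unpack_from("<H", reloc, entry_pos)
            kind, offset = entry >> 12, entry & 0xFFF
            if kind in (IMAGE_REL_BASED_HIGHLOW, IMAGE_REL_BASED_DIR64):
                yield page_rva + offset
        pos += block_size

# Made-up block: page 0x2000, three DIR64 fixups, one padding entry (type 0).
sample = struct.pack("<IIHHHH", 0x2000, 16, 0xA008, 0xA010, 0xA018, 0x0000)
rvas = list(relocated_rvas(sample))
print([hex(rva) for rva in rvas])
```

Every RVA produced this way is a location the loader must patch with an address, so an analysis tool can treat each one as a pointer without needing to observe any use of it.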

alexrp commented 1 year ago

> Sorry for the delay, I had to get a working dev build setup again to run this down.

No worries.

> The short answer is that ConEmu causes those pointers to be created there because relocations exist for them which was basically exactly what I mentioned above as one of the reasons: specifically BinaryView creation or some other hint that there exists a pointer there.

That totally makes sense. Thanks for investigating.

> If it turns out there's some relocations that do exist on your other binary that are being missed that would be a distinct issue from the pointer heuristic but I don't have any reason to believe that's the case right now.

Hmm, they... sort of exist?

I think the issue is that the relocation data directory points to the wrong section: !reloc, which is a tiny (20 bytes) section associated with Themida, which BN obviously can't get much value out of. The real .reloc section is there, but BN just sees it as some arbitrary data section. I'll need to see if I can repair this stuff, but in any case, I don't think there's anything BN could reasonably be expected to do differently here.

alexrp commented 1 year ago

> I think the issue is that the relocation data directory points to the wrong section: !reloc, which is a tiny (20 bytes) section associated with Themida, which BN obviously can't get much value out of. The real .reloc section is there, but BN just sees it as some arbitrary data section. I'll need to see if I can repair this stuff, but in any case, I don't think there's anything BN could reasonably be expected to do differently here.

Just for the record, I repaired the data directory to point to the real .reloc section, and now BN resolves all those function pointers as expected. 🙂
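
For anyone hitting the same situation, a quick way to check which section the base relocation data directory points into might look like the sketch below. It reads the standard PE header fields with `struct` and is only a diagnostic under stated assumptions: the helper names are hypothetical, the demo image is synthetic (header only, with section names and the 20-byte size borrowed from the description above), and it does not perform the actual repair.

```python
import struct

BASERELOC_INDEX = 5  # IMAGE_DIRECTORY_ENTRY_BASERELOC

def reloc_directory_section(image):
    """Return (section name or None, rva, size) for the base relocation directory."""
    (pe_off,) = struct.unpack_from("<I", image, 0x3C)        # e_lfanew
    if image[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    coff = pe_off + 4
    (num_sections,) = struct.unpack_from("<H", image, coff + 2)
    (opt_size,) = struct.unpack_from("<H", image, coff + 16)
    opt = coff + 20
    (magic,) = struct.unpack_from("<H", image, opt)
    # Data directories start at offset 112 for PE32+ (0x20B) and 96 for PE32.
    dirs = opt + (112 if magic == 0x20B else 96)
    rva, size = struct.unpack_from("<II", image, dirs + 8 * BASERELOC_INDEX)
    sections = opt + opt_size
    for i in range(num_sections):
        hdr = sections + 40 * i
        name = image[hdr:hdr + 8].rstrip(b"\0").decode("ascii", "replace")
        vsize, vaddr = struct.unpack_from("<II", image, hdr + 8)
        if vaddr <= rva < vaddr + max(vsize, 1):
            return name, rva, size
    return None, rva, size

def build_demo_image(reloc_rva):
    """Build a synthetic PE32+ header with two sections (demo data only)."""
    image = bytearray(0x200)
    struct.pack_into("<I", image, 0x3C, 0x40)                # e_lfanew
    image[0x40:0x44] = b"PE\0\0"
    struct.pack_into("<H", image, 0x44 + 2, 2)               # NumberOfSections
    struct.pack_into("<H", image, 0x44 + 16, 240)            # SizeOfOptionalHeader
    opt = 0x58
    struct.pack_into("<H", image, opt, 0x20B)                # PE32+ magic
    struct.pack_into("<II", image, opt + 112 + 8 * BASERELOC_INDEX, reloc_rva, 20)
    layout = [(b"!reloc", 20, 0x5000), (b".reloc", 0x400, 0x6000)]
    for i, (name, vsize, vaddr) in enumerate(layout):
        struct.pack_into("<8sII", image, opt + 240 + 40 * i, name, vsize, vaddr)
    return bytes(image)

before = reloc_directory_section(build_demo_image(0x5000))
after = reloc_directory_section(build_demo_image(0x6000))
print(before, after)
```

In the synthetic "before" image the directory resolves to `!reloc`; after the RVA is pointed at the other section it resolves to `.reloc`, which mirrors the repair described above. On a real binary the directory size would also need to match the real table.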

psifertex commented 1 year ago

Thanks for the update, glad to hear it! The heuristic recovery should also work once that is complete but this makes sense in the meantime for that particular situation.