radareorg / radare2

UNIX-like reverse engineering framework and command-line toolset
https://www.radare.org/
GNU Lesser General Public License v3.0

META - vtable detection for C++, ObjectiveC, Dlang and Swift binaries #17134

Open pinkflawd opened 7 years ago

pinkflawd commented 7 years ago

What should be done (in C, as part of radare2):

See code at

Examples of similar features in other programs:


The attached binary is malware written in C++. It contains lots of vtables, none of which are detected by the av command. The archive password is "infected".

banito.zip password: "infected" 2018-asplos.pdf

Maijin commented 7 years ago

Current code is here https://github.com/radare/radare2/blob/master/libr/anal/vtable.c

Maijin commented 7 years ago

See implementation here https://github.com/REhints/HexRaysCodeXplorer


pinkflawd commented 7 years ago

This blog describes how MS VC++ works quite well http://www.openrce.org/articles/full_view/23

The problem with MS VC++ binaries is that they don't necessarily come with information on their class structure. There is only RTTI, which, at least for malware, is almost always stripped. Vtables are then found either by sweeping the binary for things that look like vtable structures or, as HexRaysCodeXplorer does, by starting from the code, searching for constructors, and hoping to find a vtable offset within the arguments. It would be neat to recover not only vtables but entire object structures. That would require a lot of constructor detection, parsing, and, I think, guessing, though.

I'm sure there are more thorough ways of doing this, will think about it and update this thread.
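The sweep approach mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in radare2: it assumes a 64-bit little-endian target, a pointer-aligned data section, and known bounds for the executable section. The `Section` and `VtableCandidate` types and the `find_vtable_candidates` name are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a mapped data section; not a radare2 type. */
typedef struct {
    const uint8_t *data; /* raw bytes of the section */
    size_t size;         /* number of bytes in data */
    uint64_t vaddr;      /* virtual address of data[0] */
} Section;

/* Hypothetical result record: one run of consecutive code pointers. */
typedef struct {
    uint64_t vaddr; /* address of the first slot in the run */
    size_t count;   /* number of consecutive code pointers */
} VtableCandidate;

/* Decode one little-endian 64-bit word (assumption: 64-bit LE target). */
static uint64_t read_le64(const uint8_t *p) {
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--) {
        v = (v << 8) | p[i];
    }
    return v;
}

/* Report every run of at least min_slots consecutive words that point into
 * [text_start, text_end). Returns the number of candidates written to out. */
size_t find_vtable_candidates(const Section *sec,
                              uint64_t text_start, uint64_t text_end,
                              size_t min_slots,
                              VtableCandidate *out, size_t out_cap) {
    assert(sec != NULL && out != NULL);
    size_t found = 0, run = 0;
    size_t nslots = sec->size / 8;
    /* Iterate one step past the end so a run touching the end is flushed. */
    for (size_t i = 0; i <= nslots; i++) {
        int is_code = 0;
        if (i < nslots) {
            uint64_t v = read_le64(sec->data + i * 8);
            is_code = v >= text_start && v < text_end;
        }
        if (is_code) {
            run++;
            continue;
        }
        if (run >= min_slots && found < out_cap) {
            out[found].vaddr = sec->vaddr + (i - run) * 8;
            out[found].count = run;
            found++;
        }
        run = 0;
    }
    return found;
}
```

A heuristic like this produces false positives (any array of function pointers looks the same) and can merge adjacent tables into one run, so the candidates would still need to be confirmed, for example by checking that a constructor stores the candidate address into an object.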

XVilka commented 7 years ago

I'm renaming this issue to [META], covering all vtables and C++ metainformation.

XVilka commented 7 years ago

@codeuchiha for your reference

PankajKataria commented 7 years ago

Thank you @XVilka @pinkflawd @Maijin for providing resources. However, I am stuck at parsing RTTI for GCC; it would be very helpful if someone could point out resources for that too. @pinkflawd yes, RTTI (run-time type information) structures are present in both cases (MSVC or GCC); they are the compiler's way of storing class information and its inheritance hierarchy. I have described the approach we used to find virtual tables for ELF files here: https://goo.gl/CDDEI5
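For the GCC case, the Itanium C++ ABI places two words directly before the first virtual function pointer of a simple class: the offset-to-top value and, before the function pointers, a pointer to the class's typeinfo object. A minimal sketch of reading that header follows. It assumes the section has already been decoded into 64-bit words, that there are no virtual bases, and that a stripped typeinfo slot is 0 (as with `-fno-rtti`). The `ItaniumVtableHeader` type and the `parse_itanium_header` name are hypothetical, not radare2 API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not part of radare2. */
typedef struct {
    int64_t offset_to_top; /* 0 for a primary vtable */
    uint64_t typeinfo;     /* address of the typeinfo object, or 0 */
    int has_rtti;          /* non-zero when the typeinfo slot is set */
} ItaniumVtableHeader;

/* words:        section contents decoded into 64-bit words
 * first_method: index of the first virtual function pointer
 * data_start/data_end: address range where typeinfo objects may live
 * Returns 1 and fills hdr when the two preceding words look plausible. */
int parse_itanium_header(const uint64_t *words, size_t nwords,
                         size_t first_method,
                         uint64_t data_start, uint64_t data_end,
                         ItaniumVtableHeader *hdr) {
    assert(words != NULL && hdr != NULL);
    if (first_method < 2 || first_method >= nwords) {
        return 0; /* no room for the two header words */
    }
    uint64_t ti = words[first_method - 1];
    /* A non-zero typeinfo slot must point into the data range. */
    if (ti != 0 && (ti < data_start || ti >= data_end)) {
        return 0;
    }
    hdr->offset_to_top = (int64_t)words[first_method - 2];
    hdr->typeinfo = ti;
    hdr->has_rtti = ti != 0;
    return 1;
}
```

When `has_rtti` is set, the typeinfo address is the starting point for recovering the class name and base classes; when it is not, only the function pointers themselves are available.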

pinkflawd commented 7 years ago

Cool, thanks for the link! As for GCC RTTI, or RTTI in general, I know of this presentation, which contains some info: http://www.hexblog.com/wp-content/uploads/2012/06/Recon-2012-Skochinsky-Compiler-Internals.pdf. A note, though: I'm not sure how it is with benign binaries, but malware doesn't usually come with RTTI information.

PankajKataria commented 7 years ago

@pinkflawd I totally agree that a binary doesn't always come with RTTI structures, but we are first aiming to handle the case where RTTI is present, and will then keep improving to cover other scenarios. Thank you for the link.

pinkflawd commented 7 years ago

sure sure :)

XVilka commented 6 years ago

Also, scripts for our lovely Binary Ninja: https://github.com/trailofbits/binjascripts/tree/master/vtable-navigator

XVilka commented 6 years ago

Adding one more for vtables/C++ metainfo parsing - https://github.com/igogo-x86/HexRaysPyTools

awhawks commented 6 years ago

This talk from REcon 2011 looks useful as well: Practical C++ Decompilation

XVilka commented 6 years ago

See also https://llvm.org/docs/HowToSetUpLLVMStyleRTTI.html

XVilka commented 6 years ago

Those are for Swift:

XVilka commented 6 years ago

This one is for ObjectiveC http://cocoasamurai.blogspot.com/2010/01/understanding-objective-c-runtime.html

XVilka commented 6 years ago

New plugin by NCC Group https://github.com/nccgroup/PythonClassInformer https://www.nccgroup.trust/us/about-us/newsroom-and-events/blog/2017/october/python-class-informer-an-idapython-plugin-for-viewing-run-time-type-information-rtti/

XVilka commented 6 years ago

See also vtables/RTTI parsing from RetDec https://github.com/avast-tl/retdec/tree/master/src/bin2llvmir/optimizations/vtable

radare commented 6 years ago

See https://github.com/bazad/memctl

XVilka commented 6 years ago

Also https://github.com/cocoahuke/maclook4ref

XVilka commented 6 years ago

See also https://github.com/RUB-SysSec/Marx (thanks to @radare). The paper is here: https://www.syssec.rub.de/media/emma/veroeffentlichungen/2016/12/22/marx_ndss2017.pdf

Maijin commented 6 years ago

https://github.com/whitequark/binja_itanium_cxx_abi

XVilka commented 6 years ago

Future implementers, note that the current code in libr/core/anal_vt.c is flawed since it targets x86 only. The detection should be cross-platform.
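One part of making the detection cross-platform is not hard-coding the pointer width or byte order when reading vtable slots. A minimal sketch of a reader parameterized on both follows; the `read_ptr` name and its signature are hypothetical and only illustrate the idea, they are not existing radare2 functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one pointer of ptr_size bytes (4 or 8) at offset off in buf.
 * big_endian selects the byte order. Returns 1 on success and stores the
 * value in *out; returns 0 on an unsupported size or an out-of-bounds read. */
int read_ptr(const uint8_t *buf, size_t len, size_t off,
             unsigned ptr_size, int big_endian, uint64_t *out) {
    assert(buf != NULL && out != NULL);
    if (ptr_size != 4 && ptr_size != 8) {
        return 0;
    }
    if (off > len || len - off < ptr_size) {
        return 0;
    }
    uint64_t v = 0;
    for (unsigned i = 0; i < ptr_size; i++) {
        /* Walk from the most significant byte to the least significant. */
        unsigned idx = big_endian ? i : ptr_size - 1 - i;
        v = (v << 8) | buf[off + idx];
    }
    *out = v;
    return 1;
}
```

Width and byte order are not the only architecture-specific details. On ARM, for example, pointers to Thumb functions have the low bit set, so a range check against the code section may need to mask that bit first.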

XVilka commented 6 years ago

See also https://alschwalm.com/blog/static/2016/12/17/reversing-c-virtual-functions/ + https://alschwalm.com/blog/static/2017/01/24/reversing-c-virtual-functions-part-2-2/

XVilka commented 6 years ago

See also more in https://www.blackhat.com/presentations/bh-dc-07/Sabanal_Yason/Paper/bh-dc-07-Sabanal_Yason-WP.pdf

XVilka commented 6 years ago

And also http://blog.httrack.com/blog/2014/05/09/a-basic-glance-at-the-virtual-table/ for GCC

thestr4ng3r commented 6 years ago

I will try to do msvc soon.

Maijin commented 6 years ago

@thestr4ng3r @XVilka @r00tus3r can you check the boxes in the initial post of the issue so we know what is remaining to do here?

XVilka commented 6 years ago

@sivaramaaa I think along with types information loading from PDB and DWARF we can load C++ classes as structures at first. Assigning you to think about how it can be done the fastest/easiest way.

@Maijin done, we need to add more tests into r2r for this.

XVilka commented 5 years ago

Added this paper https://www.cs.tau.ac.il/~maon/pubs/2018-asplos.pdf

XVilka commented 5 years ago

See also https://github.com/0xgalz/Virtuailor, a tool for automatically creating C++ virtual tables in IDA Pro based on runtime information.

XVilka commented 5 years ago

See also (Pharos) https://edmcman.github.io/papers/ccs18.pdf + https://edmcman.github.io/pres/ccs18.pdf

https://github.com/cmu-sei/pharos

XVilka commented 5 years ago

DeClassifier: Class-Inheritance Inference Engine for Optimized C++ Binaries 1901.10073.pdf

XVilka commented 4 years ago

One more implementation: https://github.com/astrelsky/Ghidra-Cpp-Class-Analyzer

HoundThe commented 3 years ago

I guess I can tick "ASCII inheritance graph" with this PR: https://github.com/radareorg/radare2/pull/17362

XVilka commented 3 years ago

See also these for extracting C++ classes information from kernelcache: