Open pinkflawd opened 7 years ago
Current code is here https://github.com/radare/radare2/blob/master/libr/anal/vtable.c
See implementation here https://github.com/REhints/HexRaysCodeXplorer
This blog describes how MS VC++ works quite well http://www.openrce.org/articles/full_view/23
The problem with MS VC++ binaries is that they don't necessarily come with information on their class structure. There is only RTTI, which, at least for malware, is almost always stripped. Vtables can be found either by sweeping the binary's sections for, well, things that look like vtable structures or, as HexRaysCodeXplorer does, by starting from the code, searching for constructors, and hoping to find a vtable offset within the arguments. It would be neat to recover not only vtables but entire object structures. That would require a lot of constructor detection, parsing, and, I think, guessing, though.
I'm sure there are more thorough ways of doing this, will think about it and update this thread.
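The section sweep mentioned above can be sketched roughly as follows. This is only an illustration, not the code in vtable.c: `CodeRange`, `VtableCandidate` and `find_vtable_candidates` are hypothetical names, and the sketch assumes little-endian pointers, slots aligned to the pointer size relative to the start of the buffer, and that a run of pointers into the code range is a good enough hint for a vtable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the code section's address range. */
typedef struct {
    uint64_t start; /* first address of the code section */
    uint64_t end;   /* one past the last address */
} CodeRange;

/* Hypothetical result record: where a pointer run starts and how long it is. */
typedef struct {
    size_t offset; /* byte offset of the first slot inside the swept buffer */
    size_t slots;  /* number of consecutive pointers into the code range */
} VtableCandidate;

/* Read one little-endian pointer of ptr_size bytes. */
static uint64_t read_ptr_le(const uint8_t *buf, size_t ptr_size) {
    uint64_t v = 0;
    for (size_t i = 0; i < ptr_size; i++) {
        v |= (uint64_t)buf[i] << (8 * i);
    }
    return v;
}

static int in_code(const CodeRange *code, uint64_t addr) {
    return addr >= code->start && addr < code->end;
}

/*
 * Sweep a data buffer for runs of at least min_slots consecutive pointers
 * that all land inside the code range. Stores up to out_cap candidates in
 * out and returns the total number of runs found (which may exceed out_cap).
 */
static size_t find_vtable_candidates(const uint8_t *data, size_t size,
                                     size_t ptr_size, const CodeRange *code,
                                     size_t min_slots,
                                     VtableCandidate *out, size_t out_cap) {
    assert(ptr_size == 4 || ptr_size == 8);
    size_t found = 0;
    size_t off = 0;
    while (off + ptr_size <= size) {
        size_t slots = 0;
        /* Extend the run while the next slot fits and points into code. */
        while (off + (slots + 1) * ptr_size <= size &&
               in_code(code, read_ptr_le(data + off + slots * ptr_size, ptr_size))) {
            slots++;
        }
        if (slots >= min_slots) {
            if (found < out_cap) {
                out[found].offset = off;
                out[found].slots = slots;
            }
            found++;
        }
        /* Skip the run just examined, or a single slot if there was none. */
        off += (slots > 0 ? slots : 1) * ptr_size;
    }
    return found;
}
```

A caller would pass the bytes of a read-only data section together with the code section bounds, then check each candidate further, for example by looking for cross-references from constructors. A pure sweep like this will also report jump tables and other function pointer arrays.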
I'm renaming this issue to [META] so it covers all vtable and C++ metainformation work.
@codeuchiha for your reference
Thank you @XVilka @pinkflawd @Maijin for providing resources. However, I am stuck at parsing RTTI for GCC, so it would be very helpful if someone could point out resources for that too. @pinkflawd yes, RTTI (run-time type information) structures are present in both cases (MSVC or GCC). It's the compiler's way of storing class information and its inheritance hierarchy. I have described the approach we used to find virtual tables for ELF files here: https://goo.gl/CDDEI5
Cool, thanks for the link! As for GCC RTTI or RTTI in general I know this presentation http://www.hexblog.com/wp-content/uploads/2012/06/Recon-2012-Skochinsky-Compiler-Internals.pdf - contains some info. A note though, not sure how it is with benign binaries, but malware doesn't usually come with RTTI information.
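For the GCC case, a minimal sketch of following a vtable to its class name, assuming the Itanium C++ ABI layout that GCC uses on ELF targets: the slot just before the first virtual function pointer holds the typeinfo pointer, and the second word of the typeinfo object points to the mangled type name. The `Image` type and the function names are hypothetical, and the sketch only handles 64-bit little-endian images loaded as one flat buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flat memory image: bytes mapped at virtual address base. */
typedef struct {
    const uint8_t *bytes;
    size_t size;
    uint64_t base;
} Image;

/* Read a little-endian 64-bit word at addr; returns 0 if out of bounds. */
static int read_u64(const Image *img, uint64_t addr, uint64_t *out) {
    if (addr < img->base) {
        return 0;
    }
    uint64_t off = addr - img->base;
    if (off > img->size || img->size - off < 8) {
        return 0;
    }
    uint64_t v = 0;
    for (size_t i = 0; i < 8; i++) {
        v |= (uint64_t)img->bytes[off + i] << (8 * i);
    }
    *out = v;
    return 1;
}

/*
 * Given the address of the first virtual function slot of a vtable, return
 * the mangled type name stored in its typeinfo object (for example "3Foo"
 * for class Foo), or NULL if RTTI is missing or points outside the image.
 */
static const char *gcc_rtti_name(const Image *img, uint64_t vtable_addr_point) {
    assert(img != NULL);
    uint64_t typeinfo = 0, name = 0;
    if (vtable_addr_point < 8) {
        return NULL;
    }
    /* Typeinfo pointer sits one word before the first function slot. */
    if (!read_u64(img, vtable_addr_point - 8, &typeinfo) || typeinfo == 0) {
        return NULL;
    }
    if (typeinfo > UINT64_MAX - 8) {
        return NULL;
    }
    /* Word 0 of the typeinfo object is its own vptr, word 1 the name. */
    if (!read_u64(img, typeinfo + 8, &name)) {
        return NULL;
    }
    if (name < img->base || name - img->base >= img->size) {
        return NULL;
    }
    size_t off = (size_t)(name - img->base);
    const char *s = (const char *)img->bytes + off;
    /* Reject names that are not NUL-terminated inside the image. */
    if (memchr(s, 0, img->size - off) == NULL) {
        return NULL;
    }
    return s;
}
```

Walking the inheritance hierarchy would additionally need to tell the typeinfo kinds apart (no base, single base, multiple or virtual bases), which this sketch does not attempt.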
@pinkflawd I totally agree that binaries don't always come with RTTI structures, but we are first aiming to handle the case where RTTI is present and will then keep improving to cover other scenarios. Thank you for the link.
sure sure :)
Also, scripts from our lovely Binary Ninja: https://github.com/trailofbits/binjascripts/tree/master/vtable-navigator
Adding one more for vtables/C++ metainfo parsing - https://github.com/igogo-x86/HexRaysPyTools
This talk from REcon 2011 looks useful as well - Practical C++ Decompilation
This one is for ObjectiveC http://cocoasamurai.blogspot.com/2010/01/understanding-objective-c-runtime.html
See also vtables/RTTI parsing from RetDec https://github.com/avast-tl/retdec/tree/master/src/bin2llvmir/optimizations/vtable
See also https://github.com/RUB-SysSec/Marx, thanks to @radare. Paper is here: https://www.syssec.rub.de/media/emma/veroeffentlichungen/2016/12/22/marx_ndss2017.pdf
Future implementers, note that the current code in libr/core/anal_vt.c is flawed since it targets x86 only. The detection should be cross-platform.
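One piece of making the detection cross-platform is reading vtable slots without baking in the x86 word size and byte order. A minimal sketch, with a hypothetical `Target` description that is not taken from radare2; the Thumb handling relies on ARM Thumb function pointers having bit 0 set:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical target description; none of these names come from radare2. */
typedef struct {
    size_t word_size; /* pointer size in bytes, 1 to 8 */
    int big_endian;   /* nonzero for big-endian targets */
    int thumb;        /* nonzero: code pointers carry the Thumb bit in bit 0 */
} Target;

/* Read one pointer-sized word from buf using the target's byte order. */
static uint64_t read_word(const Target *t, const uint8_t *buf) {
    assert(t->word_size >= 1 && t->word_size <= 8);
    uint64_t v = 0;
    for (size_t i = 0; i < t->word_size; i++) {
        /* Always consume the most significant byte first. */
        size_t idx = t->big_endian ? i : t->word_size - 1 - i;
        v = (v << 8) | buf[idx];
    }
    return v;
}

/* Read a vtable slot and normalize it to the address of the function. */
static uint64_t read_code_ptr(const Target *t, const uint8_t *buf) {
    uint64_t v = read_word(t, buf);
    return t->thumb ? (v & ~(uint64_t)1) : v;
}
```

The range check against the code sections can then be done on the normalized value, so the same scanning loop works for any architecture the target description can express.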
I will try to do msvc soon.
@thestr4ng3r @XVilka @r00tus3r can you check the boxes in the initial post of the issue so we know what is remaining to do here?
@sivaramaaa I think that, along with loading type information from PDB and DWARF, we can load C++ classes as structures at first. Assigning you to think about the fastest/easiest way to do it.
@Maijin done, we need to add more tests into r2r for this.
Added this paper https://www.cs.tau.ac.il/~maon/pubs/2018-asplos.pdf
See also https://github.com/0xgalz/Virtuailor, a tool for creating automatic C++ virtual tables in IDA Pro based on the runtime information.
DeClassifier: Class-Inheritance Inference Engine for Optimized C++ Binaries 1901.10073.pdf
One more implementation: https://github.com/astrelsky/Ghidra-Cpp-Class-Analyzer
I guess I can tick "ASCII inheritance graph" with this PR: https://github.com/radareorg/radare2/pull/17362
See also these for extracting C++ classes information from kernelcache:
What should be done (in C, as part of radare2):
av commands to view that data
See code at
Examples of similar features in other programs:
The attached binary is malware written in C++ with lots of vtables, none of which are detected by the av command. The password is "infected".
banito.zip password: "infected" 2018-asplos.pdf