NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra

Help with RTTI problem #4414

Closed by Guusggg 2 years ago

Guusggg commented 2 years ago

Hi,

I don't think this is a bug in Ghidra, so I must be making a mistake somewhere. I have a lot of strings like 11NisePVectorIN6ns_lib11NormalLayerEE, 12BattleVersus and N6ns_lib12SmartPointerI16CellAnimLiveDataEE in my program. This is a decompiled Nintendo DS ROM. These strings lead me to believe that there's Runtime Type Information in my binary. I'm almost convinced it was compiled with GCC, since the RTTI strings look similar to GCC's.

The problem is that none of the scripts that are in Ghidra will help me reconstruct these strings into types and it seems I have to do it all by hand. When I run a script like GccRttiAnalysisScript, it tells me it doesn't think it's a GCC binary. I also tried the Ghidra C++ Class and Run Time Type Information Analyzer plugin, which also does not run.

Is there any way to reconstruct this, without doing it by hand? I'm sure I'm making a mistake in my understanding of all this, can you point it out to me? Why does it not recognize it as GCC RTTI?

I was thinking I had to write my own script for this, but before I have to dive into the Ghidra Script API, I want to make sure I'm not making a simple mistake.
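
As a starting point for such a script, here is a minimal sketch of a decoder for the type-name strings quoted above. It assumes they follow the Itanium C++ ABI mangling that GCC uses, which is not confirmed for this ROM, and it handles only length-prefixed identifiers, nested names (`N...E`), template arguments (`I...E`) and the `St` abbreviation for `std::`. Substitutions, builtin types, pointers and everything else raise `ValueError`. All function names are hypothetical.

```python
def demangle_type_name(mangled: str) -> str:
    """Demangle a small subset of Itanium ABI type names (sketch only)."""
    name, pos = _parse_name(mangled, 0)
    if pos != len(mangled):
        raise ValueError(f"unparsed text at offset {pos}: {mangled[pos:]!r}")
    return name


def _parse_source_name(s: str, pos: int):
    # <source-name> ::= <decimal length> <identifier of that length>
    start = pos
    while pos < len(s) and s[pos].isdigit():
        pos += 1
    if pos == start:
        raise ValueError(f"expected a length at offset {start} in {s!r}")
    end = pos + int(s[start:pos])
    if end > len(s):
        raise ValueError(f"identifier runs past the end of {s!r}")
    return s[pos:end], end


def _parse_template_args(s: str, pos: int):
    # <template-args> ::= I <type>+ E   (only named types are supported here)
    pos += 1  # skip the leading 'I'
    args = []
    while pos < len(s) and s[pos] != "E":
        arg, pos = _parse_name(s, pos)
        args.append(arg)
    if pos >= len(s):
        raise ValueError(f"template arguments not terminated in {s!r}")
    return "<" + ", ".join(args) + ">", pos + 1


def _parse_unqualified(s: str, pos: int):
    # An identifier, optionally followed by template arguments.
    ident, pos = _parse_source_name(s, pos)
    if s.startswith("I", pos):
        targs, pos = _parse_template_args(s, pos)
        ident += targs
    return ident, pos


def _parse_name(s: str, pos: int):
    if s.startswith("N", pos):
        # <nested-name> ::= N <component>+ E
        pos += 1
        parts = []
        while pos < len(s) and s[pos] != "E":
            part, pos = _parse_unqualified(s, pos)
            parts.append(part)
        if pos >= len(s):
            raise ValueError(f"nested name not terminated in {s!r}")
        return "::".join(parts), pos + 1
    if s.startswith("St", pos):
        # "St" abbreviates the std:: namespace.
        ident, pos = _parse_unqualified(s, pos + 2)
        return "std::" + ident, pos
    return _parse_unqualified(s, pos)
```

With this, `demangle_type_name("N6ns_lib12SmartPointerI16CellAnimLiveDataEE")` yields `ns_lib::SmartPointer<CellAnimLiveData>`. It only decodes names; it does not create types or namespaces in Ghidra.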

Some program info:

Language ID:    ARM:LE:32:v5t (1.103)
Compiler ID:    default
Processor:  ARM
Endian: Little
Address Size:   32
Analyzed:   true
Created With Ghidra Version:    10.1.4
Executable Format:  Raw Binary
Executable Location:    unknown

Thank you for your input.

astrelsky commented 2 years ago

Are any of the following strings present in the ROM? I'm already almost certain they will be considering the presence of those strings there.

"St9type_info", "N10__cxxabiv117__class_type_infoE", "N10__cxxabiv120__si_class_type_infoE", "N10__cxxabiv121__vmi_class_type_infoE"

If the above strings are in the ROM, then it does indeed have RTTI, and I'd be more than happy to help diagnose my plugin with you. As for the RTTI scripts provided with Ghidra, it doesn't look like they search for the strings; they only look for symbols, which means they will only work if the program is not stripped.

@ghidra007 do the scripts attempt to backtrack from those strings?
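
One way to check for those four names outside Ghidra is to scan the raw ROM bytes. The sketch below is an illustration under assumptions, not part of Ghidra or of the plugin: the function name is hypothetical, and the names are spelled with the double underscores the Itanium ABI mangling uses (`__cxxabiv1`, `__class_type_info`, and so on).

```python
# Typeinfo names of the base RTTI classes in the Itanium C++ ABI.
BASE_TYPEINFO_NAMES = (
    b"St9type_info",
    b"N10__cxxabiv117__class_type_infoE",
    b"N10__cxxabiv120__si_class_type_infoE",
    b"N10__cxxabiv121__vmi_class_type_infoE",
)


def find_typeinfo_names(data: bytes, names=BASE_TYPEINFO_NAMES) -> dict:
    """Return {name: [file offsets]} for every NUL-terminated occurrence.

    A hit may also be the tail of a longer string, so each offset
    should still be inspected by hand.
    """
    hits = {}
    for name in names:
        needle = name + b"\x00"
        offsets = []
        pos = data.find(needle)
        while pos != -1:
            offsets.append(pos)
            pos = data.find(needle, pos + 1)
        hits[name.decode("ascii")] = offsets
    return hits
```

Reading the ROM with `open(path, "rb").read()` and passing the bytes in gives file offsets; adding the load address (which depends on how the raw binary was imported) turns them into addresses. Passing a different `names` tuple lets the same scan look for variant spellings.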

Guusggg commented 2 years ago

St9type_info is matching exactly, N3abi17__class_type_infoE is a little different than yours along with N3abi20__si_class_type_infoE and N3abi21__vmi_class_type_infoE. They seem to differ by the string __cxx.

Thank you for the response. I've also looked in the source of the scripts and I can't find direct backtracking. I think they rely on the Analyzer to create the namespaces for them (if I interpret it correctly)? It looks like it's backtracking from existing symbols, not the strings themselves.

astrelsky commented 2 years ago

> St9type_info is matching exactly, N3abi17__class_type_infoE is a little different than yours along with N3abi20__si_class_type_infoE and N3abi21__vmi_class_type_infoE. They seem to differ by the string __cxx.

That would be why then. It is possible that the ABI isn't exactly the Itanium ABI (what gcc uses). If the data layout is the same, then you could force it to work, but I wouldn't know for sure without seeing the data. You can try patching those strings (quickest way) and you might get lucky. Alternatively, you may be able to do it by manually finding the corresponding vtables for the base RTTI classes and defining the correct symbols, but that is more work and you'd need to manually backtrack.
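
One practical point when patching: the names found in this ROM are shorter than the standard ones. `N3abi17__class_type_infoE` is 25 characters and `N10__cxxabiv117__class_type_infoE` is 33, so an in-place patch needs 8 spare bytes after each string. The following is a minimal sketch of a patch helper that refuses to overwrite anything but the old string and the NUL bytes directly after it. The function name is hypothetical, and treating those trailing NUL bytes as unused padding is an assumption that has to be checked against the actual data.

```python
def patch_cstring(image: bytearray, old: bytes, new: bytes) -> int:
    """Overwrite the first NUL-terminated `old` with `new`, in place.

    Returns the patched offset. Raises ValueError if `old` is missing
    or if `new` plus its terminator does not fit into `old` plus the
    run of NUL bytes that follows it.
    """
    start = image.find(old + b"\x00")
    if start == -1:
        raise ValueError(f"{old!r} not found")

    # Available room: the old string plus every NUL byte right after it.
    # ASSUMPTION: those NUL bytes are padding, not meaningful zero data.
    end = start + len(old)
    while end < len(image) and image[end] == 0:
        end += 1
    room = end - start

    if len(new) + 1 > room:
        raise ValueError(
            f"{new!r} needs {len(new) + 1} bytes, only {room} available"
        )

    # Same-length slice assignment, so the image size never changes.
    image[start:end] = new + b"\x00" * (room - len(new))
    return start
```

If the helper raises because there is no room, the string cannot be patched where it is. Defining the symbols by hand, as described above, avoids the size problem altogether.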

Guusggg commented 2 years ago

Thank you so much for the response!

I'll try patching them first. I already found the corresponding vtables, but I don't know how to make the scripts or your plugin understand them. Do you think those strings are the only ones I should patch?

I'll close this for now since I have some new suggestions to work with.

ghidra007 commented 2 years ago

> Are any of the following strings present in the ROM? I'm already almost certain they will be considering the presence of those strings there.
>
> "St9type_info", "N10__cxxabiv117__class_type_infoE", "N10__cxxabiv120__si_class_type_infoE", "N10__cxxabiv121__vmi_class_type_infoE"
>
> If the above strings are in the ROM, then it does indeed have RTTI, and I'd be more than happy to help diagnose my plugin with you. As for the RTTI scripts provided with Ghidra, it doesn't look like they search for the strings; they only look for symbols, which means they will only work if the program is not stripped.
>
> @ghidra007 do the scripts attempt to backtrack from those strings?

The script uses symbols: it looks for ones in the namespace "__cxxabiv1" whose names identify the special vtables. If it doesn't find them that way, it looks for the mangled symbols. The strings he is looking at are mangled but don't seem to be symbols, so that is one reason the script may not be finding them. It also sounds like various analyzers are not kicking off, so there may also be an issue with Ghidra recognizing the compiler or other important info about the program, which is why the demangler didn't work, for example. This is similar to what I was seeing with the mingw programs you shared. They are a mixed bag of Windows and gcc items, so Ghidra doesn't know what to do with them. (I'm not suggesting this is mingw, but possibly a mixed-bag case or just a case where Ghidra doesn't recognize everything.)

I would be curious to know what happens once symbols are created for the special vtables and the RecoverClassesFromRTTI script is then run. I'm guessing it won't get far if it isn't recognizing gcc. You might have to set the compiler type ahead of time, then run analysis, then run the script. If that still doesn't work, please paste the script output here so I know what it got hung up on. I definitely know the script wasn't tested on Nintendo DS ROMs. LOL.
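
For anyone creating those symbols by hand, the mangled symbol names can be derived from the typeinfo name string: under the Itanium ABI mangling, the vtable, typeinfo, typeinfo name and VTT symbols are the typeinfo name with a `_ZTV`, `_ZTI`, `_ZTS` or `_ZTT` prefix. A small sketch follows; the helper name is hypothetical, and which of these symbols the RecoverClassesFromRTTI script actually requires is not stated in this thread.

```python
# Itanium ABI prefixes for the special symbols attached to a class.
SPECIAL_PREFIXES = {
    "vtable": "_ZTV",
    "typeinfo": "_ZTI",
    "typeinfo name": "_ZTS",
    "VTT": "_ZTT",
}


def special_symbol_names(typeinfo_name: str) -> dict:
    """Map each kind of special symbol to its mangled name."""
    return {
        kind: prefix + typeinfo_name
        for kind, prefix in SPECIAL_PREFIXES.items()
    }
```

For `N10__cxxabiv117__class_type_infoE`, the `"vtable"` entry is `_ZTVN10__cxxabiv117__class_type_infoE`, the mangled form of `vtable for __cxxabiv1::__class_type_info`.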

ghidra007 commented 2 years ago

Also, if you are seeing the strings but no symbols then something isn't processing the string table that figures out where the corresponding symbols are and creates them. Are your strings in a long table? Is the import process splitting your program into appropriate memory blocks? NOTE: The symbols that correspond to the strings are not at the same address. Our importer usually figures out from the header information where they go and creates the symbols. It sounds like the format isn't completely understood so this is not happening in your case.

astrelsky commented 2 years ago

> Also, if you are seeing the strings but no symbols then something isn't processing the string table that figures out where the corresponding symbols are and creates them. Are your strings in a long table? Is the import process splitting your program into appropriate memory blocks? NOTE: The symbols that correspond to the strings are not at the same address. Our importer usually figures out from the header information where they go and creates the symbols. It sounds like the format isn't completely understood so this is not happening in your case.

Statically linked and stripped programs have no symbols and no imports. You can locate the special typeinfo classes, and thus their vtables, by backtracking from their typeinfo name value (the mangled symbol without the _ZTI, _ZTS, _ZTV or _ZTT prefix). The special typeinfo class instances will be in the program, just like type_info::RTTI_Type_Descriptor is in Windows programs built with Visual Studio.
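
The first backtracking step can be sketched on a raw image. In the Itanium ABI a `type_info` object is laid out as a vtable pointer followed by a pointer to its name string, so every pointer to the name string marks a candidate `type_info` object one pointer earlier. The sketch below assumes 32-bit little-endian pointers, 4-byte alignment, and a raw binary loaded contiguously at a 4-byte-aligned `base` address. Those assumptions fit the ARM:LE:32 target in this issue but are not verified here, and the function names are hypothetical.

```python
import struct


def find_pointer_refs(image: bytes, target: int) -> list:
    """File offsets of 4-byte-aligned little-endian words equal to `target`."""
    needle = struct.pack("<I", target)
    refs = []
    pos = image.find(needle)
    while pos != -1:
        if pos % 4 == 0:  # keep only aligned words
            refs.append(pos)
        pos = image.find(needle, pos + 1)
    return refs


def backtrack_typeinfo(image: bytes, base: int, typeinfo_name: bytes) -> list:
    """Candidate addresses of type_info objects whose name is `typeinfo_name`.

    Only the first occurrence of the name string is considered.
    """
    name_off = image.find(typeinfo_name + b"\x00")
    if name_off == -1:
        return []
    name_addr = base + name_off
    # type_info layout: [vtable pointer][name pointer]; the object starts
    # one pointer (4 bytes) before the reference to the name.
    return [
        base + ref - 4
        for ref in find_pointer_refs(image, name_addr)
        if ref >= 4
    ]
```

Each result is only a candidate and still needs checking in the listing. The vtable slot just before a vtable's address point holds the typeinfo pointer, so running `find_pointer_refs` again on a confirmed typeinfo address is one way to continue toward the vtable.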

ghidra007 commented 2 years ago

Since it wasn't working in your plugin either, and it wasn't recognized as gcc RTTI, I assumed that case wasn't working, so I took it to be a more general Ghidra problem.

astrelsky commented 2 years ago

> I assumed since it wasn't working in your plugin either and that it wasn't recognized as gcc rtti that that case wasn't working so assumed it was a more general Ghidra problem.

Oh :sweat_smile:

Guusggg commented 2 years ago

Thanks a lot for taking the time to write this out! Very much appreciated. I realize that Nintendo DS ROMs are definitely not a first target for tests, which is why I'm not too surprised it didn't work out... Thanks again!

ghidra007 commented 2 years ago

I'd like to look into the static stripped gcc case even if this other case was different, which I'm no longer sure it was. Is one of your examples static and stripped gcc? If so, which one?

Guusggg commented 2 years ago

Hi @ghidra007, I'm extremely sorry, as I want to provide more information but I have no clue which compiler was used. I do know that in my specific case, the Nintendo DS has a devkit which provides the compiler, so I think it would be easy to find out. I'm looking into it as we speak, to try to provide information about which compiler was used. (I imagine the chance of the compiler being GCC is very high, as it's not likely they created their own compiler.)

EDIT: It's very likely that GCC was used to compile, but I'm not able to find an exact or even close version...

astrelsky commented 2 years ago

> I'd like to look into the static stripped gcc case even if this other case was different which I'm not sure now if it was. Is one of your examples static and stripped gcc? If so, which one?

I did fix the samples in my tests. https://github.com/astrelsky/InheritanceTests/releases/download/4/InheritanceTests.tar.xz

x86_64/static has main and main_stripped. main is not stripped and has full DWARF debug information (-g3). I don't know what is going on here with Ghidra, but I see a lot of error bookmarks due to calls to NULL that I do not recall ever seeing before. I did just confirm that the programs are valid and run correctly, so I'm not sure what is going on there. You should be able to use these to test with though, since my analysis still seems to work.

ghidra007 commented 2 years ago

Thanks! I was also able to generate one and also noticed calls to null which I had never seen before. Will try to get someone to look into that.

ghidra007 commented 2 years ago

> Hi @ghidra007, I'm extremely sorry as I want to provide more information but I have no clue which compiler was used. I do know that in my specific case, the Nintendo DS has a devkit which provides the compiler so I think it would be easy to find out. I'm looking into it as we speak, to try to provide information about which compiler was used. (I imagine the chance of the compiler being GCC very high, as it's not likely they created their own compiler)
>
> EDIT: It's very likely that GCC was used to compile, but I'm not able to find an exact or even close version...

Thanks! No worries.