smx-smx / xzre

XZ backdoor reverse engineering
https://smx-smx.github.io/xzre/
GNU General Public License v3.0

Sharing project files #3

Open ItzSwirlz opened 3 months ago

ItzSwirlz commented 3 months ago

Would it be possible to share Ghidra (or other RE tool) project files in the repo? That way, as new information is found and more things are labeled, the same labels would be available in other tools, which makes things easier to assess.

smx-smx commented 3 months ago

I'm more in favour of making something that can be imported back into Ghidra/IDA, rather than putting the full project files in the repository (unless there's a good reason to do that).

For labelling symbols, we can make a Ghidra/IDA script that renames the obfuscated names, using xzre.lds as the source of names. For types, we can make a version of xzre.h that can be imported (in case the current one can't be, due to external headers).

Do you think this could suffice?

An alternative could be to have some sort of human-readable form of the project (like a Ghidra XML export) to ease versioning, but at that point I'd rather go with the source+scripts route. The benefit of source+scripts is that modifications to the source code reflect back into the project, without having to maintain separate code and project files that need to be manually synced.

ItzSwirlz commented 3 months ago

Yeah, I think that'd be cool. Something that could at least be imported back into those tools would actually be better than the project files. Not sure what I was thinking with that. I think it's possible to parse a C header into Ghidra, but I don't know how that works. I've never tried it myself.

smx-smx commented 2 months ago

656c2174f4a32be1ef4ac3c75c5f0f0cc3027750 adds a "slim header" that will be written to xzre.h in the build directory. It doesn't have external dependencies, so it should be loadable.

f84ef2ce9e725f9276363ba9dbddc56f9083f5f0 adds a CSV output in the form "name,section". However, it doesn't handle sections that have multiple entries, and in any case it still lacks a script to load it (if there is an existing format that could be used without writing a loader, let me know).
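As a rough starting point for such a loader, here is a minimal Python sketch that reads "name,section" rows and groups the names by section, so sections with multiple entries show up explicitly instead of being overwritten. It assumes the CSV has no header row and exactly two fields per line; the function names and the sample rows are made up for illustration and are not taken from the repository.

```python
import csv
import io
from collections import defaultdict


def load_name_sections(csv_text):
    """Group symbol names by section from "name,section" rows.

    Assumption: no header row, exactly two fields per row.
    """
    by_section = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        name, section = (field.strip() for field in row)
        by_section[section].append(name)
    return dict(by_section)


def ambiguous_sections(by_section):
    """Return only the sections that carry more than one name."""
    return {sec: names for sec, names in by_section.items() if len(names) > 1}


if __name__ == "__main__":
    # Hypothetical rows, only to show the shape of the data.
    sample = (
        "func_a,.text.example_a\n"
        "func_b,.text.example_b\n"
        "func_c,.text.example_b\n"
    )
    grouped = load_name_sections(sample)
    print(grouped)
    print(ambiguous_sections(grouped))
```

Keeping a list per section means a later renaming script can apply the unambiguous entries directly and leave the multi-entry sections for manual review.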

An option that crossed my mind was to replace it with something that pulls the names from the compiled xzre binary, since each renamed symbol will appear with two names at the same location, like this:

# readelf -sW xzre| grep 000000000000a180
    56: 000000000000a180   240 FUNC    LOCAL  HIDDEN    24 .Llzma2_encoder_init.1
   293: 000000000000a180     0 NOTYPE  GLOBAL DEFAULT   24 find_function

However, this wouldn't work for the few symbols that lack a name.
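The pairing idea above could be sketched roughly as follows in Python: parse the text printed by `readelf -sW` and, for each (section index, address) pair, map the LOCAL name to the GLOBAL name found at the same location. This assumes the obfuscated name is always the LOCAL symbol and the readable one is the GLOBAL symbol, as in the example output; when several GLOBAL names share a location, it simply takes the first one. `pair_symbols` is a hypothetical helper name, not something that exists in the repository.

```python
import re
from collections import defaultdict

# One row of a `readelf -sW` symbol table:
#   Num: Value Size Type Bind Vis Ndx Name
SYMBOL_ROW = re.compile(
    r"^\s*\d+:\s+(?P<addr>[0-9a-fA-F]+)\s+(?P<size>\S+)\s+(?P<type>\S+)\s+"
    r"(?P<bind>\S+)\s+(?P<vis>\S+)\s+(?P<ndx>\S+)\s+(?P<name>\S+)\s*$"
)


def pair_symbols(readelf_output):
    """Map LOCAL symbol names to the GLOBAL name at the same location."""
    by_location = defaultdict(lambda: {"LOCAL": [], "GLOBAL": []})
    for line in readelf_output.splitlines():
        m = SYMBOL_ROW.match(line)
        if m is None:
            continue  # header rows, blank lines, symbols without a name
        if not m["ndx"].isdigit():
            continue  # UND/ABS entries have no meaningful location
        if m["bind"] in ("LOCAL", "GLOBAL"):
            key = (int(m["ndx"]), int(m["addr"], 16))
            by_location[key][m["bind"]].append(m["name"])

    renames = {}
    for names in by_location.values():
        if not names["GLOBAL"]:
            continue
        for local_name in names["LOCAL"]:
            # Simplification: take the first GLOBAL name at this location.
            renames[local_name] = names["GLOBAL"][0]
    return renames


if __name__ == "__main__":
    # The two rows quoted in this comment.
    sample = (
        "    56: 000000000000a180   240 FUNC    LOCAL  HIDDEN    24 .Llzma2_encoder_init.1\n"
        "   293: 000000000000a180     0 NOTYPE  GLOBAL DEFAULT   24 find_function\n"
    )
    print(pair_symbols(sample))
    # prints {'.Llzma2_encoder_init.1': 'find_function'}
```

Rows without a name never match the pattern, so those symbols are skipped rather than paired, which is the same limitation noted above.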

An alternative is to load the resulting xzre binary in Ghidra in place of the object file or the liblzma shared library. This, however, makes it more tedious to cross-reference things between the mock binary and the real binary.

In other words, it needs to be worked on further.