NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0

ObjC Methods of Classes are not recovered in iOS Binaries targeting iOS14 and above #3611

Closed fmagin closed 2 years ago

fmagin commented 2 years ago

Describe the bug

The ObjC2 class analyzer fails to recover any methods of an ObjC class if the MachO binary is built targeting iOS14 and above.

To Reproduce

Expected behavior: Regular recovery of the methods, i.e. all methods are added to the class and an XRef is added from each of the relative pointers to its respective target.

Attachments iOS15_example.zip

Environment (please complete the following information):

fmagin commented 2 years ago

I just started stepping through the code. It seems that small method lists are already handled, so the problem here is something different. I edited the initial bug description to remove that.

fmagin commented 2 years ago

Ah, this problem has two components:

Class Pointers use the lower 2 bits as flags

Part 1 is that in ghidra.app.util.bin.format.objc2.ObjectiveC2_Class#readData the pointer to the class_rw_t is read simply with index = ObjectiveC1_Utilities.readNextIndex(reader, _state.is32bit);. This value that is read into index isn't simply a pointer, but a pointer where the last two bits are flags, which must be cleared before treating this value as a pointer. In this binary, the class _TtC22test_sit_fraunhofer_de9TestClass has one of those bits set, and thus the pointer is wrong by 2 bytes and must be fixed with index -= index & 0b11. I don't have a proper development setup currently so I can only do this in the debugger right now, but when I do this I successfully get to the second part of the problem.
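As a rough illustration of the fix described above, the following sketch clears the two low flag bits of a raw value before using it as an address. The class and method names are hypothetical and not part of Ghidra, and the raw value is made up (the comment does not give the actual one). It also shows that the debugger fix `index -= index & 0b11` and a plain bit mask give the same result.

```java
// Sketch only: ClassDataPointer and clearLow2Bits are hypothetical names,
// not Ghidra API. The raw value below is invented for illustration.
final class ClassDataPointer {

    /** Clears the two low flag bits, leaving the aligned class_rw_t address. */
    static long clearLow2Bits(long raw) {
        return raw & ~0b11L;
    }

    public static void main(String[] args) {
        long raw = 0x10000c0a2L;               // bit 1 set, so off by 2 bytes
        long viaSubtract = raw - (raw & 0b11); // same as: index -= index & 0b11
        long viaMask = clearLow2Bits(raw);
        // prints: 0x10000c0a0 0x10000c0a0
        System.out.printf("0x%x 0x%x%n", viaSubtract, viaMask);
    }
}
```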

Pointer to Selector Name in small Method list is treated as absolute, but is relative to image base

In ghidra.app.util.bin.format.objc2.ObjectiveC2_Method#ObjectiveC2_Method, when isSmallList is set, the namePtr is read, but this is a pointer relative to the image base address. I.e., 0x7C2A is read, but this needs to be 0x100007c2a. When this is fixed, the analysis finishes successfully and the function is added to the class as expected.
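The arithmetic behind that correction can be sketched as follows. This follows the description in this comment (name pointer relative to the image base) and is not a statement of how Ghidra implements it. The names are hypothetical, and the image base 0x100000000 is inferred from the two values quoted above.

```java
// Sketch only: SmallMethodName and toAbsolute are hypothetical names.
final class SmallMethodName {

    /** Converts an image-base-relative name pointer into an absolute address. */
    static long toAbsolute(long imageBase, long relativeNamePtr) {
        return imageBase + relativeNamePtr;
    }

    public static void main(String[] args) {
        long imageBase = 0x100000000L; // inferred: 0x100007c2a - 0x7c2a
        long namePtr = 0x7C2AL;        // value read from the small method list
        // prints: 0x100007c2a
        System.out.printf("0x%x%n", toAbsolute(imageBase, namePtr));
    }
}
```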

Leftover issues/artifacts

This does successfully add the method to the class, but various xrefs and addresses are not correctly modelled/displayed afterwards:

fmagin commented 2 years ago

@ryanmkurtz IMO this should also make it into the 10.1 release as this relates to the other MachO/iOS15 changes and AFAIK this isn't addressed in the 10.1-BETA yet. Is this on your radar yet?

ryanmkurtz commented 2 years ago

It's next on my list!

ryanmkurtz commented 2 years ago

@fmagin I have implemented the fixes for the first 2 main issues. Where did you learn that the lower 2 bits are flags? The header file?

fmagin commented 2 years ago

The one source I remember is that another library that parses those data structures had it as a comment. Large (all?) parts of the ObjC runtime are open source though, and I vaguely recall there being some methods that check those flags, but I'm currently not on the computer with the IDE setup for that code

fmagin commented 2 years ago

After looking into objc-runtime-new.h more carefully, I think that the lower 3 bits are actually flags. The fact that the class_rw_t pointer is combined with flags is "documented" here

and when looking at the definition of class_data_bits_t, there are methods for splitting this into the actual class_rw_t* and the information that is encoded in the flags:

    class_rw_t* data() const {
        return (class_rw_t *)(bits & FAST_DATA_MASK);
    }

    bool isAnySwift() {
        return isSwiftStable() || isSwiftLegacy();
    }

    bool isSwiftStable() {
        return getBit(FAST_IS_SWIFT_STABLE);
    }

    bool isSwiftLegacy() {
        return getBit(FAST_IS_SWIFT_LEGACY);
    }

The third flag seems to be FAST_HAS_DEFAULT_RR, used here

All the flags are defined near the top of the header file:

    // class is a Swift class from the pre-stable Swift ABI
    #define FAST_IS_SWIFT_LEGACY    (1UL<<0)
    // class is a Swift class from the stable Swift ABI
    #define FAST_IS_SWIFT_STABLE    (1UL<<1)
    // class or superclass has default retain/release/autorelease/retainCount/
    //   _tryRetain/_isDeallocating/retainWeakReference/allowsWeakReference
    #define FAST_HAS_DEFAULT_RR     (1UL<<2)
    // data pointer
    #define FAST_DATA_MASK          0x00007ffffffffff8UL
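
To make the split between pointer and flags concrete, here is a minimal Java transcription of the constants and accessors quoted above. The constant values are taken from the header excerpt. The class name ClassDataBits and the choice to pass the raw bits as a long are assumptions made for this sketch, and the sample value is invented.

```java
// Sketch only: a Java mirror of the quoted class_data_bits_t accessors.
// ClassDataBits is a hypothetical name, not Ghidra or objc4 API.
final class ClassDataBits {
    static final long FAST_IS_SWIFT_LEGACY = 1L << 0;
    static final long FAST_IS_SWIFT_STABLE = 1L << 1;
    static final long FAST_HAS_DEFAULT_RR  = 1L << 2;
    static final long FAST_DATA_MASK       = 0x00007ffffffffff8L;

    /** Address of the class_rw_t with all flag bits removed. */
    static long data(long bits) {
        return bits & FAST_DATA_MASK;
    }

    static boolean isSwiftLegacy(long bits) {
        return (bits & FAST_IS_SWIFT_LEGACY) != 0;
    }

    static boolean isSwiftStable(long bits) {
        return (bits & FAST_IS_SWIFT_STABLE) != 0;
    }

    static boolean isAnySwift(long bits) {
        return isSwiftStable(bits) || isSwiftLegacy(bits);
    }

    static boolean hasDefaultRR(long bits) {
        return (bits & FAST_HAS_DEFAULT_RR) != 0;
    }

    public static void main(String[] args) {
        long bits = 0x10000c0a2L; // invented value with the stable-Swift bit set
        System.out.printf("data=0x%x swiftStable=%b defaultRR=%b%n",
                data(bits), isSwiftStable(bits), hasDefaultRR(bits));
    }
}
```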

ryanmkurtz commented 2 years ago

Thanks, I came across this too and had the same thoughts but wasn't 100% sure... I'm glad you have the same opinion on it. I'll mask off 3 bits if it's 64-bit and 2 bits if it's 32-bit.
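A minimal sketch of that width-dependent masking might look like this. It is not the actual Ghidra change; the class and method names are hypothetical, and the sample values are invented.

```java
// Sketch only: ClassRwPointer and stripFlags are hypothetical names.
final class ClassRwPointer {

    /** Masks off 2 low bits for 32-bit binaries and 3 low bits for 64-bit ones. */
    static long stripFlags(long raw, boolean is32bit) {
        long flagMask = is32bit ? 0b11L : 0b111L;
        return raw & ~flagMask;
    }

    public static void main(String[] args) {
        // prints: 0xc0a4 0x10000c0a0
        System.out.printf("0x%x 0x%x%n",
                stripFlags(0xc0a6L, true),
                stripFlags(0x10000c0a6L, false));
    }
}
```

Unlike FAST_DATA_MASK in the header, this sketch clears only the low bits and leaves the upper bits of the value untouched.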

ryanmkurtz commented 2 years ago

I am an Objective-C noob, but I'm starting to think patching our existing file isn't the right way to go in the long run. We should probably have a whole new set of files that parse objc4 (and 3?). Hopefully a lot of these changes are backwards compatible. Regardless, I'm putting these fixes into our existing files.

fmagin commented 2 years ago

The question is IMO what "in the long run" means. ObjC is basically being replaced by Swift as far as I understand, so I don't expect that there will be many changes to the ObjC data structures in the long run. I have been dealing with various ObjC MachO aarch64 Binaries (i.e. iOS Apps) in the last 2 years, and so far Ghidra has worked remarkably well overall. This issue here is basically the first case where things really broke in an obvious way.

ryanmkurtz commented 2 years ago

Ok, that's good to hear! I haven't worked in these files much before so I don't know the history of what's changed over the years.

BTW, the fixes I am putting in will address only the 2 main issues you found. The Leftover issues/artifacts will be addressed later with some changes we are putting in to make pointers more flexible. We will use your attached binary as an example for testing that when the time comes.