Closed fmagin closed 2 years ago
Just started stepping through the code, it seems that the small method list should be handled already, so the problem is something different here. Edited the initial bug description to remove that.
Ah, this problem has two components:
Part 1 is that in ghidra.app.util.bin.format.objc2.ObjectiveC2_Class#readData the pointer to the class_rw_t is read simply with index = ObjectiveC1_Utilities.readNextIndex(reader, _state.is32bit); The value read into index isn't simply a pointer, but a pointer whose last two bits are flags, which must be cleared before the value is treated as a pointer. In this binary, the class _TtC22test_sit_fraunhofer_de9TestClass has one of those bits set, so the pointer is off by 2 bytes and must be fixed with index -= index & 0b11. I don't have a proper development setup currently, so I can only do this in the debugger right now, but when I do, I successfully get to the second part of the problem.
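A minimal sketch of the fix-up described above, assuming the raw value has already been read into a long. The class and method names are hypothetical and not part of Ghidra; the sample value is the literal from this binary.

```java
// Hypothetical helper, not part of Ghidra: clears the two low flag bits
// from the raw class_t data field before it is used as a pointer.
final class TaggedPointerSketch {
    static long clearLowFlagBits(long raw) {
        // Equivalent to the debugger fix: raw -= raw & 0b11
        return raw & ~0b11L;
    }

    public static void main(String[] args) {
        long raw = 0x10000c04aL; // literal value stored in the example binary
        System.out.println(Long.toHexString(clearLowFlagBits(raw))); // prints 10000c048
    }
}
```

Masking and subtracting the low bits give the same result; the mask form just avoids reading the value twice.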
In ghidra.app.util.bin.format.objc2.ObjectiveC2_Method#ObjectiveC2_Method, when isSmallList is set, the namePtr is read, but this is a pointer relative to the base address. I.e. 0x7C2A is read, but this needs to be 0x100007c2a. When this is fixed, the analysis finishes successfully and the function is added to the class as expected.
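The adjustment can be sketched as follows. The image base of 0x100000000 is an assumption inferred from the two values quoted above, and the names are hypothetical rather than Ghidra's.

```java
// Hypothetical helper, not part of Ghidra: turns the relative name pointer
// read from a small method list into an absolute address.
final class SmallMethodNameSketch {
    // Assumed image base, inferred from 0x7C2A -> 0x100007c2a in this report.
    static final long ASSUMED_IMAGE_BASE = 0x100000000L;

    static long resolveNamePtr(long imageBase, long relativeNamePtr) {
        return imageBase + relativeNamePtr;
    }

    public static void main(String[] args) {
        long resolved = resolveNamePtr(ASSUMED_IMAGE_BASE, 0x7C2AL);
        System.out.println(Long.toHexString(resolved)); // prints 100007c2a
    }
}
```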
This does successfully add the method to the class, but various xrefs and addresses are not correctly modelled/displayed afterwards:
At 10000c108, the value is treated as DAT_10000c04a, which is the literal value there, but this is the pointer where the last 2 bits are flags, and in reality it effectively points to the class_rw_t struct at 0x10000c048.
The method_list_t only indicates that this is an offset from this address, but doesn't seem to actually link it.
The class_rw_t object at 10000c048 doesn't seem to get properly displayed, possibly because 10000c04a is already defined as some kind of data, so the struct can't be applied?
@ryanmkurtz IMO this should also make it into the 10.1 release as this relates to the other MachO/iOS15 changes and AFAIK this isn't addressed in the 10.1-BETA yet. Is this on your radar yet?
It's next on my list!
@fmagin I have implemented the fixes for the first 2 main issues. Where did you learn that the lower 2 bits are flags? The header file?
The one source I remember is that another library that parses those data structures had it as a comment. Large (all?) parts of the ObjC runtime are open source though, and I vaguely recall there being some methods that check those flags, but I'm currently not on the computer with the IDE setup for that code.
After looking into objc-runtime-new.h more carefully, I think that the lower 3 bits are actually flags:
The fact that the class_rw_t pointer is combined with flags is "documented" here, and when looking at the definition of class_data_bits_t there are methods for splitting this into the actual class_rw_t* and the information that is encoded in the flags:
class_rw_t* data() const {
    return (class_rw_t *)(bits & FAST_DATA_MASK);
}
bool isAnySwift() {
    return isSwiftStable() || isSwiftLegacy();
}
bool isSwiftStable() {
    return getBit(FAST_IS_SWIFT_STABLE);
}
bool isSwiftLegacy() {
    return getBit(FAST_IS_SWIFT_LEGACY);
}
The third flag seems to be FAST_HAS_DEFAULT_RR, used here.
All the flags are defined near the top of the header file:
// class is a Swift class from the pre-stable Swift ABI
#define FAST_IS_SWIFT_LEGACY (1UL<<0)
// class is a Swift class from the stable Swift ABI
#define FAST_IS_SWIFT_STABLE (1UL<<1)
// class or superclass has default retain/release/autorelease/retainCount/
// _tryRetain/_isDeallocating/retainWeakReference/allowsWeakReference
#define FAST_HAS_DEFAULT_RR (1UL<<2)
// data pointer
#define FAST_DATA_MASK 0x00007ffffffffff8UL
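The accessors and flag definitions quoted above could be mirrored in Java roughly as below. This is a sketch only: the constants are copied from the header excerpt, while the class and method names are hypothetical and do not exist in Ghidra.

```java
// Hypothetical Java mirror of the class_data_bits_t accessors quoted above.
final class ClassDataBitsSketch {
    static final long FAST_IS_SWIFT_LEGACY = 1L << 0;
    static final long FAST_IS_SWIFT_STABLE = 1L << 1;
    static final long FAST_HAS_DEFAULT_RR = 1L << 2;
    static final long FAST_DATA_MASK = 0x00007ffffffffff8L;

    // Address of the class_rw_t with all flag bits stripped.
    static long dataPointer(long bits) {
        return bits & FAST_DATA_MASK;
    }

    static boolean isSwiftLegacy(long bits) {
        return (bits & FAST_IS_SWIFT_LEGACY) != 0;
    }

    static boolean isSwiftStable(long bits) {
        return (bits & FAST_IS_SWIFT_STABLE) != 0;
    }

    static boolean isAnySwift(long bits) {
        return isSwiftStable(bits) || isSwiftLegacy(bits);
    }

    static boolean hasDefaultRR(long bits) {
        return (bits & FAST_HAS_DEFAULT_RR) != 0;
    }
}
```

Applied to the value 0x10000c04a from the example binary, dataPointer yields 0x10000c048 and only the stable-Swift bit is set, which matches the Swift-mangled class name.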
Thanks, I came across this too and had the same thoughts but wasn't 100% sure... I'm glad you have the same opinion on it. I'll mask off 3 bits if it's 64-bit and 2 bits if it's 32-bit.
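A rough sketch of that bitness-dependent masking, assuming the raw value is held in a long. The names are hypothetical and this is not the actual Ghidra change.

```java
// Hypothetical helper, not part of Ghidra: strips 2 flag bits for 32-bit
// binaries and 3 flag bits for 64-bit binaries.
final class BitnessMaskSketch {
    static long stripFlags(long raw, boolean is32bit) {
        long flagBits = is32bit ? 0b11L : 0b111L;
        return raw & ~flagBits;
    }
}
```

For the 64-bit example value 0x10000c04a, stripFlags returns 0x10000c048, the same address obtained by clearing only two bits, since bit 2 is not set there.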
I am an Objective-C noob, but I'm starting to think patching our existing file isn't the right way to go in the long run. We should probably have a whole new set of files that parse objc4 (and 3?). Hopefully a lot of these changes are backwards compatible. Regardless, I'm putting these fixes into our existing files.
The question is IMO what "in the long run" means. ObjC is basically being replaced by Swift as far as I understand, so I don't expect that there will be many changes to the ObjC data structures in the long run. I have been dealing with various ObjC MachO aarch64 Binaries (i.e. iOS Apps) in the last 2 years, and so far Ghidra has worked remarkably well overall. This issue here is basically the first case where things really broke in an obvious way.
Ok, that's good to hear! I haven't worked in these files much before so I don't know the history of what's changed over the years.
BTW, the fixes I am putting in will address only the 2 main issues you found. The leftover issues/artifacts will be addressed later with some changes we are putting in to make pointers more flexible. We will use your attached binary as an example for testing that when the time comes.
Describe the bug
The ObjC2 class analyzer fails to recover any methods of an ObjC class if the MachO binary is built targeting iOS14 and above.
To Reproduce
class_t can be found by the analysis
_TtC22test_sit_fraunhofer_de9TestClass is created as a symbol, but has no associated methods
Expected behavior
Regular recovery of the methods, i.e. all methods are added to the class and an XRef is added from each of the relative pointers to its respective target.
Attachments
iOS15_example.zip
Environment (please complete the following information):