NationalSecurityAgency / ghidra

Ghidra is a software reverse engineering (SRE) framework
https://www.nsa.gov/ghidra
Apache License 2.0
51.12k stars 5.82k forks source link

ARM Type 0x2 Relocation References off by 8 bytes #2261

Closed astrelsky closed 4 years ago

astrelsky commented 4 years ago

Describe the bug The references produces by data to a type 0x2 relocation are off by +8 bytes on ARM.

Screenshots Capture

Environment (please complete the following information):

Additional context The reference to void::type_info should really be to __si_class_type_info. I've never seen so may references to a fundamental type_info before. I was baffled at seeing 649 references to it. The address of the external void::type_info linkage is 0x03ea725c and the address for _ZTVN10__cxxabiv120__si_class_type_infoE is 0x03ea7254

mumbel commented 4 years ago

R_ARM_ABS32 has a comment of /** (S + A) | T */, but has

int oldValue = memory.getInt(relocationAddress);
newValue = (int) (symbolValue + addend + oldValue);
if (isThumb) {
    newValue |= 1;
}

looks as simple as removing oldValue usage

astrelsky commented 4 years ago

R_ARM_ABS32 has a comment of /** (S + A) | T */, but has

int oldValue = memory.getInt(relocationAddress);
newValue = (int) (symbolValue + addend + oldValue);
if (isThumb) {
  newValue |= 1;
}

looks as simple as removing oldValue usage

I'll give this a shot tonight. Hopefully that's all it is.

That appears to have been it.

ghidra1 commented 4 years ago

Is the relocation in question a REL or RELA ?

astrelsky commented 4 years ago

Is the relocation in question a REL or RELA ?

I don't know. I have confirmed that the linked pull request corrected the problem.

ghidra1 commented 4 years ago

@astrelsky: if the case in question is a RELA and the relocation address contained non-zero data your fix would correct this. I have seen similar situations with bogus data at the relocation address for a RELA. Good practice is to obtain the addend in one of two ways if both REL and RELA are supported - not assume that the existing data will be 0 for the RELA case. I personally prefer to avoid the technique employed by the ARM relocation handler which makes the bad assumption that oldValue will be 0 for RELA cases. Use of the method elfRelocationContext.extractAddend() can help with this choice.

mumbel commented 4 years ago

Are you thinking it would be better if long addend = relocation.getAddend(); // will be 0 for REL case be replaced with maybe something like: elfRelocationContext.extractAddend() ? relocation.getAddend() : 0 or do think relocation by relocation check if the REL/RELA situation?

ghidra1 commented 4 years ago

I will probably fix the R_ARM_ABS32 relocation within my current branch adopting this strategy. Although I will check binutils before doing so.

astrelsky commented 4 years ago

I will close the pr then. I don't know much about relocations and I struggle to interpret the binutils HOWTO macro. I've tried implementing support for custom relocations in a relocation handler before but to no avail.

ghidra1 commented 4 years ago

The relocation will always return 0 for the REL case. For the REL case, the relocation handler must handle addend extraction from the relocation address on a case-by-case basis since size and sign-extension rules may vary.

ghidra1 commented 4 years ago

Generically speaking, a relocation type may support REL, RELA or both. I have found determining which of these three cases applies to a given relocation type can be difficult, and the special addend extraction rules which may apply to the REL case.

ghidra1 commented 4 years ago

@astrelsky, although a moot point, if you try the PR you just closed with the object module mentioned in #2276 which contains REL type R_ARM_ABS32 relocations, these rely on the oldValue as an addend to produce the correct result.

ghidra1 commented 4 years ago

Although ticket closed with my commit, could you please verify that the current master works as expected

astrelsky commented 4 years ago

@ghidra1 unfortunately it is not correct. It is still off by +8.

ghidra1 commented 4 years ago

Can you verify that it is using a REL relocation. The relocation sections get marked-up which should help you identify the relocation record. This may be a form of pre-linking which may need to be handled special.

ghidra1 commented 4 years ago

Unfortunately, I may need the binary to properly diagnose.

astrelsky commented 4 years ago

Can you verify that it is using a REL relocation. The relocation sections get marked-up which should help you identify the relocation record. This may be a form of pre-linking which may need to be handled special.

It is indeed Elf32_Rel.

Unfortunately, I may need the binary to properly diagnose.

I don't own the rights to this so I cannot do so. However, this should be reproducible by compiling something with rtti using the android ndk. I'll see if I can whip something up.

astrelsky commented 4 years ago

@ghidra1 I didn't need to use the NDK. I actually had something else on hand which is mine to reproduce it. At address 0x00035bdc the fixed-up data points to 0x00037104 but it should point to 0x000370fc

main.zip

ghidra1 commented 4 years ago

In your main sample I don't think it is a relocation issue. I believe the issue here is that we do not handle external data very well (i.e., symbol placed in fabricated EXTERNAL block). The EXTERNAL block work pretty well to facilitate GOT/PLT thunks to external functions but not not handle sized data at all. Allocation of symbols within the EXTERNAL block do not factor in size or datatype. While it the hope was to improve EXTERNAL data, it is quite problematic.

astrelsky commented 4 years ago

In your main sample I don't think it is a relocation issue. I believe the issue here is that we do not handle external data very well (i.e., symbol placed in fabricated EXTERNAL block). The EXTERNAL block work pretty well to facilitate GOT/PLT thunks to external functions but not not handle sized data at all. Allocation of symbols within the EXTERNAL block do not factor in size or datatype. While it the hope was to improve EXTERNAL data, it is quite problematic.

Handling external data is quite the problem. I've been inserting the external data into a project archive in order to handle the missing information. It was the only fool-proof way I could think of to handle it.

Anyway, the actual missing data itself isn't an issue and I've improved my analysis to disregard the "mis-pointing" pointer. I do think it would be beneficial for it to at least point to the correct EXTERNAL symbol. It would allow a user who is manually analyzing data to know what it is pointing to instead of being mislead.

ghidra1 commented 4 years ago

I believe the relocated pointer it is correctly offset by 8-bytes from the relocation symbol into the associated data structure (which is absent). This likely corresponds to a structure member and not the head of the structure. Unfortunately, with EXTERNAL data improperly sized and allocated, the relocated pointer references a completely unrelated symbol in the EXTERNAL block. Symbols in the EXTERNAL block are allocated back-to-back as they are encountered during the import. For ARM these EXTERNAL symbols are spaced arbitrarily 4-bytes apart. I have examined code for quite a few linkers/loader implementations for treatment of R_ARM_ABS32 and they all add the symbol value/offset to the data stored at the relocation address.

ghidra1 commented 4 years ago

Since the EXTERNAL data symbol size is not specified by the corresponding ELF Symbol entry there is no way to know how much space to allocate for it at the time of import. It is the dynamic linker's job to relocate using the correct symbol address within the process memory space. As mentioned previously, we have yet to approximate this behavior.

ghidra1 commented 4 years ago

At a minimum, I could change the code to detect this situation and place a bad bookmark at the relocation address.

astrelsky commented 4 years ago

At a minimum, I could change the code to detect this situation and place a bad bookmark at the relocation address.

What is creating the symbol for the external data? Wouldn't it be preferred to create the external location when processing the relocation that needs the symbol? That would assure it is put in the right place. Of course if all else fails then a bookmark should suffice.

ghidra1 commented 4 years ago

The EXTERNAL location is created during symbol processing (see ElfProgramBuilder.processSymbols). Relocation processing uses the symbol address established during this processing. Even if we were to delay until relocation processing I don't think we would have much more information to help.

I also noticed the symbols have a ELF Symbol Type (7) I am not familiar with. I have seen Android source define STT_NUM as both 5 and 7 in different code bases, although STT_COMMON generally corresponds to 5. I don't know if this symbol type (7) would impact relocation processing.

astrelsky commented 4 years ago

The EXTERNAL location is created during symbol processing (see ElfProgramBuilder.processSymbols). Relocation processing uses the symbol address established during this processing. Even if we were to delay until relocation processing I don't think we would have much more information to help.

I also noticed the symbols have a ELF Symbol Type (7) I am not familiar with. I have seen Android source define STT_NUM as both 5 and 7 in different code bases, although STT_COMMON generally corresponds to 5. I don't know if this symbol type (7) would impact relocation processing.

The program I sent was compiled with arm-linux-gnueabi-g++ so I don't think there should be anything special going on. This problem is only arm specific but I can double check. I do build that program for various other architectures and only recall seeing it on arm. This is the source for that program https://github.com/astrelsky/InheritanceTests (maybe outdated).

astrelsky commented 4 years ago

Actually on second thought I think it does this on every architecture and I just never looked into it. Probably forgot while debugging which is why I reported this.

ghidra1 commented 4 years ago

The +8 offset for each of the typeinfo structures appears to be consistent with this description found here:

`typeinfo for'BaseClass
 dd offset `vtable for'__cxxabiv1::__si_class_type_info+8
 dd offset `typeinfo name for'BaseClass
ghidra1 commented 4 years ago

I am rather stumped and have been for a long time how best to handle external data symbols and related pointers such as the one in question here. In any case, the relocation was processed properly, it is just that the data symbols in the EXTERNAL block are pretty useless.

astrelsky commented 4 years ago

The +8 offset for each of the typeinfo structures appears to be consistent with this description found here:

`typeinfo for'BaseClass
 dd offset `vtable for'__cxxabiv1::__si_class_type_info+8
 dd offset `typeinfo name for'BaseClass

That +8 is the offset from the top of __class_type_info::vtable to the start of the virtual function table. That is where it would point to if it was linked statically. I may have just been confused because the symbol name for the relocation is for the mangled __vmi_class_type_info::vtable which would be at the top of the vbtable instead of the vftable.

Either way I'm saving that pdf. I was never able to find solid documentation on the gcc rtti and just figured it out as I went along.

ghidra1 commented 4 years ago

Not sure where you to take this ticket at this point. This is a know limitation of the EXTERNAL block and the handling of external data.

astrelsky commented 4 years ago

Not sure where you to take this ticket at this point. This is a know limitation of the EXTERNAL block and the handling of external data.

That's fine. This can just be disregarded as I just misinterpreted what was occurring.

ghidra1 commented 3 years ago

I have made a change for Ghidra 10.0 (commit 00ba983a42090806c9f1617de42fe77eea675c88) which slightly changes how EXTERNAL block data relocations are applied and places error bookmarks to flag such invalid fixups. When the condition is detected the addend is not applied which will alter the referenced location, albeit still wrong. The hope is to avoid referencing an irrelevant external symbol and draw attention to the condition.

astrelsky commented 3 years ago

I have made a change for Ghidra 10.0 (commit 00ba983a42090806c9f1617de42fe77eea675c88) which slightly changes how EXTERNAL block data relocations are applied and places error bookmarks to flag such invalid fixups. When the condition is detected the addend is not applied which will alter the referenced location, albeit still wrong. The hope is to avoid referencing an irrelevant external symbol and draw attention to the condition.

This makes a bit more sense to do. Would it be possible to somehow show the referenced location to be "address + addend" instead of showing the symbol there? It would be leaning a bit closer to correct. Double clicking it would still yield the wrong location but it may help further grasp what the target should be. Maybe a special ExternalOffcutReference is in order?

ghidra1 commented 3 years ago

I considered an Memory Offset Reference (which gets very little use), however that would have the same net-affect as it was previously plus the importer does not attempt to characterize the fixed-up location with such markup. The bookmark does indicate the + which could be transposed to a post-comment. The primary issue is there is not a valid set of bytes we can lay-down at the fixup location given the manner in which the data symbol is defined in the EXTERNAL block.

The change has been pushed out to GitHub if you would like to try it.