gimli-rs / object

A unified interface for reading and writing object file formats
https://docs.rs/object/
Apache License 2.0

Fail to parse specific section for relocations #728

Closed Evian-Zhang closed 2 months ago

Evian-Zhang commented 2 months ago

For ELF binaries, we can use Object::dynamic_relocations or ObjectSection::relocations to get values of the Relocation type. However, while Object::dynamic_relocations is meant to collect all relocations that appear in the binary, ObjectSection::relocations is not designed to treat a specific section's data as relocation entries. Instead, it uses RelocationSections::get in this line to find the relocation section that corresponds to the given section, and parses that section for relocation entries.

However, in some binaries RelocationSections is empty, while we can still get relocation entries through Object::dynamic_relocations. In such binaries, there is no way to get the relocation entries of a specific section.

Why I want this feature

I want to get the names of PLT entries in an ELF binary. Following the discussion in gimli-rs/object#227, I found that ddbug uses disassembly to find the GOT entry that corresponds to each PLT stub. However, as stated in this SO post, objdump assumes that PLT stubs and GOT entries appear in the same order. As a result, we can simply parse the GOT relocation entries and then use the offset to locate the corresponding PLT stubs, which is much more lightweight than the disassembly approach.

philipc commented 2 months ago

See if #729 meets your need. Note that even without that PR, I think you could still call Object::dynamic_relocations and compare the relocation address to the section you are interested in. This is possibly more reliable in general, since ObjectSection::dynamic_relocations requires sh_info to be set, and I haven't checked if all linkers do that (it may not be required because the dynamic loader doesn't use the sections).
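A minimal sketch of the address comparison suggested above, written against plain data so that it does not depend on any particular version of the object crate. The `DynReloc` type and the `relocs_in_section` helper are hypothetical names introduced for illustration; with the object crate, the addresses would come from the `(u64, Relocation)` pairs yielded by Object::dynamic_relocations, and the range from the section's address() and size().

```rust
/// Simplified stand-in for a dynamic relocation: the address it patches and
/// the name of the symbol it refers to. Hypothetical type, not part of `object`.
#[derive(Debug, Clone, PartialEq)]
struct DynReloc {
    address: u64,
    symbol: String,
}

/// Keep only the relocations whose target address lies inside the half-open
/// range [sec_addr, sec_addr + sec_size).
fn relocs_in_section(relocs: &[DynReloc], sec_addr: u64, sec_size: u64) -> Vec<DynReloc> {
    // Saturate so that a bogus size cannot overflow the end address.
    let sec_end = sec_addr.saturating_add(sec_size);
    relocs
        .iter()
        .filter(|r| r.address >= sec_addr && r.address < sec_end)
        .cloned()
        .collect()
}
```

For example, given relocations at 0x3fd8, 0x4018 and 0x4020 and a section that starts at 0x4000 with size 0x28, only the last two are returned.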

Evian-Zhang commented 2 months ago

Thank you for your advice. #729 does solve my problem:)

Evian-Zhang commented 2 months ago

Sorry for reopening this issue. After some investigation, I found that the solution does not solve my problem. The GOT entries in the .got section are relocated by relocation records in several sections, such as .rela.plt and .rela.dyn. However, only .rela.plt relocates the GOT entry that PLT stub references. So I need to determine whether a relocation comes from .rela.plt, and I cannot come up with a way to do that.

philipc commented 2 months ago

You're doing something that is very specific to ELF, so use the lower level ELF API to read the relocations in .rela.plt.

philipc commented 2 months ago

I guess another option would be to check the relocation type (such as R_X86_64_JUMP_SLOT).
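A rough sketch of what checking the relocation type could look like for a raw 64-bit ELF relocation entry. It assumes the standard Elf64 `r_info` layout (symbol index in the upper 32 bits, type in the lower 32 bits) and the x86-64 psABI value 7 for R_X86_64_JUMP_SLOT. The helper names `split_r_info64` and `is_jump_slot` are hypothetical and not part of the object crate.

```rust
/// Value of R_X86_64_JUMP_SLOT in the x86-64 psABI.
const R_X86_64_JUMP_SLOT: u32 = 7;

/// Split the `r_info` field of an Elf64 relocation entry into
/// (symbol index, relocation type).
fn split_r_info64(r_info: u64) -> (u32, u32) {
    ((r_info >> 32) as u32, (r_info & 0xffff_ffff) as u32)
}

/// True if the entry is a PLT jump-slot relocation on x86-64.
fn is_jump_slot(r_info: u64) -> bool {
    split_r_info64(r_info).1 == R_X86_64_JUMP_SLOT
}
```

Filtering on the type in this way avoids having to know which relocation section an entry came from, at the cost of being specific to one architecture.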

philipc commented 2 months ago

> However, only .rela.plt relocates the GOT entry that PLT stub references.

Looking some more, I don't think that's true. .rela.plt contains the relocations for .plt, but there are also PLT entries in .plt.got, and the relocations for those are mixed in with others in .rela.dyn. So you'll have to handle those anyway.

Evian-Zhang commented 2 months ago

OK, so I think this is a very specific question that isn't related to the object crate. Thank you for your advice. I will investigate more.

Evian-Zhang commented 2 months ago

For anyone interested in this question, the following is my implementation of retrieving PLT stub symbols (do not rely on it for robustness).

extend_plt_symbols_for_elf

```rust
use std::collections::HashMap;

use object::read::elf::{ElfFile, FileHeader, Rel, Rela, SectionHeader};
use object::{Object, ObjectSection, ObjectSymbol, ObjectSymbolTable};

// `Result` is assumed to be an alias (for example `anyhow::Result`) whose
// error type converts from `object::Error`.

fn extend_plt_symbols_for_elf<
    'data,
    Elf: FileHeader,
    R: object::ReadRef<'data>,
    const ELF_PLT_STUB_SIZE: usize,
>(
    symbol_map: &mut HashMap<u64, String>,
    elf: &ElfFile<'data, Elf, R>,
) -> Result<()> {
    let Some(plt_sec) = elf.section_by_name(".plt") else {
        return Ok(());
    };
    // The first entry is the resolver header, not a PLT stub.
    let plt_start = plt_sec.address() + ELF_PLT_STUB_SIZE as u64;
    let Some(plt_size) = (plt_sec.size() as usize).checked_sub(ELF_PLT_STUB_SIZE) else {
        return Ok(());
    };
    let plt_count = plt_size / ELF_PLT_STUB_SIZE;
    let Some(dyn_syms) = elf.dynamic_symbol_table() else {
        return Ok(());
    };

    // Relocations with addends (for example x86_64).
    if let Some(dyn_plt_sec) = elf.section_by_name(".rela.plt") {
        let relas = dyn_plt_sec
            .elf_section_header()
            .data_as_array::<Elf::Rela, _>(elf.endian(), elf.data())?;
        for (index, rela) in relas.iter().enumerate() {
            if index >= plt_count {
                break;
            }
            let Some(symbol_index) = rela.symbol(elf.endian(), false) else {
                continue;
            };
            let dyn_sym = dyn_syms.symbol_by_index(symbol_index)?;
            let Ok(symbol_name) = dyn_sym.name() else {
                continue;
            };
            // The n-th relocation is assumed to belong to the n-th stub.
            let plt_stub_address = plt_start + (index * ELF_PLT_STUB_SIZE) as u64;
            symbol_map.insert(plt_stub_address, symbol_name.to_string());
        }
        return Ok(());
    }

    // Relocations without addends (for example 32-bit x86).
    if let Some(dyn_plt_sec) = elf.section_by_name(".rel.plt") {
        let rels = dyn_plt_sec
            .elf_section_header()
            .data_as_array::<Elf::Rel, _>(elf.endian(), elf.data())?;
        for (index, rel) in rels.iter().enumerate() {
            if index >= plt_count {
                break;
            }
            let Some(symbol_index) = rel.symbol(elf.endian()) else {
                continue;
            };
            let dyn_sym = dyn_syms.symbol_by_index(symbol_index)?;
            let Ok(symbol_name) = dyn_sym.name() else {
                continue;
            };
            let plt_stub_address = plt_start + (index * ELF_PLT_STUB_SIZE) as u64;
            // Note: this branch appends "@PLT", unlike the `.rela.plt` branch above.
            symbol_map.insert(plt_stub_address, format!("{symbol_name}@PLT"));
        }
        return Ok(());
    }

    Ok(())
}
```

This roughly corresponds to objdump's general implementation in this function. However, there is a more specific function for handling PLT relocation symbols on x86_64, which is in this function. That solution needs disassembly and can handle more PLT categories such as .plt.sec, which is much more common in modern libc (even llvm-objdump cannot deal with it for now), but it is too complex for me to implement myself.
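The address arithmetic used in the implementation above can be checked in isolation. The following is only a sketch with a hypothetical helper name, `plt_stub_address`. It assumes the classic `.plt` layout, where one header entry is followed by equally sized stubs in the same order as the entries of `.rela.plt`. It does not cover `.plt.sec` or `.plt.got`.

```rust
/// Address of the PLT stub that matches the `index`-th PLT relocation,
/// assuming one header entry followed by stubs laid out in relocation order.
/// Returns `None` if the index falls outside the section or the sizes are invalid.
fn plt_stub_address(plt_addr: u64, plt_size: u64, stub_size: u64, index: u64) -> Option<u64> {
    if stub_size == 0 {
        return None;
    }
    // Number of real stubs, excluding the header entry.
    let stub_count = (plt_size / stub_size).checked_sub(1)?;
    if index >= stub_count {
        return None;
    }
    // Skip the header entry, then step over `index` stubs.
    let offset = index.checked_add(1)?.checked_mul(stub_size)?;
    plt_addr.checked_add(offset)
}
```

With a `.plt` at 0x1020 of size 0x40 and 16-byte stubs, there are three real stubs: index 0 maps to 0x1030, index 2 maps to 0x1050, and index 3 is rejected.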