Open RyanGlScott opened 1 year ago
Reading the ELF spec for `DT_RELA`/`DT_RELASZ`, it seems like the interpretation in the 32-bit binary is the expected one? As in, `RELASZ` should be the sum of the sizes of all entries (possibly from multiple sections) that participate in creating the relocation table.
But indeed, if there is a decent way of knowing the number of entries / the relevant sections, we may be able to fix this on the fly, though that's a bit annoying.
Some searching (see here) suggests that the range of memory `[DT_RELA, DT_RELA + DT_RELASZ)` and the range `[DT_JMPREL, DT_JMPREL + DT_PLTRELSZ)` are allowed to overlap, but that isn't guaranteed to be the case. (`DT_RELA`/`DT_RELASZ` correspond to the `.rela.dyn` section, and `DT_JMPREL`/`DT_PLTRELSZ` correspond to the `.rela.plt` section.)
> if there is a decent way of knowing the number of entries / the relevant sections
Annoyingly, I'm not sure how you'd even know the number of entries without reading the contents of the table first, which is the very thing that I'm trying to do. I'd have to take a look at how `readelf` itself does this. If I had to guess, it is likely using the section boundaries of known `.rela` sections to compute the actual size of the relocation table, but perhaps there is a more direct way to accomplish this.
Reading the `glibc` source code reveals a way forward:
/* On some machines, notably SPARC, DT_REL* includes DT_JMPREL in its
range. Note that according to the ELF spec, this is completely legal!
We are guaranteed that we have one of three situations. Either DT_JMPREL
comes immediately after DT_REL*, or there is overlap and DT_JMPREL
consumes precisely the very end of the DT_REL*, or DT_JMPREL and DT_REL*
are completely separate and there is a gap between them. */
That is to say, the implementation of `dynRelaBuffer` must proceed by cases:
- If `DT_RELA + DT_RELASZ == DT_JMPREL`, then `DT_JMPREL` comes immediately after `DT_RELA`. Read the range `[DT_RELA, DT_JMPREL + DT_PLTRELSZ)`.
- If `DT_RELA + DT_RELASZ == DT_JMPREL + DT_PLTRELSZ`, then the ranges overlap. Read the range `[DT_RELA, DT_RELA + DT_RELASZ)`.
- Otherwise, read the ranges `[DT_RELA, DT_RELA + DT_RELASZ)` and `[DT_JMPREL, DT_JMPREL + DT_PLTRELSZ)` separately, then combine them.

We would need to do something similar in `dynRelBuffer` as well (but replace "`RELA`" with "`REL`").
We should check to see what the behavior of `macaw`'s `pltStubSymbols` function is after such a change. Currently, it computes PLT stubs from the relocations returned by `dynRelaEntries` (which is defined in terms of `dynRelaBuffer`) and `dynPLTRel` (which looks in the range `[DT_JMPREL, DT_JMPREL + DT_PLTRELSZ)`). The code assumes that these two regions of memory are disjoint, but because of the overlapping behavior observed above, this is not necessarily the case.
My guess is that `pltStubSymbols` will continue to work either way because it is computing a `Map`, and inserting duplicate relocation addresses into the `Map` isn't observably different from inserting them without duplicates. Nevertheless, we should avoid needlessly inserting duplicate entries if we can, if for no other reason than efficiency.
See https://github.com/GaloisInc/macaw/issues/416 for a similar issue on the `macaw` side.
While adding a test case for #35, I discovered that `elf-edit`'s `decodeRelaEntries` function will return different results for PPC32 binaries versus PPC64 binaries. To pick a concrete example, let's look at this simple C program:

I used a `musl`-based PPC32 cross-compiler (obtained from here) and a PPC64 cross-compiler (obtained from here) to compile this program into a PPC32 binary named `ppc32-relocs.elf` and a PPC64 binary named `ppc64-relocs.elf`, respectively. I can use `readelf -r` to determine the relocations contained in each binary's RELA relocation table:

Note that each binary's RELA relocation table is divided into two sections, `.rela.dyn` and `.rela.plt`. This will be important later.

If I use `elf-edit`'s `decodeRelaEntries` function on `ppc32-relocs.elf`, it will return all of the relocations that `readelf -r` reports. On the other hand, if I use `decodeRelaEntries` on `ppc64-relocs.elf`, it will only report a subset of the relocations:

Notably, all of the `R_PPC64_JMP_SLOT` relocations (contained exclusively within the `.rela.plt` section) are absent!

The reason this happens is that each binary ascribes different semantics to the `RELASZ` tag. For example, let's look at the `readelf -d ppc32-relocs.elf` output:

`elf-edit` determines what part of the binary corresponds to the RELA relocation table by:
- Starting at the address given by `RELA` (`0x2ec`), and
- Reading the number of bytes given by `RELASZ` (`252`)

In the `ppc32-relocs.elf` example, this works beautifully. There are 17 entries in the `.rela.dyn` section and 4 entries in the `.rela.plt` section for a total of 21 entries overall in the relocation table. `RELAENT` tells us that each entry is 12 bytes in size, and 21 * 12 = 252, which is exactly the value of `RELASZ`.

Things get stranger with `ppc64-relocs.elf`, however:

Here, we have a `RELASZ` of 192 bytes. There are 8 entries in the `.rela.dyn` section and 4 entries in the `.rela.plt` section for a total of 12 entries overall in the relocation table. Moreover, `RELAENT` is 24 bytes. But note that 12 * 24 = 288, which exceeds the value of `RELASZ`! In this particular example, `RELASZ` only covers the size of the `.rela.dyn` section, and it does not cover anything in the `.rela.plt` section, which explains why all of the relocations from the `.rela.plt` section were omitted. (If you add `RELASZ` to `PLTRELSZ`, the latter being the size of the `.rela.plt` section, then you do in fact get 288 bytes.)

What should we do here? The cross-compilers I am using for PPC32 and PPC64 appear to ascribe different semantics to the `RELASZ` tag, which makes it questionable whether that is a reliable way to gauge the overall size of the RELA relocation table. Perhaps we should instead count the number of table entries and multiply it by `RELAENT`?
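The size arithmetic above can be checked mechanically. The sketch below is a Python illustration, not `elf-edit` code: `entry_count` and `decode_rela` are hypothetical helpers, and the big-endian layouts are an assumption about these PowerPC targets. The field order (`r_offset`, `r_info`, `r_addend`) follows the standard `Elf32_Rela` and `Elf64_Rela` structures, whose sizes match the `RELAENT` values of 12 and 24 bytes.

```python
import struct
from typing import List, Tuple

# r_offset, r_info, r_addend; big-endian is an assumption about these targets.
ELF32_RELA = struct.Struct(">IIi")  # 12 bytes per entry
ELF64_RELA = struct.Struct(">QQq")  # 24 bytes per entry


def entry_count(table_bytes: int, entry_size: int) -> int:
    """Number of entries in a table of table_bytes bytes (hypothetical helper)."""
    if entry_size <= 0 or table_bytes % entry_size != 0:
        raise ValueError("table size is not a multiple of the entry size")
    return table_bytes // entry_size


def decode_rela(buf: bytes, layout: struct.Struct) -> List[Tuple[int, int, int]]:
    """Decode every (r_offset, r_info, r_addend) triple in buf."""
    count = entry_count(len(buf), layout.size)
    return [layout.unpack_from(buf, i * layout.size) for i in range(count)]


# Sizes reported in this issue.
ppc32_entries = entry_count(252, ELF32_RELA.size)           # RELASZ alone
ppc64_dyn_entries = entry_count(192, ELF64_RELA.size)       # RELASZ alone
ppc64_all_entries = entry_count(192 + 96, ELF64_RELA.size)  # RELASZ + PLTRELSZ
```

Here `ppc32_entries` is 21 (17 + 4), `ppc64_dyn_entries` is only 8, and `ppc64_all_entries` is 12 (8 + 4). The `PLTRELSZ` value of 96 is inferred from the 288-byte total quoted above rather than read from the binary.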