chenpengcong / blog


Relocation #12

Open chenpengcong opened 6 years ago


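Once the linker has determined the run-time address of every section and every symbol (see the previous post for how symbol addresses are determined), it has to patch every reference to each symbol in the code and data sections so that the reference points to the correct run-time address.

For example, take the following code: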

//a.c
extern int global_var;

void foo();
int main()
{
    int a = global_var;
    foo();
    return 0;
}

//b.c
int global_var = 1;

void foo()
{
}

Compile and link: $ gcc -c a.c b.c && ld a.o b.o -e main -o ab

Now for the analysis.

First, disassemble a.o: $ objdump -d a.o

a.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <main>:
   0:    55     push %rbp
   1:    48 89 e5     mov %rsp,%rbp
   4:    48 83 ec 10     sub $0x10,%rsp
   8:    8b 05 00 00 00 00     mov 0x0(%rip),%eax # e <main+0xe>
   e:    89 45 fc     mov %eax,-0x4(%rbp)
  11:    b8 00 00 00 00     mov $0x0,%eax
  16:    e8 00 00 00 00     callq 1b <main+0x1b>
  1b:    b8 00 00 00 00     mov $0x0,%eax
  20:    c9     leaveq 
  21:    c3     retq 

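The instruction at offset 0x08: 8: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # e <main+0xe> loads the 4 bytes stored at address 0x0 + R[rip] (that is, 0x0 + 0x0e) into %eax, and %eax is then stored into the variable a. Because global_var is defined in another object file, the compiler does not know its address yet, so it emits a displacement of 0 as a placeholder. With that placeholder, the operand appears to refer to the value of rip (the address of the next instruction) as if it were the address of global_var.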

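Next, the instruction at offset 0x16: 16: e8 00 00 00 00 callq 1b <main+0x1b> is a procedure call whose displayed target is 0x1b. This is the call to foo. As with global_var, the compiler leaves a zero displacement, so the target of the call appears to be the address of the next instruction.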

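The compiler temporarily leaves the targets of these two instructions as 0x0e and 0x1b and defers the real address computation to the linker.

The linker uses the relocation table to find the instructions that need patching and adjusts them.

View the relocation table of a.o: readelf -r a.o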
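To make the placeholder values concrete, here is a minimal sketch of how a disassembler derives the target shown in the comment column: the address of the next instruction plus the signed 32-bit displacement. The function name rel32_target and the instruction lengths passed in (6 bytes for the mov, 5 for the call, read off the listing above) are illustrative, not part of any tool.

```c
#include <stdint.h>

/* Target of a RIP-relative operand or a rel32 call:
 * address of the next instruction plus the signed 32-bit displacement.
 * The name is hypothetical, chosen for this example only. */
uint64_t rel32_target(uint64_t insn_addr, unsigned insn_len, int32_t disp)
{
    uint64_t next_insn = insn_addr + insn_len;   /* value of rip when the operand is evaluated */
    return next_insn + (uint64_t)(int64_t)disp;  /* sign-extend, then add (wraps modulo 2^64) */
}
```

With the zero displacement in a.o, rel32_target(0x08, 6, 0) gives 0x0e and rel32_target(0x16, 5, 0) gives 0x1b, which are exactly the placeholder targets objdump prints.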

Relocation section '.rela.text' at offset 0x210 contains 2 entries:
  Offset Info Type Sym. Value Sym. Name + Addend
00000000000a 000900000002 R_X86_64_PC32 0000000000000000 global_var - 4
000000000017 000b00000004 R_X86_64_PLT32 0000000000000000 foo - 4

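There are two relocation entries (each place that has to be relocated is called a relocation entry).

Offset is the offset of the relocation entry: the offset, from the start of the section, of the first byte of the location to be patched.

In a.o, 0x0a and 0x17 are the address fields of the mov and call instructions in the code section.

Type is the relocation type, which tells the linker how to patch the relocation entry.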
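The Type column is not stored separately: in ELF64 it is packed into the Info field together with the symbol table index. The sketch below splits an Info value the same way the ELF64_R_SYM and ELF64_R_TYPE macros from <elf.h> do; the function names rela_sym and rela_type are made up for this example.

```c
#include <stdint.h>

/* Upper 32 bits of an ELF64 r_info value: index into the symbol table.
 * Hypothetical helper mirroring ELF64_R_SYM. */
uint32_t rela_sym(uint64_t info)
{
    return (uint32_t)(info >> 32);
}

/* Lower 32 bits of an ELF64 r_info value: the relocation type number.
 * Hypothetical helper mirroring ELF64_R_TYPE. */
uint32_t rela_type(uint64_t info)
{
    return (uint32_t)(info & 0xffffffffu);
}
```

For the two entries above, Info 0x000900000002 splits into symbol index 9 and type 2 (R_X86_64_PC32), and 0x000b00000004 splits into symbol index 11 and type 4 (R_X86_64_PLT32).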

The relocation types and how each one is calculated are listed below. The calculations use the following notation:

A: Represents the addend used to compute the value of the relocatable field.
B: Represents the base address at which a shared object has been loaded into memory during execution. Generally, a shared object is built with a 0 base virtual address, but the execution address will be different.
G: Represents the offset into the global offset table at which the relocation entry’s symbol will reside during execution.
GOT: Represents the address of the global offset table.
L: Represents the place (section offset or address) of the Procedure Linkage Table entry for a symbol.
P: Represents the place (section offset or address) of the storage unit being relocated (computed using r_offset).
S: Represents the value of the symbol whose index resides in the relocation entry.
Z: Represents the size of the symbol whose index resides in the relocation entry.

Name Value Field Calculation
R_X86_64_NONE 0 none none
R_X86_64_64 1 word64 S+A
R_X86_64_PC32 2 word32 S+A-P
R_X86_64_GOT32 3 word32 G+A
R_X86_64_PLT32 4 word32 L+A-P
R_X86_64_COPY 5 none none
R_X86_64_GLOB_DAT 6 word64 S
R_X86_64_JUMP_SLOT 7 word64 S
R_X86_64_RELATIVE 8 word64 B+A
R_X86_64_GOTPCREL 9 word32 G+GOT+A-P
R_X86_64_32 10 word32 S+A
R_X86_64_32S 11 word32 S+A
R_X86_64_16 12 word16 S+A
R_X86_64_PC16 13 word16 S+A-P
R_X86_64_8 14 word8 S+A
R_X86_64_PC8 15 word8 S+A-P
R_X86_64_DTPMOD64 16 word64
R_X86_64_DTPOFF64 17 word64
R_X86_64_TPOFF64 18 word64
R_X86_64_TLSGD 19 word32
R_X86_64_TLSLD 20 word32
R_X86_64_DTPOFF32 21 word32
R_X86_64_GOTTPOFF 22 word32
R_X86_64_TPOFF32 23 word32
R_X86_64_PC64 24 word64 S+A-P
R_X86_64_GOTOFF64 25 word64 S+A-GOT
R_X86_64_GOTPC32 26 word32 GOT+A-P
R_X86_64_GOT64 27 word64 G + A
R_X86_64_GOTPCREL64 28 word64 G + GOT - P + A
R_X86_64_GOTPC64 29 word64 GOT - P + A
R_X86_64_GOTPLT64 30 word64 G + A
R_X86_64_PLTOFF64 31 word64 L - GOT + A
R_X86_64_SIZE32 32 word32 Z+A
R_X86_64_SIZE64 33 word64 Z+A
R_X86_64_GOTPC32_TLSDESC 34 word32
R_X86_64_TLSDESC_CALL 35 none
R_X86_64_TLSDESC 36 word64x2

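Take the relocation of global_var as an example.

From the relocation table above, the relocation type of global_var is R_X86_64_PC32, so the patched value is computed as S + A - P, where:
S = the actual address of the symbol
P = the location being patched (the address of the relocation entry)
A = the addend field from the relocation table

Look up the virtual address of global_var in the linked executable ab: readelf -s ab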

Symbol table '.symtab' contains 16 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
    ...
    10: 0000000000601018     4 OBJECT  GLOBAL DEFAULT    4 global_var
    ...

S = 0x601018

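Next, read the offset and the addend of the relocation entry for global_var from the relocation table shown earlier: offset = 0x0a and A = -4.

To get the patched location P, look up the address of the .text section of ab in the section header table: readelf -S ab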

There are 9 section headers, starting at offset 0x1258:

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         00000000004000e8  000000e8
       0000000000000029  0000000000000000  AX       0     0     1
    ...

Now we can compute P = addr(.text) + offset = 0x4000e8 + 0x0a = 0x4000f2

Finally, S + A - P = 0x601018 + (-0x04) - 0x4000f2 = 0x200f22

So the patched value is 0x200f22.
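The arithmetic above can be checked with a small sketch of the R_X86_64_PC32 rule. The function name reloc_pc32 is hypothetical; the inputs are the S, A and P values read from the readelf output. It also shows why the addend is -4: P points at the 4-byte displacement field, and P + 4 is the address of the next instruction (0x4000f6), which is what rip holds when the operand is evaluated.

```c
#include <assert.h>
#include <stdint.h>

/* Value stored by an R_X86_64_PC32 relocation: S + A - P.
 * s: run-time address of the symbol
 * a: addend from the relocation entry (may be negative)
 * p: run-time address of the field being patched
 * The name is hypothetical, used for illustration only. */
int32_t reloc_pc32(uint64_t s, int64_t a, uint64_t p)
{
    int64_t value = (int64_t)(s + (uint64_t)a - p);

    /* The field is only 32 bits wide, so the result has to fit. */
    assert(value >= INT32_MIN && value <= INT32_MAX);
    return (int32_t)value;
}
```

Calling reloc_pc32(0x601018, -4, 0x4000e8 + 0x0a) yields 0x200f22, the same as 0x601018 - 0x4000f6.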

Disassemble the program ab: objdump -d ab

ab: file format elf64-x86-64
Disassembly of section .text:
00000000004000e8 <main>:
  4000e8:    55     push %rbp
  4000e9:    48 89 e5     mov %rsp,%rbp
  4000ec:    48 83 ec 10     sub $0x10,%rsp
  4000f0:    8b 05 22 0f 20 00     mov 0x200f22(%rip),%eax # 601018 <global_var>
  4000f6:    89 45 fc     mov %eax,-0x4(%rbp)
  4000f9:    b8 00 00 00 00     mov $0x0,%eax
  4000fe:    e8 07 00 00 00     callq 40010a <foo>
  400103:    b8 00 00 00 00     mov $0x0,%eax
  400108:    c9     leaveq 
  400109:    c3     retq 

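This machine is little-endian, so the bytes 22 0f 20 00 in the mov instruction at 0x4000f0 encode the value 0x00200f22.

So the final patched value is indeed 0x200f22.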
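As a sketch of the final patching step, the helper below stores a 32-bit value in little-endian byte order, which is how the displacement ends up as the bytes 22 0f 20 00 in the listing. The name put_le32 is an assumption for this example; a real linker has its own routines for this.

```c
#include <stdint.h>

/* Store a 32-bit value at dst in little-endian order (least significant byte first).
 * Hypothetical helper, independent of the host's byte order. */
void put_le32(uint8_t *dst, uint32_t value)
{
    dst[0] = (uint8_t)(value & 0xff);
    dst[1] = (uint8_t)((value >> 8) & 0xff);
    dst[2] = (uint8_t)((value >> 16) & 0xff);
    dst[3] = (uint8_t)((value >> 24) & 0xff);
}
```

Starting from the unpatched instruction bytes 8b 05 00 00 00 00 and calling put_le32 on the address of the third byte with 0x200f22 produces 8b 05 22 0f 20 00, matching the disassembly of ab.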

TODO: I am still puzzled by the relocation of the symbol foo: although its relocation type is R_X86_64_PLT32, the patched value follows the S + A - P rule.
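One plausible explanation, stated here as an assumption rather than something taken from the references: foo is defined in b.o and the link is static, so the linker has no need to create a PLT entry for it. In that case L is simply the address of foo itself, and L + A - P reduces to S + A - P. The sketch below redoes the calculation under that assumption; the function name reloc_plt32_no_plt is hypothetical, the address of foo (0x40010a) is taken from the disassembly of ab, and P is addr(.text) + 0x17.

```c
#include <stdint.h>

/* R_X86_64_PLT32 when no PLT entry exists for the symbol (assumed here):
 * L is taken to be the symbol address S, so the stored value is S + A - P.
 * The name is hypothetical, used for illustration only. */
int32_t reloc_plt32_no_plt(uint64_t s, int64_t a, uint64_t p)
{
    uint64_t l = s;  /* assumption: no PLT stub, so L equals S */
    return (int32_t)(int64_t)(l + (uint64_t)a - p);
}
```

reloc_plt32_no_plt(0x40010a, -4, 0x4000e8 + 0x17) evaluates to 7, which matches the bytes e8 07 00 00 00 of the call instruction at 0x4000fe.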

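References:
《程序员自我修养》, Section 4.2: Symbol Resolution and Relocation
Computer Systems: A Programmer's Perspective, 3rd edition, Section 7.7: Relocation
http://www.mindfruit.co.uk/2012/06/relocations-relocations.html#reloc_types_table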