Open chenpengcong opened 6 years ago
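Using the sample code below, this note works through how the linker determines symbol addresses when it links object files.

    // a.c
    extern int shared;
    extern int shared2;

    int main()
    {
        int a = 100;
        swap(&a, &shared);
        swap(&a, &shared2);
    }

    // b.c
    int shared = 1;

    void swap(int *a, int *b)
    {
        *a ^= *b ^= *a ^= *b;
    }

    // c.c
    int shared2 = 1;

    void swap2(int *a, int *b)
    {
        *a ^= *b ^= *a ^= *b;
    }

The following sections analyze how the virtual address of each symbol is determined.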
First compile the .c files, then inspect the position and size of each section in the resulting object files:

    $ readelf -S a.o b.o c.o

    File: a.o
    There are 12 section headers, starting at offset 0x330:

    Section Headers:
      [Nr] Name              Type             Address           Offset
           Size              EntSize          Flags  Link  Info  Align
      [ 0]                   NULL             0000000000000000  00000000
           0000000000000000  0000000000000000           0     0     0
      [ 1] .text             PROGBITS         0000000000000000  00000040
           0000000000000046  0000000000000000  AX       0     0     1
      [ 2] .rela.text        RELA             0000000000000000  00000258
           0000000000000060  0000000000000018   I       9     1     8
      [ 3] .data             PROGBITS         0000000000000000  00000086
           0000000000000000  0000000000000000  WA       0     0     1
      [ 4] .bss              NOBITS           0000000000000000  00000086
           0000000000000000  0000000000000000  WA       0     0     1
      [ 5] .comment          PROGBITS         0000000000000000  00000086
           0000000000000026  0000000000000001  MS       0     0     1
      [ 6] .note.GNU-stack   PROGBITS         0000000000000000  000000ac
           0000000000000000  0000000000000000           0     0     1
      [ 7] .eh_frame         PROGBITS         0000000000000000  000000b0
           0000000000000038  0000000000000000   A       0     0     8
      [ 8] .rela.eh_frame    RELA             0000000000000000  000002b8
           0000000000000018  0000000000000018   I       9     7     8
      [ 9] .symtab           SYMTAB           0000000000000000  000000e8
           0000000000000138  0000000000000018          10     8     8
      [10] .strtab           STRTAB           0000000000000000  00000220
           0000000000000034  0000000000000000           0     0     1
      [11] .shstrtab         STRTAB           0000000000000000  000002d0
           0000000000000059  0000000000000000           0     0     1

    File: b.o
    There are 11 section headers, starting at offset 0x268:

    Section Headers:
      [Nr] Name              Type             Address           Offset
           Size              EntSize          Flags  Link  Info  Align
      [ 0]                   NULL             0000000000000000  00000000
           0000000000000000  0000000000000000           0     0     0
      [ 1] .text             PROGBITS         0000000000000000  00000040
           000000000000004b  0000000000000000  AX       0     0     1
      [ 2] .data             PROGBITS         0000000000000000  0000008c
           0000000000000004  0000000000000000  WA       0     0     4
      [ 3] .bss              NOBITS           0000000000000000  00000090
           0000000000000000  0000000000000000  WA       0     0     1
      [ 4] .comment          PROGBITS         0000000000000000  00000090
           0000000000000026  0000000000000001  MS       0     0     1
      [ 5] .note.GNU-stack   PROGBITS         0000000000000000  000000b6
           0000000000000000  0000000000000000           0     0     1
      [ 6] .eh_frame         PROGBITS         0000000000000000  000000b8
           0000000000000038  0000000000000000   A       0     0     8
      [ 7] .rela.eh_frame    RELA             0000000000000000  000001f8
           0000000000000018  0000000000000018   I       8     6     8
      [ 8] .symtab           SYMTAB           0000000000000000  000000f0
           00000000000000f0  0000000000000018           9     8     8
      [ 9] .strtab           STRTAB           0000000000000000  000001e0
           0000000000000011  0000000000000000           0     0     1
      [10] .shstrtab         STRTAB           0000000000000000  00000210
           0000000000000054  0000000000000000           0     0     1

    File: c.o
    There are 11 section headers, starting at offset 0x250:

    Section Headers:
      [Nr] Name              Type             Address           Offset
           Size              EntSize          Flags  Link  Info  Align
      [ 0]                   NULL             0000000000000000  00000000
           0000000000000000  0000000000000000           0     0     0
      [ 1] .text             PROGBITS         0000000000000000  00000040
           000000000000002d  0000000000000000  AX       0     0     1
      [ 2] .data             PROGBITS         0000000000000000  00000070
           0000000000000004  0000000000000000  WA       0     0     4
      [ 3] .bss              NOBITS           0000000000000000  00000074
           0000000000000000  0000000000000000  WA       0     0     1
      [ 4] .comment          PROGBITS         0000000000000000  00000074
           0000000000000026  0000000000000001  MS       0     0     1
      [ 5] .note.GNU-stack   PROGBITS         0000000000000000  0000009a
           0000000000000000  0000000000000000           0     0     1
      [ 6] .eh_frame         PROGBITS         0000000000000000  000000a0
           0000000000000038  0000000000000000   A       0     0     8
      [ 7] .rela.eh_frame    RELA             0000000000000000  000001e0
           0000000000000018  0000000000000018   I       8     6     8
      [ 8] .symtab           SYMTAB           0000000000000000  000000d8
           00000000000000f0  0000000000000018           9     8     8
      [ 9] .strtab           STRTAB           0000000000000000  000001c8
           0000000000000013  0000000000000000           0     0     1
      [10] .shstrtab         STRTAB           0000000000000000  000001f8
           0000000000000054  0000000000000000           0     0     1

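From the output we can read off the following (every Address is 0 because a relocatable object file has not been assigned virtual addresses yet):

- a.o: .text size 0x46, .data size 0x00
- b.o: .text size 0x4b, .data size 0x04
- c.o: .text size 0x2d, .data size 0x04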
Next, look at the symbol tables:

    $ readelf -s a.o b.o c.o

    File: a.o

    Symbol table '.symtab' contains 13 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS a.c
         2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
         3: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
         4: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
         5: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
         6: 0000000000000000     0 SECTION LOCAL  DEFAULT    7
         7: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
         8: 0000000000000000    70 FUNC    GLOBAL DEFAULT    1 main
         9: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND shared
        10: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND _GLOBAL_OFFSET_TABLE_
        11: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND swap
        12: 0000000000000000     0 NOTYPE  GLOBAL DEFAULT  UND shared2

    File: b.o

    Symbol table '.symtab' contains 10 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS b.c
         2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
         3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
         4: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
         5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
         6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
         7: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
         8: 0000000000000000     4 OBJECT  GLOBAL DEFAULT    2 shared
         9: 0000000000000000    75 FUNC    GLOBAL DEFAULT    1 swap

    File: c.o

    Symbol table '.symtab' contains 10 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS c.c
         2: 0000000000000000     0 SECTION LOCAL  DEFAULT    1
         3: 0000000000000000     0 SECTION LOCAL  DEFAULT    2
         4: 0000000000000000     0 SECTION LOCAL  DEFAULT    3
         5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
         6: 0000000000000000     0 SECTION LOCAL  DEFAULT    6
         7: 0000000000000000     0 SECTION LOCAL  DEFAULT    4
         8: 0000000000000000     4 OBJECT  GLOBAL DEFAULT    2 shared2
         9: 0000000000000000    45 FUNC    GLOBAL DEFAULT    1 swap2

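From the output we can read off the following:

- a.o defines main at offset 0x00 of its .text section (size 70 bytes); shared, shared2 and swap are undefined (UND) and must be provided by other object files
- b.o defines shared at offset 0x00 of its .data section and swap at offset 0x00 of its .text section
- c.o defines shared2 at offset 0x00 of its .data section and swap2 at offset 0x00 of its .text section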
Link a.o, b.o and c.o:

    $ ld a.o b.o c.o -e main -o abc

Then inspect the section table of abc:

    $ readelf -S abc

    There are 9 section headers, starting at offset 0x12b0:

    Section Headers:
      [Nr] Name              Type             Address           Offset
           Size              EntSize          Flags  Link  Info  Align
      [ 0]                   NULL             0000000000000000  00000000
           0000000000000000  0000000000000000           0     0     0
      [ 1] .text             PROGBITS         00000000004000e8  000000e8
           00000000000000be  0000000000000000  AX       0     0     1
      [ 2] .eh_frame         PROGBITS         00000000004001a8  000001a8
           0000000000000078  0000000000000000   A       0     0     8
      [ 3] .got.plt          PROGBITS         0000000000601000  00001000
           0000000000000018  0000000000000008  WA       0     0     8
      [ 4] .data             PROGBITS         0000000000601018  00001018
           0000000000000008  0000000000000000  WA       0     0     4
      [ 5] .comment          PROGBITS         0000000000000000  00001020
           0000000000000025  0000000000000001  MS       0     0     1
      [ 6] .symtab           SYMTAB           0000000000000000  00001048
           00000000000001c8  0000000000000018           7    11     8
      [ 7] .strtab           STRTAB           0000000000000000  00001210
           000000000000005a  0000000000000000           0     0     1
      [ 8] .shstrtab         STRTAB           0000000000000000  0000126a
           0000000000000043  0000000000000000           0     0     1

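From this output, the .text section of abc starts at virtual address 0x4000e8 with size 0xbe, and the .data section starts at 0x601018 with size 0x08. With this information we can compute the address of every symbol. We first compute the addresses by hand and then check the result against the symbol table printed by readelf. The calculation goes as follows: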
Because the linker builds the executable abc by merging sections of the same kind, the .text section of the output file is the concatenation of the .text sections of a.o, b.o and c.o. The address range occupied by each object's .text contents follows directly:

- a.o .text start: 0x4000e8 (virtual address of .text in abc) + 0x00 (address of .text in a.o, which is 0 before linking) = 0x4000e8
- a.o .text end: 0x4000e8 + 0x46 (size of .text in a.o) = 0x40012e
- b.o .text start: 0x40012e (end of a.o's .text contents in abc) + 0x00 (address of .text in b.o) = 0x40012e
- b.o .text end: 0x40012e + 0x4b (size of .text in b.o) = 0x400179
- c.o .text start: 0x400179 (end of b.o's .text contents in abc) + 0x00 (address of .text in c.o) = 0x400179
- c.o .text end: 0x400179 + 0x2d (size of .text in c.o) = 0x4001a6
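The .data section is computed the same way, with 0x601018 as the virtual address of .data in abc:

- a.o .data start: 0x601018 + 0x00 = 0x601018
- a.o .data end: 0x601018 + 0x00 = 0x601018 (the .data section of a.o is empty)
- b.o .data start: 0x601018 + 0x00 = 0x601018
- b.o .data end: 0x601018 + 0x04 = 0x60101c
- c.o .data start: 0x60101c + 0x00 = 0x60101c
- c.o .data end: 0x60101c + 0x04 = 0x601020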
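The symbol tables give the offset of each symbol within its section (the symbol value). Adding that offset to the section's start address yields the final address. Two symbols serve as examples:

- shared: 0x601018 (virtual address of b.o's .data contents in abc) + 0x00 (symbol value) = 0x601018
- swap2: 0x400179 (virtual address of c.o's .text contents in abc) + 0x00 (symbol value) = 0x400179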
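The other symbols are computed the same way. The final results are:

- main: 0x4000e8
- swap: 0x40012e
- swap2: 0x400179
- shared: 0x601018
- shared2: 0x60101c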
Use readelf to verify the result:

    $ readelf -s abc

The output is:

    Symbol table '.symtab' contains 19 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 00000000004000e8     0 SECTION LOCAL  DEFAULT    1
         2: 00000000004001a8     0 SECTION LOCAL  DEFAULT    2
         3: 0000000000601000     0 SECTION LOCAL  DEFAULT    3
         4: 0000000000601018     0 SECTION LOCAL  DEFAULT    4
         5: 0000000000000000     0 SECTION LOCAL  DEFAULT    5
         6: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS a.c
         7: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS b.c
         8: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS c.c
         9: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS
        10: 0000000000601000     0 OBJECT  LOCAL  DEFAULT    3 _GLOBAL_OFFSET_TABLE_
        11: 000000000040012e    75 FUNC    GLOBAL DEFAULT    1 swap
        12: 0000000000601018     4 OBJECT  GLOBAL DEFAULT    4 shared
        13: 000000000060101c     4 OBJECT  GLOBAL DEFAULT    4 shared2
        14: 0000000000601020     0 NOTYPE  GLOBAL DEFAULT    4 __bss_start
        15: 00000000004000e8    70 FUNC    GLOBAL DEFAULT    1 main
        16: 0000000000400179    45 FUNC    GLOBAL DEFAULT    1 swap2
        17: 0000000000601020     0 NOTYPE  GLOBAL DEFAULT    4 _edata
        18: 0000000000601020     0 NOTYPE  GLOBAL DEFAULT    4 _end

The values match the addresses computed by hand.

Reference: 《程序员自我修养》, Section 4.1