chenpengcong / blog


Determining Symbol Addresses #11

Open chenpengcong opened 6 years ago


The sample code below is used to show how the linker determines symbol addresses when it links object files.

//a.c 
extern int shared;
extern int shared2;

int main()
{
    int a = 100;
    swap(&a, &shared);
    swap(&a, &shared2);
}

//b.c
int shared = 1;

void swap(int *a, int *b)
{
    *a ^= *b ^= *a ^= *b;
}

//c.c
int shared2 = 1;

void swap2(int *a, int *b)
{
    *a ^= *b ^= *a ^= *b;
}

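The following walks through how the virtual address of each symbol is determined.

First, compile the .c files and inspect the position and size of each section in the resulting object files:

$ readelf -S a.o b.o c.o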

File: a.o
There are 12 section headers, starting at offset 0x330:

Section Headers:
  [Nr] Name              Type            Address          Offset
      Size              EntSize          Flags  Link  Info  Align
  [ 0]                  NULL            0000000000000000  00000000
      0000000000000000  0000000000000000          0    0    0
  [ 1] .text            PROGBITS        0000000000000000  00000040
      0000000000000046  0000000000000000  AX      0    0    1
  [ 2] .rela.text        RELA            0000000000000000  00000258
      0000000000000060  0000000000000018  I      9    1    8
  [ 3] .data            PROGBITS        0000000000000000  00000086
      0000000000000000  0000000000000000  WA      0    0    1
  [ 4] .bss              NOBITS          0000000000000000  00000086
      0000000000000000  0000000000000000  WA      0    0    1
  [ 5] .comment          PROGBITS        0000000000000000  00000086
      0000000000000026  0000000000000001  MS      0    0    1
  [ 6] .note.GNU-stack  PROGBITS        0000000000000000  000000ac
      0000000000000000  0000000000000000          0    0    1
  [ 7] .eh_frame        PROGBITS        0000000000000000  000000b0
      0000000000000038  0000000000000000  A      0    0    8
  [ 8] .rela.eh_frame    RELA            0000000000000000  000002b8
      0000000000000018  0000000000000018  I      9    7    8
  [ 9] .symtab          SYMTAB          0000000000000000  000000e8
      0000000000000138  0000000000000018          10    8    8
  [10] .strtab          STRTAB          0000000000000000  00000220
      0000000000000034  0000000000000000          0    0    1
  [11] .shstrtab        STRTAB          0000000000000000  000002d0
      0000000000000059  0000000000000000          0    0    1

File: b.o
There are 11 section headers, starting at offset 0x268:

Section Headers:
  [Nr] Name              Type            Address          Offset
      Size              EntSize          Flags  Link  Info  Align
  [ 0]                  NULL            0000000000000000  00000000
      0000000000000000  0000000000000000          0    0    0
  [ 1] .text            PROGBITS        0000000000000000  00000040
      000000000000004b  0000000000000000  AX      0    0    1
  [ 2] .data            PROGBITS        0000000000000000  0000008c
      0000000000000004  0000000000000000  WA      0    0    4
  [ 3] .bss              NOBITS          0000000000000000  00000090
      0000000000000000  0000000000000000  WA      0    0    1
  [ 4] .comment          PROGBITS        0000000000000000  00000090
      0000000000000026  0000000000000001  MS      0    0    1
  [ 5] .note.GNU-stack  PROGBITS        0000000000000000  000000b6
      0000000000000000  0000000000000000          0    0    1
  [ 6] .eh_frame        PROGBITS        0000000000000000  000000b8
      0000000000000038  0000000000000000  A      0    0    8
  [ 7] .rela.eh_frame    RELA            0000000000000000  000001f8
      0000000000000018  0000000000000018  I      8    6    8
  [ 8] .symtab          SYMTAB          0000000000000000  000000f0
      00000000000000f0  0000000000000018          9    8    8
  [ 9] .strtab          STRTAB          0000000000000000  000001e0
      0000000000000011  0000000000000000          0    0    1
  [10] .shstrtab        STRTAB          0000000000000000  00000210
      0000000000000054  0000000000000000          0    0    1

File: c.o
There are 11 section headers, starting at offset 0x250:

Section Headers:
  [Nr] Name              Type            Address          Offset
      Size              EntSize          Flags  Link  Info  Align
  [ 0]                  NULL            0000000000000000  00000000
      0000000000000000  0000000000000000          0    0    0
  [ 1] .text            PROGBITS        0000000000000000  00000040
      000000000000002d  0000000000000000  AX      0    0    1
  [ 2] .data            PROGBITS        0000000000000000  00000070
      0000000000000004  0000000000000000  WA      0    0    4
  [ 3] .bss              NOBITS          0000000000000000  00000074
      0000000000000000  0000000000000000  WA      0    0    1
  [ 4] .comment          PROGBITS        0000000000000000  00000074
      0000000000000026  0000000000000001  MS      0    0    1
  [ 5] .note.GNU-stack  PROGBITS        0000000000000000  0000009a
      0000000000000000  0000000000000000          0    0    1
  [ 6] .eh_frame        PROGBITS        0000000000000000  000000a0
      0000000000000038  0000000000000000  A      0    0    8
  [ 7] .rela.eh_frame    RELA            0000000000000000  000001e0
      0000000000000018  0000000000000018  I      8    6    8
  [ 8] .symtab          SYMTAB          0000000000000000  000000d8
      00000000000000f0  0000000000000018          9    8    8
  [ 9] .strtab          STRTAB          0000000000000000  000001c8
      0000000000000013  0000000000000000          0    0    1
  [10] .shstrtab        STRTAB          0000000000000000  000001f8
      0000000000000054  0000000000000000          0    0    1

From this output we can extract the following information:

File  Section  Size  Virtual address
a.o   .text    0x46  0x00
a.o   .data    0x00  0x00
b.o   .text    0x4b  0x00
b.o   .data    0x04  0x00
c.o   .text    0x2d  0x00
c.o   .data    0x04  0x00

Next, look at the symbol tables:

$ readelf -s a.o b.o c.o

File: a.o

Symbol table '.symtab' contains 13 entries:
  Num:    Value          Size Type    Bind  Vis      Ndx Name
    0: 0000000000000000    0 NOTYPE  LOCAL  DEFAULT  UND 
    1: 0000000000000000    0 FILE    LOCAL  DEFAULT  ABS a.c
    2: 0000000000000000    0 SECTION LOCAL  DEFAULT    1 
    3: 0000000000000000    0 SECTION LOCAL  DEFAULT    3 
    4: 0000000000000000    0 SECTION LOCAL  DEFAULT    4 
    5: 0000000000000000    0 SECTION LOCAL  DEFAULT    6 
    6: 0000000000000000    0 SECTION LOCAL  DEFAULT    7 
    7: 0000000000000000    0 SECTION LOCAL  DEFAULT    5 
    8: 0000000000000000    70 FUNC    GLOBAL DEFAULT    1 main
    9: 0000000000000000    0 NOTYPE  GLOBAL DEFAULT  UND shared
    10: 0000000000000000    0 NOTYPE  GLOBAL DEFAULT  UND _GLOBAL_OFFSET_TABLE_
    11: 0000000000000000    0 NOTYPE  GLOBAL DEFAULT  UND swap
    12: 0000000000000000    0 NOTYPE  GLOBAL DEFAULT  UND shared2

File: b.o

Symbol table '.symtab' contains 10 entries:
  Num:    Value          Size Type    Bind  Vis      Ndx Name
    0: 0000000000000000    0 NOTYPE  LOCAL  DEFAULT  UND 
    1: 0000000000000000    0 FILE    LOCAL  DEFAULT  ABS b.c
    2: 0000000000000000    0 SECTION LOCAL  DEFAULT    1 
    3: 0000000000000000    0 SECTION LOCAL  DEFAULT    2 
    4: 0000000000000000    0 SECTION LOCAL  DEFAULT    3 
    5: 0000000000000000    0 SECTION LOCAL  DEFAULT    5 
    6: 0000000000000000    0 SECTION LOCAL  DEFAULT    6 
    7: 0000000000000000    0 SECTION LOCAL  DEFAULT    4 
    8: 0000000000000000    4 OBJECT  GLOBAL DEFAULT    2 shared
    9: 0000000000000000    75 FUNC    GLOBAL DEFAULT    1 swap

File: c.o

Symbol table '.symtab' contains 10 entries:
  Num:    Value          Size Type    Bind  Vis      Ndx Name
    0: 0000000000000000    0 NOTYPE  LOCAL  DEFAULT  UND 
    1: 0000000000000000    0 FILE    LOCAL  DEFAULT  ABS c.c
    2: 0000000000000000    0 SECTION LOCAL  DEFAULT    1 
    3: 0000000000000000    0 SECTION LOCAL  DEFAULT    2 
    4: 0000000000000000    0 SECTION LOCAL  DEFAULT    3 
    5: 0000000000000000    0 SECTION LOCAL  DEFAULT    5 
    6: 0000000000000000    0 SECTION LOCAL  DEFAULT    6 
    7: 0000000000000000    0 SECTION LOCAL  DEFAULT    4 
    8: 0000000000000000    4 OBJECT  GLOBAL DEFAULT    2 shared2
    9: 0000000000000000    45 FUNC    GLOBAL DEFAULT    1 swap2

From this output we can extract the following information:

File  Symbol   Value  Size
a.o   main     0x00   70
b.o   shared   0x00   4
b.o   swap     0x00   75
c.o   shared2  0x00   4
c.o   swap2    0x00   45

Link a.o, b.o, and c.o:

$ ld a.o b.o c.o -e main -o abc

Inspect the section table of abc:

$ readelf -S abc

There are 9 section headers, starting at offset 0x12b0:

Section Headers:
  [Nr] Name              Type            Address          Offset
      Size              EntSize          Flags  Link  Info  Align
  [ 0]                  NULL            0000000000000000  00000000
      0000000000000000  0000000000000000          0    0    0
  [ 1] .text            PROGBITS        00000000004000e8  000000e8
      00000000000000be  0000000000000000  AX      0    0    1
  [ 2] .eh_frame        PROGBITS        00000000004001a8  000001a8
      0000000000000078  0000000000000000  A      0    0    8
  [ 3] .got.plt          PROGBITS        0000000000601000  00001000
      0000000000000018  0000000000000008  WA      0    0    8
  [ 4] .data            PROGBITS        0000000000601018  00001018
      0000000000000008  0000000000000000  WA      0    0    4
  [ 5] .comment          PROGBITS        0000000000000000  00001020
      0000000000000025  0000000000000001  MS      0    0    1
  [ 6] .symtab          SYMTAB          0000000000000000  00001048
      00000000000001c8  0000000000000018          7    11    8
  [ 7] .strtab          STRTAB          0000000000000000  00001210
      000000000000005a  0000000000000000          0    0    1
  [ 8] .shstrtab        STRTAB          0000000000000000  0000126a
      0000000000000043  0000000000000000          0    0    1

From this output we can extract the following information:

Section  Virtual address
.text    0x4000e8
.data    0x601018

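With this information we can compute the address of every symbol. We first compute the addresses by hand, then check the result against the symbol table printed by readelf. The computation goes as follows.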

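Because the linker builds the executable abc by merging sections of the same kind, the .text section of the output file is the .text sections of a.o, b.o, and c.o placed one after another. The final address range of each input .text section follows from that:

a.o .text: start = 0x4000e8 (virtual address of .text in abc) + 0x00 (offset of a.o's .text) = 0x4000e8; end = 0x4000e8 + 0x46 (size of a.o's .text) = 0x40012e
b.o .text: start = 0x40012e (where a.o's .text ends in abc) + 0x00 (offset of b.o's .text) = 0x40012e; end = 0x40012e + 0x4b (size of b.o's .text) = 0x400179
c.o .text: start = 0x400179 (where b.o's .text ends in abc) + 0x00 (offset of c.o's .text) = 0x400179; end = 0x400179 + 0x2d (size of c.o's .text) = 0x4001a6

The three sizes add up to 0x46 + 0x4b + 0x2d = 0xbe, which matches the .text size reported for abc.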
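The back-to-back placement described above can be written as a short C sketch. The type `input_section` and the function `place_sections` are names made up for this illustration, not part of ld or any library, and the sketch ignores alignment padding, which is acceptable here only because every input .text section in this example has alignment 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One input section contributed by an object file (hypothetical type). */
struct input_section {
    const char *file;  /* object file name, e.g. "a.o" */
    uint64_t size;     /* section size in bytes, as shown by readelf -S */
    uint64_t start;    /* filled in: final start address */
    uint64_t end;      /* filled in: final end address (exclusive) */
};

/*
 * Place the input sections one after another, starting at `base`.
 * Returns the end address of the merged output section.
 * Assumption: no alignment padding between the input sections.
 */
uint64_t place_sections(struct input_section *secs, size_t n, uint64_t base)
{
    assert(secs != NULL || n == 0);

    uint64_t addr = base;
    for (size_t i = 0; i < n; i++) {
        secs[i].start = addr;   /* starts where the previous one ended */
        addr += secs[i].size;
        secs[i].end = addr;
    }
    return addr;
}
```

Calling `place_sections` with the sizes 0x46, 0x4b, 0x2d and the base 0x4000e8 reproduces the hand computation for the .text section.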

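The .data section is computed the same way, starting from 0x601018 (the virtual address of .data in abc). The results are:

File  Section  Start     End
a.o   .data    0x601018  0x601018
b.o   .data    0x601018  0x60101c
c.o   .data    0x60101c  0x601020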
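One detail the .data case brings up is alignment: the readelf output lists an Align value of 4 for the .data sections of b.o and c.o, and 8 for .eh_frame. The sketch below models only the rounding step a linker applies before placing a section. The function names `align_up` and `place_aligned` are hypothetical, and a real linker's placement also depends on its linker script, so treat this as an illustration under those assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two). */
uint64_t align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Place one input section at the first suitably aligned address at or
 * after *cursor. Returns the start address and advances *cursor to the
 * end of the section just placed.
 */
uint64_t place_aligned(uint64_t *cursor, uint64_t size, uint64_t align)
{
    uint64_t start = align_up(*cursor, align);
    *cursor = start + size;
    return start;
}
```

With a cursor starting at 0x601018, placing sections of (size 0, align 1), (size 4, align 4), and (size 4, align 4) gives the starts 0x601018, 0x601018, and 0x60101c. The same rounding is consistent with .eh_frame in abc: .text ends at 0x4001a6, and rounding that up to a multiple of 8 gives 0x4001a8, the address readelf reports.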

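The symbol table output gives each symbol's offset within its section (the symbol value). Adding that offset to the final start address of the section that contains the symbol gives the symbol's address. Taking two symbols as examples:

swap: 0x40012e (start of b.o's .text in abc) + 0x00 (symbol value) = 0x40012e
shared: 0x601018 (start of b.o's .data in abc, since a.o's .data is empty) + 0x00 (symbol value) = 0x601018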

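The other symbols are computed the same way. The final results are:

Symbol   Address
main     0x4000e8
swap     0x40012e
swap2    0x400179
shared   0x601018
shared2  0x60101c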
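The rule "symbol address = final start of the containing input section + symbol value" can also be sketched in C. The types `placed_section` and `symbol` and the function `resolve_symbol` are invented for this example and do not correspond to any real linker API; the section start addresses are the ones computed by hand above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* An input section after placement in the output file (hypothetical type). */
struct placed_section {
    const char *file;     /* e.g. "b.o" */
    const char *name;     /* e.g. ".text" */
    uint64_t start;       /* final virtual address of this input section */
};

/* A defined symbol as listed by readelf -s on an object file (hypothetical type). */
struct symbol {
    const char *file;     /* object file that defines the symbol */
    const char *section;  /* section the symbol lives in */
    const char *name;     /* symbol name */
    uint64_t value;       /* offset of the symbol inside its section */
};

/*
 * Look up the section that contains `sym` and store the final address
 * in *out. Returns 1 on success, 0 if the section was not found.
 */
int resolve_symbol(const struct placed_section *secs, size_t nsecs,
                   const struct symbol *sym, uint64_t *out)
{
    assert(sym != NULL && out != NULL);

    for (size_t i = 0; i < nsecs; i++) {
        if (strcmp(secs[i].file, sym->file) == 0 &&
            strcmp(secs[i].name, sym->section) == 0) {
            *out = secs[i].start + sym->value;
            return 1;
        }
    }
    return 0;
}
```

Fed with the five placed sections (.text of a.o, b.o, c.o and .data of b.o, c.o) and the symbol values from the object files, this yields the same addresses as the table of final results.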

Verify the result with readelf:

$ readelf -s abc

The output is:

Symbol table '.symtab' contains 19 entries:
  Num:    Value          Size Type    Bind  Vis      Ndx Name
    0: 0000000000000000    0 NOTYPE  LOCAL  DEFAULT  UND 
    1: 00000000004000e8    0 SECTION LOCAL  DEFAULT    1 
    2: 00000000004001a8    0 SECTION LOCAL  DEFAULT    2 
    3: 0000000000601000    0 SECTION LOCAL  DEFAULT    3 
    4: 0000000000601018    0 SECTION LOCAL  DEFAULT    4 
    5: 0000000000000000    0 SECTION LOCAL  DEFAULT    5 
    6: 0000000000000000    0 FILE    LOCAL  DEFAULT  ABS a.c
    7: 0000000000000000    0 FILE    LOCAL  DEFAULT  ABS b.c
    8: 0000000000000000    0 FILE    LOCAL  DEFAULT  ABS c.c
    9: 0000000000000000    0 FILE    LOCAL  DEFAULT  ABS 
    10: 0000000000601000    0 OBJECT  LOCAL  DEFAULT    3 _GLOBAL_OFFSET_TABLE_
    11: 000000000040012e    75 FUNC    GLOBAL DEFAULT    1 swap
    12: 0000000000601018    4 OBJECT  GLOBAL DEFAULT    4 shared
    13: 000000000060101c    4 OBJECT  GLOBAL DEFAULT    4 shared2
    14: 0000000000601020    0 NOTYPE  GLOBAL DEFAULT    4 __bss_start
    15: 00000000004000e8    70 FUNC    GLOBAL DEFAULT    1 main
    16: 0000000000400179    45 FUNC    GLOBAL DEFAULT    1 swap2
    17: 0000000000601020    0 NOTYPE  GLOBAL DEFAULT    4 _edata
    18: 0000000000601020    0 NOTYPE  GLOBAL DEFAULT    4 _end

The addresses of main, swap, swap2, shared, and shared2 match the values computed by hand.

Reference: 《程序员自我修养》, Section 4.1