simonlindholm / asm-differ

Assembly diff script
The Unlicense
101 stars 50 forks source link

Limit relocations to current line #168

Open mkst opened 1 month ago

mkst commented 1 month ago

Closes #167.

Is there a better way to solve this? We use the line number of the reloc if there is one. It solves the issue I'm seeing and didnt appear to break the other scratches I have locally.

simonlindholm commented 1 month ago

Is the reloc from some other section? If so this is not a great fix, because the line number could match up by accident... If we're willing to accept that the PR looks good though.

mkst commented 1 month ago

The reloc is from earlier in the .text section.. I've raised a bug on bugzilla but I'm not particularly hopeful it'll get fixed.

Are line numbers per-section? I.e. for MWCC with multiple .text sections, do line numbers start at 0x0 for each one? I guess I can try and answer that myself...

Edit

Looking at a mwccpp-compiled file, the previous text sections are going to be empty if we pass --disassemble=FUNCTION. So the troublesome scenario would be if we are disassembling a whole object that has multiple .text sections with relocs...

However, wouldn't asm-differ be grabbing the wrong relocs anyway (on master) ?

$ mips-linux-gnu-objdump --disassemble=func_801916C4 --reloc build/pspeu/src/st/wrp_psp/e_misc.c.o 

build/pspeu/src/st/wrp_psp/e_misc.c.o:     file format elf32-tradlittlemips

Disassembly of section .text:

Disassembly of section .text:

Disassembly of section .text:

Disassembly of section .text:

Disassembly of section .text:

Disassembly of section .text:

Disassembly of section .text:

Disassembly of section .text:

00000000 <func_801916C4>:
   0:   27bdffd0    addiu   sp,sp,-48
   4:   afbf001c    sw  ra,28(sp)
   8:   afb30018    sw  s3,24(sp)
   c:   afb20014    sw  s2,20(sp)
  10:   afb10010    sw  s1,16(sp)
  14:   afb0000c    sw  s0,12(sp)
  18:   a7a40020    sh  a0,32(sp)
  1c:   3c020000    lui v0,0x0
            1c: R_MIPS_HI16 g_CurrentEntity
  20:   8c420000    lw  v0,0(v0)
            20: R_MIPS_LO16 g_CurrentEntity
  24:   84420002    lh  v0,2(v0)
  28:   7c021620    0x7c021620
  2c:   2452ff80    addiu   s2,v0,-128
  30:   02402021    move    a0,s2
  34:   0c000000    jal 0 <func_801916C4>
            34: R_MIPS_26   abs
  38:   00000000    nop
  3c:   2442ffe0    addiu   v0,v0,-32
  40:   00021143    sra v0,v0,0x5
  44:   7c021620    0x7c021620
  48:   7c028e20    0x7c028e20
  4c:   7c111620    0x7c111620
  50:   28410009    slti    at,v0,9
  54:   14200005    bnez    at,6c <func_801916C4+0x6c>
  58:   00000000    nop
  5c:   24020008    li  v0,8
  60:   7c028e20    0x7c028e20
  64:   10000005    b   7c <func_801916C4+0x7c>
  68:   00000000    nop
  6c:   7c111620    0x7c111620
  70:   04410002    bgez    v0,7c <func_801916C4+0x7c>
  74:   00000000    nop
  78:   00008821    move    s1,zero
  7c:   06410005    bgez    s2,94 <func_801916C4+0x94>
  80:   00000000    nop
  84:   7c111620    0x7c111620
  88:   00021023    negu    v0,v0
  8c:   7c021620    0x7c021620
  90:   7c028e20    0x7c028e20
  94:   02402021    move    a0,s2
  98:   0c000000    jal 0 <func_801916C4>
            98: R_MIPS_26   abs
  9c:   00000000    nop
  a0:   2442ffa0    addiu   v0,v0,-96
  a4:   7c021620    0x7c021620
  a8:   7c028620    0x7c028620
  ac:   3c020000    lui v0,0x0
            ac: R_MIPS_HI16 g_CurrentEntity
  b0:   8c420000    lw  v0,0(v0)
            b0: R_MIPS_LO16 g_CurrentEntity
  b4:   84420006    lh  v0,6(v0)
  b8:   7c021620    0x7c021620
  bc:   2444ff80    addiu   a0,v0,-128
  c0:   0c000000    jal 0 <func_801916C4>
            c0: R_MIPS_26   abs
  c4:   00000000    nop
  c8:   2453ff90    addiu   s3,v0,-112
  cc:   1a600004    blez    s3,e0 <func_801916C4+0xe0>
  d0:   00000000    nop
  d4:   7c131e20    0x7c131e20
  d8:   02031821    addu    v1,s0,v1
  dc:   7c038620    0x7c038620
  e0:   7c101e20    0x7c101e20
  e4:   04610002    bgez    v1,f0 <func_801916C4+0xf0>
  e8:   00000000    nop
  ec:   00008021    move    s0,zero
  f0:   7c101e20    0x7c101e20
  f4:   00032043    sra a0,v1,0x1
  f8:   2403007f    li  v1,127
  fc:   00641823    subu    v1,v1,a0
 100:   7c031e20    0x7c031e20
 104:   7c038620    0x7c038620
 108:   7c101e20    0x7c101e20
 10c:   18600009    blez    v1,134 <func_801916C4+0x134>
 110:   00000000    nop
 114:   87a20020    lh  v0,32(sp)
 118:   7c022620    0x7c022620
 11c:   7c102e20    0x7c102e20
 120:   7c113620    0x7c113620
 124:   3c020000    lui v0,0x0
            124: R_MIPS_HI16    g_api
 128:   8c4200e4    lw  v0,228(v0)
            128: R_MIPS_LO16    g_api
 12c:   0040f809    jalr    v0
 130:   00000000    nop
 134:   8fbf001c    lw  ra,28(sp)
 138:   8fb30018    lw  s3,24(sp)
 13c:   8fb20014    lw  s2,20(sp)
 140:   8fb10010    lw  s1,16(sp)
 144:   8fb0000c    lw  s0,12(sp)
 148:   27bd0030    addiu   sp,sp,48
 14c:   03e00008    jr  ra
 150:   00000000    nop
simonlindholm commented 1 month ago

Looking at a mwccpp-compiled file, the previous text sections are going to be empty if we pass --disassemble=FUNCTION. So the troublesome scenario would be if we are disassembling a whole object that has multiple .text sections with relocs...

This feels like it could be a pretty common case. I imagine in this case the relocs all end up next to the first instruction of the --disassemble'd function? Can you test it? E.g. add a function int global; int* get_addr(void) { return &global; } just above. If they don't then this PR should be good as is.

However, wouldn't asm-differ be grabbing the wrong relocs anyway (on master) ?

I don't think it would? Or does objdump put relocs in the wrong .text section? (I could see that happening, but it's hard to guess. I vaguely recall some problem with multiple .text sections but I don't remember what it was...)

mkst commented 1 month ago

There are two scenarios for the --disassemble=FUNCTIONNAME.

  1. Single .text section (e.g. gcc, target.o.zip) - buggy objdump, bad asm-differ behaviour.
$ mips-linux-gnu-objdump --disassemble=eMessageCpy --reloc target.o

target.o:     file format elf32-tradlittlemips

Disassembly of section .text:

00001150 <eMessageCpy>:
    1150:   27bdfff0    addiu   sp,sp,-16
            4: R_MIPS_GPREL16   D_004DA918
            a0: R_MIPS_26   xglFontGetSPcodeSize
            f0: R_MIPS_26   .text
            15c: R_MIPS_26  .text
            1c8: R_MIPS_26  xglFontGetSPcodeSize
            210: R_MIPS_26  .text
            264: R_MIPS_HI16    D_00530900
            26c: R_MIPS_LO16    D_00530900
            318: R_MIPS_HI16    D_0036C370
            324: R_MIPS_LO16    D_0036C370
            580: R_MIPS_26  .text
            5d8: R_MIPS_26  xglFontGetSPcodeSize
            6b4: R_MIPS_26  xglFontGetStringWidth
            750: R_MIPS_26  xglFontGetSPcodeSize
            9b0: R_MIPS_GPREL16 D_004DA918
            9c0: R_MIPS_HI16    D_004DA920
            9c4: R_MIPS_HI16    D_004DA928
            9cc: R_MIPS_LO16    D_004DA920
            9e0: R_MIPS_LO16    D_004DA928
            a0c: R_MIPS_HI16    D_00531900
            a10: R_MIPS_LO16    D_00531900
            acc: R_MIPS_26  endSpriteSet
            bc8: R_MIPS_26  endPrintExtFunc
            bdc: R_MIPS_GPREL16 D_004DA918
            c00: R_MIPS_GPREL16 D_004DA918
            ca4: R_MIPS_26  xglFontGetStringWidth
            cd8: R_MIPS_26  .text
            d38: R_MIPS_HI16    D_00530900
            d40: R_MIPS_LO16    D_00530900
            d58: R_MIPS_26  xglFontPrint
            db8: R_MIPS_26  eCursolModeChange
            dc0: R_MIPS_26  eCursolMain
            fdc: R_MIPS_26  .text
            1064: R_MIPS_26 eCursolSet
            10a8: R_MIPS_26 .text
            1144: R_MIPS_26 .text
    1154:   af840000    sw  a0,0(gp)
            1154: R_MIPS_GPREL16    D_004DC5D4
    1158:   ffbf0000    sd  ra,0(sp)
    115c:   00a0202d    move    a0,a1
    1160:   dfbf0000    ld  ra,0(sp)
    1164:   0800045c    j   1170 <eMessageCat>
            1164: R_MIPS_26 .text
    1168:   27bd0010    addiu   sp,sp,16

The relocs are included (per bugzilla bug) but the OFFSETS start from the beginning of the file, so there is no chance of a mismatch.

  1. Multiple .text sections (e.g. mwccpsp, e_misc.c.o.zip)
$ mips-linux-gnu-objdump --reloc --disassemble=EntityIntenseExplosion e_misc.c.o 

e_misc.c.o:     file format elf32-tradlittlemips

Disassembly of section .text:

Disassembly of section .text:

Disassembly of section .text:

Disassembly of section .text:

Disassembly of section .text:

Disassembly of section .text:

Disassembly of section .text:

00000000 <EntityIntenseExplosion>:
   0:   27bdffe0    addiu   sp,sp,-32
   4:   afbf000c    sw  ra,12(sp)
   8:   afa40010    sw  a0,16(sp)
   c:   8fa30010    lw  v1,16(sp)
  10:   9463002c    lhu v1,44(v1)
  14:   14600031    bnez    v1,dc <EntityIntenseExplosion+0xdc>
  18:   00000000    nop
  1c:   3c040000    lui a0,0x0
            1c: R_MIPS_HI16 g_InitializeEntityData0
  20:   24840000    addiu   a0,a0,0
            20: R_MIPS_LO16 g_InitializeEntityData0
  24:   0c000000    jal 0 <EntityIntenseExplosion>
            24: R_MIPS_26   InitializeEntity
  28:   00000000    nop
  2c:   34048170    li  a0,0x8170
  30:   8fa30010    lw  v1,16(sp)
  34:   a4640016    sh  a0,22(v1)
  38:   24040005    li  a0,5
  3c:   8fa30010    lw  v1,16(sp)
  40:   a4640054    sh  a0,84(v1)
  44:   24040001    li  a0,1
... <snip>

The full output of mips-linux-gnu-objdump --reloc --disassemble e_misc.c.o shows that the relocation offsets are from the start of each .text section. SO if asm-differ is used with --disassemble (rather than --disassemble=FUNCTIONNAME there is a chance for a clash - assuming asm-differ starts from the beginning of the objdump output, rather than the start of the function being diffed).

I imagine in this case the relocs all end up next to the first instruction of the --disassemble'd function? I think the logs I've pasted show this is not the case?

mkst commented 1 week ago

Do you have any thoughts on this PR - is there a different way to solve the issue?

simonlindholm commented 1 week ago

I think it might be a good fix, but I haven't had time to do any kind of decomp-related work for the last two months or so, and so haven't thought deeply enough about it... It does feel like my schedule might be clearing up though so maybe I can actually get back to this soon.