mattcurrie / mgbdis

Game Boy ROM disassembler with RGBDS compatible output
MIT License

Lookup symbols when disassembling operand addresses #6

Closed kemenaran closed 6 years ago

kemenaran commented 6 years ago

Currently, symbols are only looked up for the operands of jp, jr and call instructions.

This PR also looks up symbols for the operands of data-related instructions.

This is of course mostly useful when disassembling a new bank of an already partially disassembled program. It doesn't change anything when disassembling a ROM from scratch, without declared symbols.

Example

Given a sample symbol file:

; rom.sym
00:3100 ResetCounter
00:D001 wMusicTrack
00:FFF1 hFrameCounter
00:FFF2 hMapIndex

Before

    ld a, [$D001]
    ld bc, $FFE0
    ld hl, $FFF1
    inc [hl]
    xor a
    ld [$FF00+$F2], a
    call ResetCounter

After

    ld a, [wMusicTrack]
    ld bc, $FFE0  ; no symbol defined; keep the hexadecimal value
    ld hl, hFrameCounter
    inc [hl]
    xor a
    ldh [hMapIndex], a
    call ResetCounter
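
As a rough illustration of the lookup described above (not the actual mgbdis code), the idea can be sketched in Python. The names `load_symbols`, `format_operand` and `SAMPLE_SYM` are hypothetical, and the sketch assumes every line of the SYM file has the `BB:AAAA Name` shape shown in the sample.

```python
# Hypothetical sketch of the operand lookup; not taken from mgbdis itself.

SAMPLE_SYM = """; rom.sym
00:3100 ResetCounter
00:D001 wMusicTrack
00:FFF1 hFrameCounter
00:FFF2 hMapIndex
"""


def load_symbols(sym_text):
    """Parse 'BB:AAAA Name' lines into a {(bank, address): name} dict."""
    symbols = {}
    for raw_line in sym_text.splitlines():
        line = raw_line.split(';', 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        location, name = line.split(None, 1)
        bank, address = (int(part, 16) for part in location.split(':'))
        symbols[(bank, address)] = name.strip()
    return symbols


def format_operand(symbols, bank, address):
    """Return the symbol for an address, or its hexadecimal value if none is defined."""
    return symbols.get((bank, address), '${:04X}'.format(address))


symbols = load_symbols(SAMPLE_SYM)
print(format_operand(symbols, 0, 0xD001))  # a declared data symbol
print(format_operand(symbols, 0, 0xFFE0))  # no symbol: falls back to hex
```

Falling back to the hexadecimal literal keeps the output unchanged for any address that has no entry in the SYM file, which matches the `$FFE0` line in the example.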

kemenaran commented 6 years ago

Of course, when a ROM is disassembled using data-location symbols from the SYM file, those symbols must actually be declared somewhere for the disassembly to assemble again:

; hram.asm
SECTION "HRAM", HRAM[$FF80]

; Unlabeled
ds $71

hFrameCounter:: ds 1 ; $FFF1
hMapIndex::     ds 1 ; $FFF2

mattcurrie commented 6 years ago

Hey there, sorry for the delay in responding!

One of my goals with the project was to be able to just run make afterwards and generate identical ROM output from the disassembly. As you've noted, this deviates from that a bit by requiring the user to declare these symbols afterwards in order to build the ROM.

I'll merge this, but it would be nice to consider whether there's a good way to handle these so that we can still generate buildable output. Perhaps we could later add support for defining the symbols with an (optional?) length, and then generate a source file with the appropriate symbols in the output.

The input symbol definition could look something like this:

00:FFF1 hFrameCounter
00:FFF1 .var:1
00:FFF2 hMapIndex
00:FFF2 .var:1
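
To make the proposed format a bit more concrete, here is a minimal Python sketch of how such entries might be read. It is only an assumption about how the idea could work: the `.var:N` annotation is the proposal above rather than an existing feature, the length is assumed to be hexadecimal, and `parse_data_symbols` is a hypothetical name.

```python
import re

# Hypothetical sketch: '.var:N' is only the format proposed above.
LINE_RE = re.compile(r'^([0-9A-Fa-f]+):([0-9A-Fa-f]{4})\s+(\S+)$')

PROPOSED_SYM = """00:FFF1 hFrameCounter
00:FFF1 .var:1
00:FFF2 hMapIndex
00:FFF2 .var:1
"""


def parse_data_symbols(sym_text):
    """Return {(bank, address): {'name': str or None, 'length': int or None}}."""
    entries = {}
    for raw_line in sym_text.splitlines():
        match = LINE_RE.match(raw_line.split(';', 1)[0].strip())
        if not match:
            continue  # blank line, comment, or unrecognised shape
        key = (int(match.group(1), 16), int(match.group(2), 16))
        entry = entries.setdefault(key, {'name': None, 'length': None})
        value = match.group(3)
        if value.startswith('.var:'):
            # Assumption: the length is written in hexadecimal.
            entry['length'] = int(value[len('.var:'):], 16)
        else:
            entry['name'] = value
    return entries


entries = parse_data_symbols(PROPOSED_SYM)
```

Keying the entries by address lets the label line and its `.var` line be merged regardless of the order in which they appear.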

kemenaran commented 6 years ago

Oh, right.

Actually, I got my explanation a bit backwards. The idea is less “If you have data symbols in a SYM file, you'll need to define them in a source file” (which assumes the data symbols exist only in the SYM file) and more “If you have already labeled data in a partial disassembly, those labels are written to the SYM file when compiling, and mgbdis can use them to disassemble another part of the code”.

In the intended use case the data symbols are expected to have been generated from a partial disassembly, so the source file with the variables defined already exists somewhere.

That said, you're right, it should be possible to generate a source file with the declarations automatically. A first step could even assume that each data label refers to a single byte at its address (so that a .var:1 declaration would not be needed); then it would just need to add padding between non-contiguous addresses. I'll have a look at this.
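
A minimal Python sketch of that first step, under the assumptions stated above: every label covers exactly one byte, all labels live in HRAM, and the section starts at $FF80. The function name `generate_hram_declarations` is hypothetical and nothing here is taken from mgbdis.

```python
# Hypothetical sketch: one byte per label, with 'ds' padding between gaps.

def generate_hram_declarations(symbols, start=0xFF80):
    """Build RGBDS source declaring each {address: name} label as one byte."""
    lines = ['SECTION "HRAM", HRAM[${:04X}]'.format(start), '']
    cursor = start  # next address that has not been declared yet
    for address, name in sorted(symbols.items()):
        if address < cursor:
            raise ValueError('label {} lies before ${:04X}'.format(name, cursor))
        gap = address - cursor
        if gap:
            # Pad the unlabeled bytes so the next label lands on its address.
            lines.extend(['; Unlabeled', 'ds ${:02X}'.format(gap), ''])
        lines.append('{}:: ds 1 ; ${:04X}'.format(name, address))
        cursor = address + 1
    return '\n'.join(lines)


source = generate_hram_declarations({0xFFF1: 'hFrameCounter', 0xFFF2: 'hMapIndex'})
print(source)
```

For the two labels from the example, the padding works out to $FFF1 - $FF80 = $71 bytes, the same `ds $71` as in the hand-written hram.asm.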