kstenerud / Musashi

Motorola 680x0 emulator written in C

Disassembler: Incorrect disassembly of mnemonics #40

Open SteveFosdick opened 5 years ago

SteveFosdick commented 5 years ago

In 68020 mode, and maybe others too, the disassembler is decoding some instructions incorrectly. Taking the following code in ROM as an example, the assembler listing reports the assembled code as:

F00:0330        MOVE.L  #vector_table, A0                 ; Destination address
               S03:FFFFFFFFFFFF0200:  91 C8
F00:0331        MOVE.L  #bios_address, A1                 ; Source address
               S03:FFFFFFFFFFFF0202:  22 7C FF FF 00 00
F00:0332        MOVE.W  #$3F, D0                          ; Number of system vectors minus one
               S03:FFFFFFFFFFFF0208:  30 3C 00 3F
F00:0333       init_sysvectors
F00:0334        MOVE.L  (A1)+, (A0)+                      ; Move data
               S03:FFFFFFFFFFFF020C:  20 D9
F00:0335        DBF D0, init_sysvectors
               S03:FFFFFFFFFFFF020E:  51 C8 FF FC
F00:0336                                                  ; Set all user vectors to 'Unknown Exception' handler
F00:0337        MOVE.W  #$BF, D0                          ; Number of user vectors minus one
               S03:FFFFFFFFFFFF0212:  30 3C 00 BF
F00:0338       init_uservectors
F00:0339        MOVE.L  #unknown_exception, (A0)+         ; Move data
               S03:FFFFFFFFFFFF0216:  20 FC FF FF 07 34
F00:0340        DBF D0, init_uservectors
               S03:FFFFFFFFFFFF021C:  51 C8 FF F8
F00:0341       
F00:0342        MOVE.W  #$3F, D0                          ; Number of words minus one
               S03:FFFFFFFFFFFF0220:  30 3C 00 3F

In each case the generated output is on the line immediately after the line being assembled. This is written to a ROM image that is run in an emulator, and the corresponding section of the ROM has the following hexdump:

00000200 - 91 C8 22 7C FF FF 00 00 30 3C 00 3F 20 D9 51 C8 .."|....0<.? .Q.
00000210 - FF FC 30 3C 00 BF 20 FC FF FF 07 34 51 C8 FF F8 ..0<.. ....4Q...
00000220 - 30 3C 00 3F

The m68kdasm module disassembles this as:

    FFFF0200: dc.w $00c8; ILLEGAL
    FFFF0202: ori     #$ff, SR
    FFFF0206: ori.b   #$3c, D0
    FFFF020A: dc.w $003f; ILLEGAL
    FFFF020C: dc.w $00d9; ILLEGAL
    FFFF020E: dc.w $00c8; ILLEGAL
    FFFF0210: dc.w $00fc; ILLEGAL
    FFFF0212: ori     #$bf, CCR
    FFFF0216: dc.w $00fc; ILLEGAL
    FFFF0218: dc.w $00ff; ILLEGAL
    FFFF021A: ori.b   #$c8, (-$8,A4,D0.w)
    FFFF0220: ori     #$3f, CCR

There is some similarity in the addresses and low-order bytes, which suggests it is looking at the right region of memory, but it seems to be getting the instruction mnemonics completely wrong.

SteveFosdick commented 5 years ago

On tracing what is happening with gdb I came upon this:

m68k_disassemble (str_buff=0x7fffffffe10a "", pc=pc@entry=4294902272, 
    cpu_type=cpu_type@entry=4) at musahi/m68kdasm.c:3482
3482        g_cpu_ir = read_imm_16();
(gdb) s
dasm_read_imm_16 (advance=2) at musahi/m68kdasm.c:273
273         result = m68k_read_disassembler_16(g_cpu_pc & g_address_mask) & 0xff;

So this correctly reads a 16-bit opcode from memory but then discards the most significant byte. That means the wrong mnemonic will be reported for any opcode with a non-zero most significant byte.
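The effect of the mask in the traced line can be reproduced in isolation. The following is a minimal sketch, not the Musashi code: `rom`, `read_word`, `fetch_opcode_buggy` and `fetch_opcode_fixed` are hypothetical names, and the `0xffff` mask is only an assumption about what a fix would look like, since the contents of the attached patch are not shown here.

```c
#include <stdint.h>

/* Hypothetical stand-in for the emulated ROM: the first bytes of the hexdump above. */
static const uint8_t rom[] = { 0x91, 0xC8, 0x22, 0x7C, 0xFF, 0xFF, 0x00, 0x00 };

/* Read a big-endian 16-bit word, which is how the 68k stores opcodes. */
static unsigned int read_word(unsigned int offset)
{
    return ((unsigned int)rom[offset] << 8) | rom[offset + 1];
}

/* Mirrors the traced line: the 0xff mask keeps only the low byte. */
static unsigned int fetch_opcode_buggy(unsigned int offset)
{
    return read_word(offset) & 0xff;
}

/* Assumed fix: mask to 16 bits so that both bytes of the opcode survive. */
static unsigned int fetch_opcode_fixed(unsigned int offset)
{
    return read_word(offset) & 0xffff;
}
```

With this sketch, `fetch_opcode_buggy(0)` yields `0x00c8`, matching the `dc.w $00c8` line in the broken output, while `fetch_opcode_fixed(0)` yields the full opcode `0x91c8`.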

Applying the attached patch corrects this and gives the following output for the same section of code:

    FFFF0200: suba.l  A0, A0
    FFFF0202: movea.l #$ffff0000, A1
    FFFF0208: move.w  #$3f, D0
    FFFF020C: move.l  (A1)+, (A0)+
    FFFF020E: dbra    D0, $ffff020c
    FFFF0212: move.w  #$bf, D0
    FFFF0216: move.l  #$ffff0734, (A0)+
    FFFF021C: dbra    D0, $ffff0216
    FFFF0220: move.w  #$3f, D0
    FFFF0224: movea.w #$400, A0
    FFFF0228: movea.l #$ffff0100, A1
    FFFF022E: move.l  (A1)+, (A0)+

This is an improvement, but the first instruction still appears incorrect. m68k_dmasm_patch1.txt

SteveFosdick commented 5 years ago

On checking the 68000 programmer's reference, it seems the first instruction, disassembled as suba.l A0, A0, is correct. The address 'vector_table' in the instruction as presented to the assembler is zero, and the assembler has optimised the move into the subtract instead.
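To double-check that reading, the opcode word `91 C8` can be split into the bit fields of the SUBA encoding. This is a rough sketch only: `suba_fields` and `split_opcode` are hypothetical names for illustration and are not part of m68kdasm.

```c
#include <stdint.h>

/* Bit fields of a SUBA opcode word: 1001 | An | opmode | EA mode | EA register. */
struct suba_fields {
    unsigned int line;     /* bits 15-12: 0x9 for the SUB/SUBA/SUBX group */
    unsigned int dst_reg;  /* bits 11-9: destination address register */
    unsigned int opmode;   /* bits 8-6: 3 = SUBA.W, 7 = SUBA.L */
    unsigned int ea_mode;  /* bits 5-3: 1 = address register direct */
    unsigned int ea_reg;   /* bits 2-0: source register number */
};

/* Split a 16-bit opcode word into the fields above. */
static struct suba_fields split_opcode(uint16_t opcode)
{
    struct suba_fields f;
    f.line    = (opcode >> 12) & 0xf;
    f.dst_reg = (opcode >> 9) & 0x7;
    f.opmode  = (opcode >> 6) & 0x7;
    f.ea_mode = (opcode >> 3) & 0x7;
    f.ea_reg  = opcode & 0x7;
    return f;
}
```

For `0x91C8` this gives line 9, destination register 0, opmode 7, EA mode 1 and EA register 0, which reads as SUBA.L A0, A0: a two-byte way of loading zero into A0.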