Open zznop opened 5 years ago
Looks like the SLEIGH code for 68k should be masking off the most significant byte. I'd suggest checking the Motorola documentation to see if that byte has any special meaning.
It would also be useful to add a warning comment or annotation when this is encountered, but I don't know if SLEIGH can handle this.
In the case described above the opcode is 11 d8 fe 80
the address fe 80
should be extended to 24-bits (0xfffe80
) because the 68000 contains a 24-bit address bus. But because the 68000 interface is written for 32-bit wide addresses (later 68000 series processors) it is extended to 0xfffffe80
. I've looked at the 68000.sinc
file and attempted to make modifications to the moves.w
and moves.b
lines to remove the top byte, but there's more to it than that as other IDL files contain information on address sizes. Also, by applying those masks explicitly it will break the sla interfaces for the later processors. Idk, if the 68000 needs its own *.sinc
file or what. I'd like to hear from a Ghidra core dev.
Okay, I think that @zznop is not correct.
Now, I don't have a plain 68000 programmer's manual, only 020+ and the usual all-in-one PDF that floats around the internet, but at least there the following is stated for short addressing mode:
2.2.16 Absolute Short Addressing Mode
In this addressing mode, the operand is in memory, and the address of the operand is in the extension word. The 16-bit address is sign-extended to 32 bits before it is used. .
So internally the 68000 core will sign extend the address to 32 bits. The fact that the 68000 only has 24 address pins exposed is just a shortcoming of the chip.
To give a more concrete example: The Atari ST and STE both have a 68000 CPU installed. Hardware registers are mapped from $f00000 to $ffffff (i.e. around the 15 and 16mb address space). So if we want to access the background palette register (for example) we can write:
move.w #$fff,$ff8240
and
move.w #$fff,$ffff8240.w
They're both equivalent. But also we can also write:
lea $ffff8240.w,a0
move.w #$fff,(a0)
Especially the last one proves that one can pretty much stick whatever they want on the higher byte of the address and the CPU will still access the correct 24-bit address. How could it otherwise? Only the 24 lower bits are exposed to the bus.
So, like I said, I believe this is not a bug, and should be closed.
@ggnkua What you are saying is that when analyzing 68000 code the user should know to just ignore the top 8 bits of 0xfffffe80
for a move.b (A0)+,(DAT_fffffe80).w
(for MC68000)? Ok, that's all great until you're in Ghidra and your xref's don't work. This effects auto analysis both from an aesthetic point of view and functionally. If I create a RAM segment from 0xff0000
to 0xffffff
at 0xfffe80
there should be a label and cross reference, but there's not because Ghidra thinks move.b (A0)+,(DAT_fffffe80).w
is referencing 0xffffe80
. This is very much an issue. IDA handles it correctly and now Binary Ninja's processor module handles it correctly after I worked with their devs to fix the same issue.
Also, to reaffirm the issue, if you look at the 68000.ldefs file. 68000:BE:32:default
uses the 68040
sla file. They don't have true support for native 68000 at all. Just the instruction set as it pertains to the later 68k 32-bit processors.
Right, I see your point. As a workaround how about creating another ram segment from $ffff0000 to $ffffffff? If the code accesses the hardware registers consistantly (i.e. either $ffxxxx or $xxxx.w) then it won't matter which bucket of RAM it ends up. Worst case you have to check in both memory areas for xrefs.
Thanks, but I'll just keep using Binary Ninja. I just wanted to bring this to the attention of the devs. You are 100% correct in that move.b (%a0)+, (0xfffffe80)
, and move.b (%a0)+, (0xfffe80)
assemble to the same opcode. But because Ghidra is more than just a disassembler (unlike capstone or objdump), it has to take in account processor behavior for things like xref's, decompilation, type propagation, and more. I recommend one of two things:
68000:BE:32:default
from the processor list and add a MC68040
(in other words, don't advertise native 68000 support)I found this related issue: https://github.com/NationalSecurityAgency/ghidra/issues/230
I am also encountering this “issue” using a CPU32 processor, which also has a 24-but physical bus. While I agree the current implementation is technically correct, it does make the disassembly more messy.
Is there an example of how to mask out the byte in SLEIGH? I’ve attempted but I’m can’t find a solution.
And how does the byte become FF? Does SLEIGH use an implicit SEXT instruction to make the address 32 bits wide?
Describe the bug
Ghidra incorrectly disassembles certain
move*
68000 instructions, specifically instructions where the addresses are sign extended.For example, Ghidra disassembles the instruction below (from IDA output):
as
"The 68000 has a 24‑bit address space, allowing access to up to 16 MB of memory." This differs from later 6800* variants that have 32-bit address buses
To Reproduce Steps to reproduce the behavior: 1) In a hex editor, create a binary with
11 d8 fe 80
and save it 2) Throw it into a ghidra project and selectM6800 default
for the processor 3) Go to byte zero and hit 'D'Expected behavior
Ghidra to display the instruction as:
(24-bit addressing)