Closed M-a-r-k closed 9 months ago
The first program I found this instruction sequence in was PackDev. Its source code is available for reference.
I think there's only one instance in that particular program, which is where the disassembly fragments above are from. I added ghidra_amiga_ldr to Ghidra to enable it to understand the Amiga hunk executable format. But you could probably just load an executable as a binary image too.
By the way, after Ghidra analyzed PackDev, I noticed an issue: it thinks FUN_00220ee2 does not return, so all calls to it are "broken" like this after auto-analysis:
00220534 41 ec 01 9a lea (0x19a,A4)=>s_Not_enough_memory_002230de,A0 = "Not enough memory\n"
00220538 61 00 09 a8 bsr.w FUN_00220ee2 undefined FUN_00220ee2()
-- Flow Override: CALL_RETURN (CALL_TERMINATOR)
0022053c 70 ?? 70h p
0022053d 14 ?? 14h
0022053e 60 ?? 60h `
0022053f 00 ?? 00h
00220540 09 ?? 09h
00220541 98 ?? 98h
LAB_00220542 XREF[1]: 00220532(j)
00220542 20 2c 1c 94 move.l (0x1c94,A4)=>DAT_00224bd8,D0
00220546 29 40 1b f8 move.l D0,(0x1bf8,A4)=>DAT_00224b3c
I can open a separate issue for that if needed.
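As a quick sanity check, the six undefined bytes after the call can be decoded by hand. The following is only a minimal sketch: the helper names are hypothetical, and it handles just the `moveq` and `bra.w`/`bsr.w` encodings that appear in the listing above. It suggests the bytes are ordinary code that the caller expects to reach, but it does not prove that FUN_00220ee2 returns.

```python
import struct

def branch_w_target(addr, insn):
    """Target of a 68000 bra.w/bsr.w located at addr.

    The signed 16-bit displacement is relative to the address of the
    extension word, i.e. addr + 2.
    """
    opcode, disp = struct.unpack(">Hh", insn)
    if opcode not in (0x6000, 0x6100):  # bra.w, bsr.w
        raise ValueError("not a bra.w/bsr.w opcode: %04x" % opcode)
    return addr + 2 + disp

def moveq_operands(word):
    """Return (immediate, data register) for a moveq opcode word."""
    if word & 0xF100 != 0x7000:
        raise ValueError("not a moveq opcode: %04x" % word)
    imm = word & 0xFF
    if imm >= 0x80:  # the immediate is a sign-extended byte
        imm -= 0x100
    return imm, (word >> 9) & 7

# Bytes copied from the listing above.
bsr_target = branch_w_target(0x220538, bytes.fromhex("610009a8"))
moveq_imm, moveq_reg = moveq_operands(0x7014)
bra_target = branch_w_target(0x22053E, bytes.fromhex("60000998"))

print("bsr.w -> %08x" % bsr_target)
print("moveq #%d,D%d" % (moveq_imm, moveq_reg))
print("bra.w -> %08x" % bra_target)
```

The computed `bsr.w` target matches the FUN_00220ee2 address shown by Ghidra, which is a reasonable check that the displacement arithmetic is right.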
- have auto analysis detect and automatically mark instructions as "split"
Due to the frequent occurrence of bad offcut flows, we are currently not considering any generalized attempt at performing such fixups automatically.
- if the user manually does it, automatically disassemble the bytes (70 00 here) rather than leaving as data.
At this point in time the action is intended for more surgical manipulation, and we do not want to tie auto-disassembly to the action. We are considering the introduction of a script (which could be attached to a key binding) which will attempt all modifications for an instruction which contains an offcut flow to produce overlapping instructions.
I noticed this item in the Ghidra 10.4 change history: "Listing. Added ability to reduce an instructions length to facilitate overlapping instructions. This can now be accomplished by specifying an instruction length override on the first instruction and disassembling the bytes which follow it. The need for this has been observed with x86 where there may be a flow around a LOCK prefix byte. (GP-3256)"
Many years ago I disassembled some Motorola 68000 code which did something kind of related. Some versions of the SAS/C compiler for the Commodore Amiga can generate code sequences like this:
The effect of that is to put either 123 or 0 in D0. In that example, if 123 is put in D0, the cmpi.w #$7000,D0 instruction is effectively a no-op (following code doesn't depend on condition codes), so code execution falls through. However, if the branch to label+2 is taken, that does moveq #0,D0 before continuing. Effectively, it's a minor optimization of this:
(cmpi.w executes faster than bra.b I guess.)
In Ghidra, one example disassembly initially shows:
After right-clicking cmpi.w #0x7000,D0w, choosing "Modify Instruction Length..." and setting length to 2 bytes, the listing looks like this:
The bytes at LAB_0021f362 are not disassembled by default, even though that address is a jump destination.
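To make the overlap concrete, here is a minimal Python sketch that decodes the same bytes from two different starting offsets. It is not how Ghidra does it: the `decode_at` helper is hypothetical, it only knows the two opcodes involved, and the byte string assumes the sequence is a `moveq #123,D0` immediately followed by the `cmpi.w #$7000,D0` described above.

```python
import struct

def decode_at(code, offset):
    """Decode one instruction at offset; returns (text, length in bytes).

    Only moveq #imm8,Dn and cmpi.w #imm16,Dn are handled.
    """
    (word,) = struct.unpack_from(">H", code, offset)
    if word & 0xF100 == 0x7000:  # moveq #imm8,Dn
        imm = word & 0xFF
        if imm >= 0x80:  # the immediate is a sign-extended byte
            imm -= 0x100
        return "moveq #%d,D%d" % (imm, (word >> 9) & 7), 2
    if word & 0xFFF8 == 0x0C40:  # cmpi.w #imm16,Dn
        (imm,) = struct.unpack_from(">H", code, offset + 2)
        return "cmpi.w #$%04x,D%d" % (imm, word & 7), 4
    raise ValueError("unhandled opcode %04x at offset %d" % (word, offset))

# Assumed layout: moveq #123,D0 then cmpi.w #$7000,D0 ("label" is at offset 2).
CODE = bytes.fromhex("707b0c407000")

fall_through = decode_at(CODE, 2)  # execution falling through into label
branch_entry = decode_at(CODE, 4)  # execution entering at label+2

print(fall_through)
print(branch_entry)
```

The immediate word of the `cmpi.w` is itself a valid `moveq #0,D0` opcode, which is why shortening the first instruction to 2 bytes leaves two bytes that still need to be disassembled separately.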
This issue is to suggest a couple of improvements:
If anyone is interested in working on that I can upload some Amiga executables which contain those code sequences.