sp1187 closed this issue 6 years ago
Is this documented anywhere? I've seen retail code in a PS2 game passing an immediate to break in a way that is consistent with the current implementation, i.e. in the lower bits. IDA also disassembles it that way with a single argument. So the bug may actually be inside gas, or it's just fully up to the implementation.
It has been quite hard to find actual documentation (as in not code) for this, but section 8.6 in See MIPS Run has a table claiming that the code field in break is indeed 10 bits and in the upper half of the instruction.
And even with this change, you can still control the lower bits with the second argument, so break 0, 7 does the same as the old break 7.
On the other hand, this reference only mentions a single field: http://www.cs.cmu.edu/afs/cs/academic/class/15740-f97/public/doc/mips-isa.pdf I'm not convinced it's the best thing to change the single parameter version for that reason, especially as existing code may depend on it.
Since I am the originator of #122, I figured I'd weigh in with what I've been able to ascertain.
As indicated in the MIPS IV ISA Revision 3.2, the break instruction has bits 6-25 set aside for the code field. The code field is not handled by hardware at all. In fact, if the exception handler wants to read the value of code, it first needs to fetch the address of the exception, then read the instruction word from memory at this address and mask and shift out the code field. This means the interpretation of the code field is entirely up to the software.
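As a rough sketch of the mask-and-shift step described above (in Python rather than handler code, and assuming the instruction word has already been read from the exception address), extracting the raw 20-bit field could look like this. The name break_code is hypothetical and not taken from any SDK or emulator.

```python
BREAK_FUNCT = 0x0D  # funct field (bits 0-5) of the break instruction


def break_code(word):
    """Return the raw 20-bit code field (bits 6-25) of a break instruction word.

    Hypothetical helper for illustration; `word` is the 32-bit instruction
    word that the handler read back from the faulting address.
    """
    # break lives under the SPECIAL opcode (bits 26-31 all zero).
    if (word >> 26) != 0 or (word & 0x3F) != BREAK_FUNCT:
        raise ValueError("not a break instruction: %08x" % word)
    # Drop the 6 funct bits, then keep the next 20 bits.
    return (word >> 6) & 0xFFFFF


print(hex(break_code(0x000001CD)))  # 0x7
print(hex(break_code(0x000C000D)))  # 0x3000, i.e. 0xc in the upper 10 bits
```

How the 20 bits are split up after that point is a software convention, which is what the rest of this thread is about.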
See MIPS Run defines it as one 10-bit field in the most significant bits. For one parameter, GNU binutils assembles it this way as well. Even the Nintendo SDK assembler (GNU as 2.6) assembles break 0xc into 000c000d. If an optional second parameter is provided, GNU as inserts it into the least significant 10 bits of the code field:
$ echo "break 0xc; break 0xa, 0x5" | mips64-elf-as -o break.o - && mips64-elf-objdump -d break.o
<snip>
00000000 <.text>:
0: 000c000d break 0xc
4: 000a014d break 0xa,0x5
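For reference, the encoding seen in the objdump output above can be reproduced with a few shifts. This is only a sketch of the layout as observed here (first operand in bits 16-25, optional second operand in bits 6-15), not code from binutils; encode_break and its parameter names are made up for illustration.

```python
BREAK_FUNCT = 0x0D  # funct field (bits 0-5) of the break instruction


def encode_break(code_hi=0, code_lo=0):
    """Encode `break code_hi, code_lo` the way the output above suggests.

    Hypothetical helper: code_hi lands in bits 16-25, code_lo in bits 6-15.
    """
    for value in (code_hi, code_lo):
        if not 0 <= value < (1 << 10):
            raise ValueError("operand does not fit in 10 bits")
    return (code_hi << 16) | (code_lo << 6) | BREAK_FUNCT


print("%08x" % encode_break(0xC))       # 000c000d
print("%08x" % encode_break(0xA, 0x5))  # 000a014d
print("%08x" % encode_break(0, 7))      # 000001cd
```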
This is also how the capstone engine (based on LLVM) disassembles the data:
$ echo 000c000d000a014d | xxd -r -p > break.bin && rasm2 -a mips -e -D -Bf break.bin
0x00000000 4 000c000d break 0xc
0x00000004 4 000a014d break 0xa, 5
So which is correct? As far as I can tell, it is entirely up to the assembler and system programmer authors. Beyond See MIPS Run, I haven't seen a formal definition in writing. If it were me, I'd side with the GNU binutils and LLVM implementations.
Not sure if it matters, but PSP games often end up with:
000001cd break 7 or proposed break 0, 7
0000000d break 0
I don't think I've seen it as the upper bits, which may imply that CodeWarrior (PS2, PS1, PSP, right?) didn't separate it into two 10-bit chunks.
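To make the comparison concrete, here is a small sketch that splits an instruction word into the two 10-bit halves under the two-operand reading. The function name is hypothetical; it assumes the word is already known to be a break.

```python
def split_break_code(word):
    """Split the 20-bit code field into (upper 10 bits, lower 10 bits).

    Hypothetical helper; assumes `word` is a break instruction word.
    """
    code = (word >> 6) & 0xFFFFF    # bits 6-25
    return code >> 10, code & 0x3FF


print(split_break_code(0x000001CD))  # (0, 7): the PSP-style encoding
print(split_break_code(0x000C000D))  # (12, 0): the GNU as single-operand encoding
```

So the PSP-style word only ever populates the lower half, while a single operand assembled by GNU as only populates the upper half.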
What's most commonly seen in N64 games? (sorry originally said DS which is ARM of course.)
-[Unknown]
It appears that the consensus here is that the behaviour of break and syscall will stay as it is for compatibility reasons, so I will close this issue.
Fixes #122.