Closed by pure-water 3 years ago
Thanks for opening the issue! I'll keep this open until I clarify this in the docs and disassembler.
The Dt field (usually) contains two bits. The least-significant one, in bit 7, indicates the cache hint when set (described under "Register Cache" in the docs). I show this in the assembler (as noted) as $ (it's very common in disassembly, so I was avoiding a verbose .cache suffix in preference for the cash/cache pun). Since it's already causing confusion, I think I will change the default behaviour of the tools to use a .cache suffix (which the assembler can already accept, or maybe I should go with .reuse, which is used in this paper about a different GPU: https://arxiv.org/abs/1903.07486 ? Lots of naming decisions.)
(I don't think it's relevant to your question, but the most-significant bit of Dt (position 8) encodes the destination size, so the 32-bit per lane r1 gets the value 1, but the 16-bit per lane r1l would get the value 0.)
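To make the two bit positions concrete, here is a minimal sketch of reading them out of an encoded instruction. It assumes the instruction bytes are interpreted as a little-endian integer, which is consistent with the fmul encodings quoted in this thread. decode_dt and its dictionary keys are hypothetical names for illustration, not part of the tools.

```python
def decode_dt(encoded: bytes) -> dict:
    """Extract the two Dt bits described above (assumed little-endian)."""
    word = int.from_bytes(encoded, "little")
    return {
        "cache_hint": bool((word >> 7) & 1),  # bit 7: cache hint
        "dest_32bit": bool((word >> 8) & 1),  # bit 8: 1 = 32-bit, 0 = 16-bit
    }


# The two encodings quoted in this thread.
plain = bytes.fromhex("1a85440a0200")   # fmul r1, r2.neg, 0.5
cached = bytes.fromhex("9a85440a0200")  # fmul $r1, r2.neg, 0.5

print(decode_dt(plain))
print(decode_dt(cached))
```

Under that assumption, both encodings report a 32-bit destination and only the second one has the cache hint set.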
Hi, thanks for answering.
I actually figured out the "$" usage from your code, after 2 hours of following the code, before I read your post. It is here:
CACHE_HINT = '$'
def try_parse_register(s):
    flags = []
    if s.startswith(CACHE_HINT):
        s = s[1:]
        flags.append(CACHE_FLAG)
Yes, I agree with the change to use ".cache" directly, to be consistent with ".discard". That gives us a uniform operand cache hint scheme, and it will be more obvious than "$".
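For illustration, a parser that accepts both spellings of the hint could look roughly like the sketch below. This is not the assembler's actual code: split_cache_hint, CACHE_SUFFIX and the value of CACHE_FLAG are hypothetical, and the sketch only handles a ".cache" suffix at the very end of the operand.

```python
CACHE_HINT = "$"         # prefix spelling, as in the snippet quoted above
CACHE_SUFFIX = ".cache"  # suffix spelling (hypothetical constant name)
CACHE_FLAG = "cache"     # hypothetical flag value


def split_cache_hint(operand: str):
    """Return (bare_operand, flags) for either spelling of the cache hint."""
    flags = []
    if operand.startswith(CACHE_HINT):
        operand = operand[len(CACHE_HINT):]
        flags.append(CACHE_FLAG)
    if operand.endswith(CACHE_SUFFIX):
        operand = operand[:-len(CACHE_SUFFIX)]
        if CACHE_FLAG not in flags:
            flags.append(CACHE_FLAG)
    return operand, flags


print(split_cache_hint("$r1"))
print(split_cache_hint("r1.cache"))
print(split_cache_hint("r2.neg"))
```

Both "$r1" and "r1.cache" come back as the bare register with the same flag, so the rest of the assembler would not need to care which spelling was used.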
Hi, I was just wondering what the purpose of the Dt field is in the subscript of the operand. E.g.
D = ALUDST(Dx:D,Dt)
My guess was that it is just a hint showing the "endian encoding" of the operand and nothing else.
But when I experimented, it seemed otherwise:
python3 assemble.py 'fmul r1, r2.neg, 0.5'
encoding opcod3: b'\x1a\x85D\n\x02\x00'
1a85440a0200
fmul r1, r2.neg, 0.5
python3 assemble.py 'fmul $r1, r2.neg, 0.5'
encoding opcod3: b'\x9a\x85D\n\x02\x00'
9a85440a0200
fmul $r1, r2.neg, 0.5
What is the difference between $r1 and r1 as the destination register?
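One quick way to see exactly what the $ changes is to XOR the two encodings and list the bits that differ. This is only a sketch: it assumes the bytes form a little-endian integer, and differing_bits is a hypothetical helper, not something from the tools.

```python
def differing_bits(a: bytes, b: bytes) -> list:
    """Return the bit positions (little-endian) where two encodings differ."""
    diff = int.from_bytes(a, "little") ^ int.from_bytes(b, "little")
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


without_hint = bytes.fromhex("1a85440a0200")  # fmul r1, r2.neg, 0.5
with_hint = bytes.fromhex("9a85440a0200")     # fmul $r1, r2.neg, 0.5

print(differing_bits(without_hint, with_hint))  # [7]
```

Only one bit differs between the two instructions, so the $ toggles a single flag rather than selecting a different register.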