Open shingarov opened 6 months ago
@janvrany what happens to them on RISC-V when we encode split-fields, is it the "assembly source" or the "value" that gets into these set_asm
operand constraints?
It in most complicated cases it is both. For example, consider J-type instruction where 20bit displacement is encoded in 4 different fields and the encoded binary value is shifted one bit right (i.e, you do not encode last bit because it is always zero). The set_asm
statement looks like:
xxx.set_asm("xxx %gpr, %exp(bimm16)", rd, imm1+imm2+imm3+imm4)"
Here bimm16
modifier does the shifting and imm1+imm2+imm3+imm4
does the splitting
IIUC the issue in POWER can be solved by either:
set_asm
do something like set_asm("xxx %imm(ppcsprnum)", LR
%spr:10
something like %spr_lo:5 %spr_hi:5
and then in in set_asm
do something like `set_asm("xxx %imm", spr_hi+spr_lo")That being said, there's so many issues with POWER PDL that I'm disinclined to put much effort into "fixing" it. We should (re)start thinking of sourcing POWER model from authoritative sources, like we do for RISC-V.
issue in POWER can be solved by
Well, in practical terms (i.e. using these SPR numbers) it's not a real issue, because the ISA definition guarantees that mfspr
and mtspr
have extended mnemonics for every defined SPR. Now, SPR numbers are processor- / profile-specific, so realistically we are bound by #34.
What does feel scary to me, is that n = spr[5:9] || spr[0:4]
line in the Green Cloth. It suggests to me that we no longer have a clear concept of "instruction". Say, what is mflr
? Is it mfspr 8
, which is the undoubtedly the canonical assembly for it (and consistent with p.387); or is it mfspr 0x100
, which is "mfspr
with spr field at 0x100" (consistent with the pseudocode)? It looks like the definitions of assembly is still left completely informal.
We should (re)start thinking of sourcing POWER model from authoritative sources, like we do for RISC-V.
That's the thing. It appears there is no "authoritative source" for a lot of things. Even if we were to parse the BookMaster source of the instruction encoding tables and be extremely successful at it — so we would have "the authoritative instruction encoding" — we are still left with prose explanations of assembly, such as that "Compiler and Assembler Note" in the right margin of p.385.
The Sail definition of RISC-V seems no better, it's a function from a binary program to a behavior, really with no clear definiton of "instruction" as in something like instr = add | lw | sw | …
.
It suggests to me that we no longer have a clear concept of "instruction"
It was never "clear", there are multiple ways how to model instructions in ArchC language and always been. What is defined is assembly at one end and encoding at the other - everything in between can be done in multiple ways. Note also, that as code stands now it assumes there are some other concepts, like opcode field but really, there's no such thing.
The Sail definition of RISC-V seems no better
True, but in case of RISC-V those opcodes
files maintained by RISC-V foundation and "more authoritative" that if we just write ArchC PDL by hand. So by "authortitative" sources I meant really "more authoritative" than what we have now. Parsing BookMaster sources would give us that "more". All that w.r.t. encoding / decoding. Poor quality of POWER PDL is one thing that holds Tinyrossa's POWER backend back.
It was never "clear", there are multiple ways how to model instructions in ArchC language
That's not what I am talking about. The problem is not ArchC. The problem is the Green Cloth (and whatever other "authoritative sources" we have for any number of ISAs):
Parsing BookMaster sources would give us that "more"
Only in the limited example of instruction-encodings, but not for other aspects such as assembly. Case in point: that right margin of p.385, written in the kind of "prophetic English" which I was talking about in the Cambridge-2014 talk ("…don't step on the rake…boom!"). I would go as far as saying that the ArchC language is actually better than the Green Cloth even though it's wrong, because the Green Cloth is informal no matter how you parse it.
Now, I am not contesting that the current quality of the POWER ArchC PDL is unusable for what we are trying to do. What I am saying is that parsing Green Cloth will not help, no matter how sophisticated parsing technology.
On PowerPC, for compatibility with old POWER the encoding of the SPR number in
mtspr
,mfspr
is reversed, see Green Cloth p.385:This explanation, combined with the pseudocode:
and the diagram of the XFX-form on p.384, is extremely confusing.
Be it as it may, both
rs
/rt
andspr
operand modifiers in the PDL are simply%imm
and this obviously results in emitting the wrong code. The bug is especially vicious in the presence of extended mnemonics, because the speciallr
,ctr
,xer
have the reversed encoding baked in theset_asm
, e.g.LR=8
:because
00000 01000
⇔01000 00000
.