Closed asb closed 2 years ago
Arguably, sfence.vma operates on the reference (which can affect the data as a side-effect), whereas cbo.* operates on the data, so that would be a justification for why sfence.vma doesn't have parentheses.
Just a ping to see if anyone had thoughts on this one way or another?
I’ll support @jrtc27’s take. It’s a minor point either way, but there’s little reason to linger on this topic.
No argument from me that in a priority ordered list of RISC-V issues, this probably comes out pretty low! I'd just like to get the assembler support done without having to revisit it in the future.
Following this discussion, I've modified my LLVM MC layer zicbom+zicboz patch to use the (rs1)/0(rs1) format. Thanks!
There are two Binutils patches from January on the Binutils list, that use the plain format (cbo.clean a0
).
E.g. https://sourceware.org/pipermail/binutils/2022-January/119273.html
However, no patch has been merged so far.
I don't have any preference, except that I prefer a common behavior of Binutils/GCC and LLVM.
I have a little preference for old format (cbo.zero a0
) but it's not a strong one.
Anyway, we have two options:
I prefer old one because:
cbo.*
instruction just like regular memory instructions.But again, this preference is not so strong.
- Because offset must be zero, we cannot handle
cbo.*
instruction just like regular memory instructions.- Because of 2., we will not have big advantage over the old one (as https://github.com/riscv/riscv-CMOs/issues/47#issuecomment-1024144629)
This is no different from the AMOs, which use parentheses.
@jrtc27 OK, you are right (I completely forgot about that). And I also misinterpreted the intention of @aswaterman's comment.
Anyway, I have no strong opinion about this and... sooner is better.
For what it's worth, this is a perfect example of why having toolchain patches be a requirement for ratification is a really good idea..
GNU Binutils adopted textual format '(rs1)' and '0(rs1)'. Textual format 'rs1' is explicitly rejected. https://sourceware.org/pipermail/binutils/2022-March/120111.html https://sourceware.org/git/?p=binutils-gdb.git;a=commitdiff;h=41d6ac5da655a2e78109848f2db47e53552fd61a
+/* Zicbom and Zicboz instructions. */
+{"cbo.clean", 0, INSN_CLASS_ZICBOM, "0(s)", MATCH_CBO_CLEAN, MASK_CBO_CLEAN, match_opcode, 0 },
+{"cbo.flush", 0, INSN_CLASS_ZICBOM, "0(s)", MATCH_CBO_FLUSH, MASK_CBO_FLUSH, match_opcode, 0 },
+{"cbo.inval", 0, INSN_CLASS_ZICBOM, "0(s)", MATCH_CBO_INVAL, MASK_CBO_INVAL, match_opcode, 0 },
+{"cbo.zero", 0, INSN_CLASS_ZICBOZ, "0(s)", MATCH_CBO_ZERO, MASK_CBO_ZERO, match_opcode, 0 },
"0(s)"
means proposed textual format (rs1)
and 0(rs1)
(as opposed to original textual operands "s"
).
To whom it may concern, (Cc: @asb @aswaterman @jrtc27 @nelson-chu @kito-cheng @dkruckemyer-ventana)
It should be a good time to make a resolution. Although that one of my patchsets (with paren, textual form as proposed by @asb as a LLVM patch) is merged into GNU Binutils' master branch, it should not be too late to revert it. But if we (RISC-V CMO authors and GNU/LLVM toolchain developers) don't coordinate now, we will risk making incompatibilities.
I admit that GNU Binutils has an option to support both, but I think it causes problems on long-term (because one of textual forms must be an alias). So, I designed two patchsets reject another textual form.
If we decide to change textual form, I propose new document version v1.0.1 to keep extension version (Zicbom1p0 and Zicboz1p0) as the new document will be still binary-compatible.
I’ll just briefly note that I’m happy with any decision we make here, as long as we’re consistent across the toolchains.
So am I.
I'm happy with (rs1)
and 0(rs1)
.
Both (GNU/LLVM) toolchain developers agreed. Remaining task: ask RISC-V CMO TG? tech-cmo@lists.riscv.org for request to change?
@a4lg I have no problem with the change, though I wonder whether the format should instead be inst offset(rs1)? In this case, the descriptive text would be written so that anything other than offset=0 is UNSPECIFIED.
I'm trying to anticipate a future extension that might introduce a non-zero offset, which would probably require a different instruction encoding but would have the same instruction semantics.
@dkruckemyer-ventana
TL;DR: offset(rs1)
looks good (for other reason than you provided).
I checked AMO instructions but they have no mention about textual format. If we need to mention about that, we need to invent something new.
I noticed that we need to explain value of offset since this is valid on GNU Binutils (master):
cbo.zero 2*3-6(t0)
In above example, offset is zero. However, it's not just a plain constant. But it's a constant expression that happened to be zero. So, explaining constraints to offset looks good to me. Also, this is valid, too.
cbo.zero (t0)
How about this?
Mnemonic::
cbo.zero offset(_base_)
...
Optional *offset* must be evaluated to zero (if exists). Otherwise, such encoding is UNSPECIFIED.
I'm not a native English speaker so proofreading is welcome.
Also, I'm not sure whether the word UNSPECIFIED is appropriate here (if nonzero offset variant doesn't exist).
Looks like some support for these landed in binutils as 41d6ac5da65 ("RISC-V: Cache management instructions")
. It's not in the release, so it's not technically a 100% stable interface yet, but we generally try to avoid reverting things. Sounds like there's mostly consensus here, so I just wanted to poke folks and see what's up -- don't want to force things here (and I don't personally care at all about the syntax).
Just wanted to let everyone know I haven't forgotten about this. Will try to get to this in the next week or so.
How can I implement the textual format of cbo.* asm operands in the gcc part? Not sure which style should I follow, Any suggestions?
This may be long past caring for y'all, but at one point ( as in, original proposal) we discussed having addressing modes for some CBOs that were likely to be used inside loops that did other computation, not just loops that were used to perform a CMO on an entire memory region. And in such loops there are advantages to having addressing modes comparable to those of the regular load and store instructions. This may no longer be the case - which IMHO will probably be the kiss of death for using CBOs to do cache management as loops progress in numerical code, the overhead of extra address computation often outweighs the benefit unless you are lucky enough to get away without any offsets - but I think that this possibility shows the advantage of having a uniform addressing mode notation, CBO.CLEAN (a1), for consistency in case there is ever created a CBO.XXX offset(a1) for some future XXX.
Note: (a1) if the addressing mode is registering direct only with no offset, i.e. M[a1]
I think that 0(a1) implies incorrectly that there would be an offset, and I would only support that encasing on zero offset is ever supported.
@a4lg I took your suggestion and modified it a bit:
cbo.zero _offset_(_base_)
(asciidoc formatting)
...
A cbo.zero instruction performs stores of zeros to the full set of bytes corresponding to the cache block whose effective address is specified in rs1. The instruction encoding does not include an immediate field, so offset is optional and shall evaluate to zero if present. An implementation may or may not update the entire set of bytes atomically.
Sound OK?
If the offset field does not evaluate to zero, is that an assembler error? I haven't checked but I suspect the instruction definition requires the offset field to be zero (i.e. non-zero offset encodings are reserved for future potential use). In other words, the offset field is not architecturally defined to be ignored and is not currently allowed to have a non-zero value.
Yes, in the current implementation, a non-zero offset of a cbo.* instruction triggers the assembler error "Error: illegal operands".
This is also tested as part of Binutil's internal regression test suite. Here is the link for the valid forms (according to the current implementation): https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/testsuite/gas/riscv/zicboz.s. And here is a link for the invalid forms (according to the current implementation): https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gas/testsuite/gas/riscv/zicboz-fail.s.
As I'm thinking about this some more, it's not clear what requirement needs to be captured in the ISA specs. A non-zero offset cannot be encoded in the instruction, but is there any ISA requirement that the mnemonic map to a single instruction in the executable? Or is that a toolchain convention?
For example, why can't cbo.zero 32(x3)
map to
addi x3,x3,32
cbo.zero (x3)
addi x3,x3,-32
?
We have both, instruction mnemonics and pseudoinstructions that might expand to multiple instructions (e.g. call
). However, pseudo instructions are optional (they simplify programming).
So yes, we need to restrict the mnemonic to express a valid encoding of a single instruction.
The pseudoinstructions are defined in the RISC-V ASM manual (also part of the RISC-V unpriv spec): https://github.com/riscv-non-isa/riscv-asm-manual/blob/master/riscv-asm.md#-a-listing-of-standard-risc-v-pseudoinstructions.
Any objections to the following?
A cbo.zero instruction performs stores of zeros to the full set of bytes corresponding to the cache block whose effective address is the base address specified in rs1. The offset operand may be omitted; otherwise, any expression that computes the offset shall evaluate to zero. An implementation may or may not update the entire set of bytes atomically.
Q: is the silent store optimization allowed?
(i.e. CBO.ZERO Mike do nothing if it knows that the "current" copy of the cache line is already zero.)
this might already have been discussed while I wasn't paying attention
moot if RISC-V Permits the silent store optimization for any store,
Sent from eM Client | www.emclient.com https://www.emclient.com/get
------ Original Message ------ From "David Kruckemyer" @.> To "riscv/riscv-CMOs" @.> Cc "Andy Glew" @.>; "Comment" @.> Date 5/9/2022 15:51:44 Subject Re: [riscv/riscv-CMOs] Textual format for cbo.* asm operands - '(rs1)' and '0(rs1)' vs 'rs1' (Issue #47)
Any objections to the following?
A cbo.zero instruction performs stores of zeros to the full set of bytes corresponding to the cache block whose effective address is the base address specified in rs1. The offset operand may be omitted; otherwise, any expression that computes the offset shall evaluate to zero. An implementation may or may not update the entire set of bytes atomically.
— Reply to this email directly, view it on GitHub https://github.com/riscv/riscv-CMOs/issues/47#issuecomment-1121659982, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACOGLJM6AJHEDAU5PHKBGI3VJGJIBANCNFSM5MBCOMXQ. You are receiving this because you commented.Message ID: @.***>
@dkruckemyer-ventana I think your proposal looks good. I will submit version 2 of #49 (by rebasing) later.
I have the updates in a local repo, so no need to make any changes as long as you agree with the wording.
Please see #51 for the latest. I'll merge and generate a pdf if this looks OK.
Merged and created cmobase-v1.0.1.pdf
I'm about to post patches to enable MC layer support for Zicbo{m,z,p} in LLVM. One point that isn't clear is whether the cbo.* instructions should be written like e.g.
cbo.clean (a0)
/cbo.clean 0(a0)
, matching the A extension instructions that take a memory address in a register with no immediate offset (e.g.lr.w t0, (t1)
), or just plaincbo.clean a0
. As @aswaterman pointed out on the binutils list, there's also precedent for not having parentheses (sfence.vma).Possibly this is outside the scope of the CMO group and something that those of us in the LLVM community just need to sync with the binutils/GCC folks on. What do you think?
CC @kito-cheng who's also been helping coordinate aligment between LLVM and GCC/binutils.