brucehoult closed this 1 year ago
Hi Bruce,
Zcb has simple code-size saving instructions which are easy to implement on all CPUs.
is already there. A conscious choice was made to overlap Zcmt/Zcmp encodings so that they can't be part of the architecture profile.
It's reasonable to add a line of text to Zcmt/Zcmp saying that they are intended for small embedded CPUs only. However, that doesn't prevent others from implementing them in higher performance CPUs. The choice is there, it just won't comply with the architecture profile.
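To make the encoding overlap concrete, here is a minimal Python sketch. It assumes, as I understand the Zc specification, that Zcmp and Zcmt reuse the 16-bit encoding space of `c.fsdsp` (quadrant 2, funct3 = 0b101), which is why they cannot coexist with Zcd. The function name and its `zcd` parameter are hypothetical and not taken from any RISC-V tool.

```python
# Hypothetical helper for illustration only; not part of any RISC-V toolchain.
# Assumption: Zcmp/Zcmt occupy the c.fsdsp encoding space
# (quadrant 2, funct3 = 0b101), so one core cannot decode both.

def classify_c2_funct3_101(insn: int, zcd: bool) -> str:
    """Say who owns a 16-bit encoding, depending on whether Zcd is implemented."""
    insn &= 0xFFFF
    quadrant = insn & 0b11          # bits [1:0]
    funct3 = (insn >> 13) & 0b111   # bits [15:13]
    if quadrant != 0b10 or funct3 != 0b101:
        return "other"
    # Same bits, two possible meanings: the decoder has to pick one.
    return "c.fsdsp (Zcd)" if zcd else "Zcmp/Zcmt space"


if __name__ == "__main__":
    word = 0xA002  # funct3 = 0b101, quadrant = 0b10
    print(classify_c2_funct3_101(word, zcd=True))
    print(classify_c2_funct3_101(word, zcd=False))
```

A core that implements Zcd (as an RV64GC application core does) reads these bits as a compressed FP store, while a small core without an FPU can reuse them for push/pop and table jumps.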
The point is that, without the clarification, some people making high-performance cores are getting the false impression that they HAVE to implement Zcmt and Zcmp somehow (especially prior to RVA22 very recently entering the public comment phase). Other people writing software are concerned that the C instructions for FP load/store are being phased out, creating software incompatibility.
These concerns are, as we know, false, but there is nothing in the document preventing these false impressions.
See https://github.com/riscv/riscv-code-size-reduction/pull/205. I can't add you as a reviewer.
Since #205 was merged, this can be closed.
The spec needs some commentary on the class of implementations it is targeted at.
On the mailing list, Reddit, and Telegram groups, the majority of the messages I'm seeing are from people with concerns such as "How am I supposed to implement this in my 8-wide M1-class server processor? It's going to blow up my reorder buffer / TLB / exception handling" or "The re-purposing of compressed FP load/store encodings will cause incompatibility with existing RV64GC programs" or "JavaScript performance is heavily dependent on double precision FP and losing compressed load/store will adversely affect cache utilisation and performance".
My understanding has always been that this extension is targeted at embedded CPUs that are probably single-issue in-order and don't have an FPU or MMU in the first place.
But currently nothing at all is said about target market / implementations.
I asked on the mailing list and Krste replied:
I think that is specifically referring to Zcmp (expands to lots of µops) and Zcmt (uses MMU twice).
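As a rough illustration of the "lots of µops" point, the following Python sketch expands a single `cm.push` into the sequence of stores plus one stack-pointer adjustment that a core would have to issue. It is written from my reading of the Zcmp register-list encoding (rlist 4 = {ra}, 5 = {ra, s0}, ..., 14 = {ra, s0-s9}, 15 = {ra, s0-s11}) and should be checked against the spec. The function names are hypothetical, and the stack adjustment is simplified to the 16-byte-aligned base with no extra `spimm`.

```python
# Illustrative sketch only; names are hypothetical and details should be
# checked against the Zcmp specification.

def cm_push_reg_list(rlist: int) -> list[str]:
    """Registers saved by cm.push for a given rlist field (assumed encoding)."""
    if not 4 <= rlist <= 15:
        raise ValueError("rlist must be in 4..15")
    # rlist 15 jumps from s0-s9 straight to s0-s11.
    n_saved = 12 if rlist == 15 else rlist - 4
    return ["ra"] + [f"s{i}" for i in range(n_saved)]


def cm_push_uops(rlist: int, xlen_bytes: int = 8) -> list[str]:
    """Expand cm.push into one store per register plus an sp adjustment."""
    regs = cm_push_reg_list(rlist)
    store = "sd" if xlen_bytes == 8 else "sw"
    # Highest-numbered register goes just below the incoming sp.
    uops = [
        f"{store} {reg}, {-(i + 1) * xlen_bytes}(sp)"
        for i, reg in enumerate(reversed(regs))
    ]
    # Simplification: base adjustment rounded up to 16 bytes, spimm = 0.
    stack_adj = -(-len(regs) * xlen_bytes // 16) * 16
    uops.append(f"addi sp, sp, -{stack_adj}")
    return uops


if __name__ == "__main__":
    for uop in cm_push_uops(15):
        print(uop)
```

Under these assumptions, the largest register list turns one 16-bit instruction into 13 stores and an `addi`, which is cheap to sequence on a single-issue in-order core but awkward for a wide out-of-order one.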
Zcb doesn't seem problematic and in fact seems as if it would be useful everywhere, but I'm unsure of the intentions for it re RVA.