riscv / riscv-bitmanip

Working draft of the proposed RISC-V Bitmanipulation extension
https://jira.riscv.org/browse/RVG-122
Creative Commons Attribution 4.0 International

cmov #185

Open ryao opened 1 year ago

ryao commented 1 year ago

I saw cmov in the 0.90 draft, but it is absent from recent versions. I read a while ago that it was considered unnecessary, since a decoder can implement it by recognizing a conditional jump over a single mov to the instruction that follows. This is very unkind to people reviewing compiler output.

If we do not get an actual cmov instruction, could we at least get a pseudo-instruction that the assembler translates into a conditional jump and a mov, to make compiler-generated assembly easier for humans to review? I imagine the pseudo-instruction would need to be like an x86 cmov rather than the cmov in the draft, to avoid forcing assemblers to do register allocation.

Recently, I have been micro-optimizing a binary search to operate on arrays of 4 KB or smaller. Using predication for unpredictable branches is very important for binary search performance. On both x86 and ARM, it is easy to spot where the actual branches are, because cmov/csel removes the jump instructions from the output when predication is used. On RISC-V, LLVM emits a conditional jump over a single mov to the instruction that follows, presumably to take advantage of the predication support in instruction decoders. However, this makes spotting the remaining branches more difficult.
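For readers unfamiliar with the pattern being discussed, here is a minimal sketch of a binary search whose only data-dependent decision is a select. It is not the code from this issue; the function name and element type are assumptions made for illustration. Whether a given compiler turns the select into cmov, csel, or a branch over a move depends on the target and optimization settings.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first element >= key in sorted arr[0..n),
 * or n if every element is smaller (a lower-bound search).
 * Hypothetical helper written for illustration only. */
size_t lower_bound_branchless(const uint32_t *arr, size_t n, uint32_t key)
{
    const uint32_t *base = arr;
    size_t len = n;

    while (len > 1) {
        size_t half = len / 2;
        /* Data-dependent select. Compilers often lower this to cmov on
         * x86 or csel on AArch64; on RISC-V it may instead appear as a
         * short forward branch over a single mv. */
        base = (base[half - 1] < key) ? base + half : base;
        len -= half;
    }
    /* At most one candidate is left: step past it if it is too small. */
    return (size_t)(base - arr) + (len == 1 && base[0] < key);
}
```

The loop branch depends only on the array length, so it is predictable; the comparison against `key` is the part that benefits from predication.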

topperc commented 1 year ago

There might be a bit of a combinatorial explosion of pseudoinstructions.

6 32-bit branch instructions + 2 16-bit compressed branch instructions. Plus the bgt(u)/ble(u) aliases. Plus the aliases for one of the operands being x0. 1 32-bit reg to reg mv encoding, 1 16-bit reg to reg mv encoding. Do we include the 2 immediate to reg mv encodings?

The sifive-7-series can predicate most ALU instructions; LLVM knows this for some instructions and will use it.
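One rough way to tally the combinations, sketched under the assumption that every branch form would pair with every move form. The grouping into arrays and the function names are illustrative; the operand-swapping aliases are taken to be bgt, bgtu, ble and bleu, and the immediate moves to be li and c.li.

```c
#include <stddef.h>

/* Branch and move mnemonics, grouped as in the comment above. */
static const char *const BASE_BRANCHES[]       = {"beq", "bne", "blt", "bge", "bltu", "bgeu"};
static const char *const COMPRESSED_BRANCHES[] = {"c.beqz", "c.bnez"};
static const char *const SWAPPED_ALIASES[]     = {"bgt", "bgtu", "ble", "bleu"};
static const char *const ZERO_ALIASES[]        = {"beqz", "bnez", "blez", "bgez", "bltz", "bgtz"};
static const char *const REG_MOVES[]           = {"mv", "c.mv"};
static const char *const IMM_MOVES[]           = {"li", "c.li"};

#define COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Number of distinct branch forms a pseudo-instruction might wrap. */
size_t branch_forms(void)
{
    return COUNT(BASE_BRANCHES) + COUNT(COMPRESSED_BRANCHES) +
           COUNT(SWAPPED_ALIASES) + COUNT(ZERO_ALIASES);
}

/* Assumption: one pseudo-instruction per (branch form, move form) pair. */
size_t pseudo_count(int include_immediate_moves)
{
    size_t moves = COUNT(REG_MOVES);
    if (include_immediate_moves)
        moves += COUNT(IMM_MOVES);
    return branch_forms() * moves;
}
```

Under these assumptions there are 18 branch forms, giving 36 combinations with register moves only and 72 once the immediate moves are included.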

ryao commented 1 year ago

> There might be a bit of a combinatorial explosion of pseudoinstructions.
>
> 6 32-bit branch instructions + 2 16-bit compressed branch instructions. Plus the bgt(u)/ble(u) aliases. Plus the aliases for one of the operands being x0. 1 32-bit reg to reg mv encoding, 1 16-bit reg to reg mv encoding. Do we include the 2 immediate to reg mv encodings?
>
> The sifive-7-series can predicate most ALU instructions; LLVM knows this for some instructions and will use it.

ARM’s conditional instructions can take a cond argument that can be any of EQ, NE, CS|HS, CC|LO, MI, PL, VS, VC, HI, LS, GE, LT, GT, LE, AL or NV. Perhaps the same trick could be used to avoid the combinatorial explosion of pseudo-instructions.
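As a sketch of that idea, a single conditional move parameterized by a condition code could be modeled as below. The enum, the function names, and the operand order are all hypothetical; the conditions simply mirror the six base RISC-V branch comparisons, and the destination doubles as a source in the x86 style suggested earlier in the thread.

```c
#include <stdint.h>

/* Hypothetical condition codes, one per base RISC-V branch comparison. */
enum cond { COND_EQ, COND_NE, COND_LT, COND_GE, COND_LTU, COND_GEU };

/* Evaluates the comparison a <cond> b on 64-bit register values. */
static int cond_holds(enum cond c, uint64_t a, uint64_t b)
{
    switch (c) {
    case COND_EQ:  return a == b;
    case COND_NE:  return a != b;
    case COND_LT:  return (int64_t)a <  (int64_t)b;  /* signed */
    case COND_GE:  return (int64_t)a >= (int64_t)b;  /* signed */
    case COND_LTU: return a <  b;                    /* unsigned */
    case COND_GEU: return a >= b;                    /* unsigned */
    }
    return 0;
}

/* Models a hypothetical x86-style "cmov <cond>, a, b, rd, rs":
 * rd keeps its old value unless the condition holds. */
uint64_t cmov_cond(enum cond c, uint64_t a, uint64_t b,
                   uint64_t rd, uint64_t rs)
{
    return cond_holds(c, a, b) ? rs : rd;
}
```

With the condition as an operand, one mnemonic covers all six comparisons, so the number of pseudo-instructions grows with the number of move forms rather than with the product of branch and move forms.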

As for the immediate to reg mov encodings, I have two thoughts. As a developer, I would prefer to have them included in the set of pseudo-instructions so that compilers generate output that is easier to review. However, if compilers would rarely use them, we could get away without them.

Findecanor commented 1 year ago

My guess is that cmov got dropped because it had four operands.

There are two crude conditional-set-zero instructions in the Zicond extension, which is about to be ratified. A single conditional move takes three instructions, however, with one temporary.
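A minimal C model of that three-instruction sequence, assuming the Zicond semantics where czero.eqz yields zero when the condition register is zero and czero.nez yields zero when it is nonzero. The helper and function names are illustrative, and the assembly in the comments is a sketch rather than compiler output.

```c
#include <stdint.h>

/* czero.eqz rd, rs1, rs2: rd = (rs2 == 0) ? 0 : rs1 */
static uint64_t czero_eqz(uint64_t rs1, uint64_t rs2)
{
    return rs2 == 0 ? 0 : rs1;
}

/* czero.nez rd, rs1, rs2: rd = (rs2 != 0) ? 0 : rs1 */
static uint64_t czero_nez(uint64_t rs1, uint64_t rs2)
{
    return rs2 != 0 ? 0 : rs1;
}

/* rd = cond ? a : b, built from three operations and one temporary. */
uint64_t select_zicond(uint64_t cond, uint64_t a, uint64_t b)
{
    uint64_t tmp = czero_eqz(a, cond);  /* czero.eqz tmp, a, cond */
    uint64_t rd  = czero_nez(b, cond);  /* czero.nez rd,  b, cond */
    return rd | tmp;                    /* or        rd,  rd, tmp */
}
```

Exactly one of the two intermediate values is zero, so the final or picks out the selected operand.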

Conditional add/subtract/and/or/xor are only two instructions though. Many times an expression with a cmov in it is actually expressing conditional arithmetic, so it would be worthwhile to detect that and use the simpler instruction sequence whenever possible.
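For comparison, a conditional add can be modeled in two operations, again assuming the Zicond czero.eqz semantics (names here are illustrative). The trick relies on zero being the identity for the operation, which holds for add, subtract, or and xor; a conditional and needs different handling that this sketch does not cover.

```c
#include <stdint.h>

/* czero.eqz rd, rs1, rs2: rd = (rs2 == 0) ? 0 : rs1 */
static uint64_t czero_eqz(uint64_t rs1, uint64_t rs2)
{
    return rs2 == 0 ? 0 : rs1;
}

/* rd = cond ? a + b : a, with no move and no branch. */
uint64_t add_if(uint64_t cond, uint64_t a, uint64_t b)
{
    uint64_t masked = czero_eqz(b, cond);  /* czero.eqz tmp, b, cond */
    return a + masked;                     /* add       rd,  a, tmp  */
}
```

A source expression such as `x = cond ? x + y : x` is the kind of conditional arithmetic that could be matched to this shorter sequence instead of a full three-instruction select.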