bytecodealliance / wasmtime

A fast and secure runtime for WebAssembly
https://wasmtime.dev/
Apache License 2.0
15.41k stars 1.3k forks source link

Cranelift: simplify opcode set by removing `_imm` variants #3250

Open cfallin opened 3 years ago

cfallin commented 3 years ago

In various discussions, we have come to the conclusion that "combo ops" generally cost more than they are worth. When one CLIF opcode simply expresses the combination of two other opcodes, it (i) expands the set of opcodes that all consumers of CLIF must handle, but (ii) adds minimal value, because one can pattern-match if one needs to handle the combination specially.

So far we have not really discussed the op_imm opcodes (e.g., iadd_imm and isub_imm) in this context. They are currently converted in the "simple legalization" pass used by new backends into iconst + op, so the backends do not need to actually handle them; but this separate pass is awkward and shouldn't be necessary.

Instead, it might be better to remove the combo opcodes, but provide backward compatibility (and convenience) to producers by adding combination methods to the instruction builder that generate the two opcodes. So InstBuilder::iadd_imm would generate an iconst and an iadd, for example.

This would require some work in the meta crate but is probably feasible. The main downside is that the CLIF becomes slightly more inflated earlier in the pipeline, but we expand it before lowering anyway, so it may actually be better to generate it in the final form and avoid the edit.

cc @abrown @afonso360 @bjorn3 from earlier discussion

afonso360 commented 3 years ago

One of the arguments that @bjorn3 mentioned is that, the _imm forms are much more readable in IR textual form. I agree with him.

However, I also think that if we do split the op with the builder, the readability impact is somewhat minimized since the const is directly above the op, and the reader doesn't have to look too far for the value.

E.g:

v123 = iconst.i32 0xFFFF_0000
v124 = iadd.i32 v0, v123

Another thing to consider is that we use a iadd_imm like symbol to denote offsets in global values. This may become confusing if iadd_imm is no longer an instruction.

See: https://github.com/bytecodealliance/wasmtime/blob/0771abf2103f9c5c07c7b7892b555c8ad1165a43/cranelift/filetests/filetests/runtests/heap.clif#L156-L159

bjorn3 commented 3 years ago

However, I also think that if we do split the op with the builder, the readability impact is somewhat minimized since the const is directly above the op, and the reader doesn't have to look too far for the value.

This likely won't be true after optimizations like GVN.

fitzgen commented 3 years ago

I like this proposal.

I generally like the idea of having clif be very riscy and full of micro-ops and then letting isel lowering choose the appropriate macro-op for the target arch. This change seems inline with that.

Another thing to consider is that we use a iadd_imm like symbol to denote offsets in global values. This may become confusing if iadd_imm is no longer an instruction.

Good catch. If we removed _imm-suffixed instructions, it would probably make sense to rename this from iadd_imm to offset_of_global or something.

However, I also think that if we do split the op with the builder, the readability impact is somewhat minimized since the const is directly above the op, and the reader doesn't have to look too far for the value.

Or we could go the other direction and allow every operand to every instruction to be either an ssa value or an inline constant...

cfallin commented 3 years ago

Or we could go the other direction and allow every operand to every instruction to be either an ssa value or an inline constant...

Very out-of-the-box and I like it! I think I've seen a JIT engine that worked like this (I forget where?). It would be a pretty deep change throughout the compiler -- everything operates on Values now -- but something we could consider if we ever have a wider design-reconsideration phase for CLIF.

sunfishcode commented 3 years ago

One option would be to keep using Value, and add an Immediate arm to ValueDef.

sunfishcode commented 3 years ago

Two random thoughts:

One of the original motivations for the _imm instructions was compile time -- doing a very simple pattern-match early on in the compiler reduces the number of instructions that anything else in the compiler that visits every instruction has to visit.

If _imm instructions are removed, one option to recover the readability would be to introduce infix notation in the clif syntax for single-use arithmetic and constant instructions. v4 = v3 * v2 + 5 is much easier to read at a glance than

   v6 = imul v3, v2
   v4 = iadd_imm, v6, 5

and one can immediately tell the multiply is single-use, without scanning the rest of the function.