bytecodealliance / wasmtime

A fast and secure runtime for WebAssembly
https://wasmtime.dev/
Apache License 2.0
15.09k stars 1.26k forks source link

`Inst` abstraction in x64 backend #2252

Open abrown opened 3 years ago

abrown commented 3 years ago

As mentioned by @cfallin in https://github.com/bytecodealliance/wasmtime/pull/2248#discussion_r497853577, there is a potential abstraction leak re: Inst. The way I see it, Inst was originally designed to have variants that abstracted common classes of instructions (e.g. div) but eventually grew variants that more closely matched x64 encoding formats (e.g. unary_rm_r). Now Inst contains both kinds--abstract instructions and encoding formats--and this causes confusion (e.g. which format am I supposed to use here? Or is there an instruction that covers this?) and potential bugs (e.g. handing an opcode to a format that should never encode that opcode as @bnjbvr has pointed out). This problem will only be exacerbated by adding other types of encoding formats like VEX and EVEX.

I see several possible directions:

Thoughts?

abrown commented 3 years ago

cc: @jlb6740, @julian-seward1

bnjbvr commented 3 years ago

Yeah, I think a refactoring there is overdue. In my opinion, the way forward would be to have different Inst each time we need to, that is when:

If an Vcode Inst doesn't have meta semantics, and there's another inst with the same format and input/output register classes, then a new Inst is not required.

Re: naming. I followed the initial choices from @julian-seward1 to have the Inst be named after their operands format, but we should rethink about this, and follow aarch64's example here of naming the instructions after what they do, not how they're defined at a low-level.

Re: the possibility of creating impossible inst: the SseOpcode enum is useful but too wide; instead we should have a subset of SseOpcodes for each Inst ctor, so that it becomes a compile error instead of a dynamic panic (or worse, a silent runtime error) to use the wrong opcode for a given inst. We can either have different enums that convert down to SseOpcode, or just one enum per Inst that's handled down to the bottom. I think the latter is preferable, since having enums that convert to SseOpcode would still make it possible to miss cases when emitting the machine instructions.

Re: ergonomics. We should go down the road of higher level assembly instructions, like it's been done for Inst::load/store etc, and use this pattern more. Maybe not all the time, but it's immensely useful to do so when the instruction is being constructed many times in many places, in my opinion.

bjorn3 commented 3 years ago

Was this done?

cfallin commented 3 years ago

I think we still could potentially clean up / orthogonalize things -- we haven't directly addressed this yet, AFAIK.