FEX-Emu / FEX

A fast usermode x86 and x86-64 emulator for Arm64 Linux
https://fex-emu.com
MIT License

Generate SVE for 80bit stores when possible #4166

Open pmatos opened 6 days ago

pmatos commented 6 days ago

Fixes #4126

pmatos commented 6 days ago

Probably not a big win in practice. Previously, a single 80-bit store took two store instructions (64-bit + 16-bit). With this change it takes three instructions: mov + whilelt + st1b. You need three 80-bit stores in a block to reach a draw instruction-wise. As a follow-up, I will avoid assembling the predicate register for every store, but in practice we'll still need at least three stores per block for it to "win" instruction-wise.
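To make the instruction-count argument concrete, here is a rough sketch of the arithmetic. It is not FEX code: the function names are hypothetical, and the model assumes that every instruction counts equally, that the old path costs two stores per 80-bit store, and that the predicate setup (mov + whilelt) is either repeated for every store or emitted once per block.

```python
# Back-of-the-envelope instruction counts for N 80-bit stores in one block.
# All names here are hypothetical; this models counts only, not latency.

def scalar_cost(n_stores: int) -> int:
    """Old path: one 64-bit store plus one 16-bit store per 80-bit store."""
    return 2 * n_stores

def sve_cost_rebuilt(n_stores: int) -> int:
    """SVE path, predicate rebuilt every time: mov + whilelt + st1b per store."""
    return 3 * n_stores

def sve_cost_hoisted(n_stores: int) -> int:
    """SVE path, assuming mov + whilelt are emitted once per block,
    followed by one st1b per store."""
    if n_stores == 0:
        return 0
    return 2 + n_stores

if __name__ == "__main__":
    print("stores  scalar  sve(rebuilt)  sve(hoisted)")
    for n in range(1, 5):
        print(f"{n:6d}  {scalar_cost(n):6d}  {sve_cost_rebuilt(n):12d}  {sve_cost_hoisted(n):12d}")
```

Under these assumptions, the rebuilt-predicate variant never uses fewer instructions than the scalar pair, while the hoisted variant ties at two stores and is strictly ahead from three stores per block onward. Real performance also depends on factors this count ignores, such as instruction latency and register pressure.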

pmatos commented 6 days ago

Converting to draft, so it's not merged by mistake.