ptitSeb / box64

Box64 - Linux Userspace x86_64 Emulator with a twist, targeted at ARM64 Linux devices
https://box86.org
MIT License

[RV64_DYNAREC] Added 66 0F D4 PADDQ opcode for vector and fixes SEW cache transform #1812

Closed by ksco 2 months ago

ptitSeb commented 2 months ago

Why does the ordering of the sewTransform matter?

ksco commented 2 months ago

Because it changes the effective runtime SEW of the vector unit, but does not change our compile-time tracking of SEW for the current instruction.

So for example, if the current compile-time SEW is 64 and the runtime SEW is changed to 32 in the SEW transformation, any VLE instruction emitted in the FPU transformation will be a VLE64, because the compile-time SEW is still 64. That instruction will then SIGILL because the runtime SEW has already changed to 32. (It may seem strange, but VLE64 requires the current SEW to be 64.)
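The scenario described above can be sketched with a toy model. This is not box64 code: `ToyState`, `toy_sew_transform`, `toy_fpu_transform` and `toy_run` are hypothetical names, and the fault rule (a load whose width differs from the runtime SEW is illegal) simply encodes the claim made in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, none is box64 API. */
typedef struct {
    int  runtime_sew;  /* SEW the simulated vector unit currently has   */
    int  compile_sew;  /* SEW the code generator believes is active     */
    bool sigill;       /* set once an "illegal" instruction is executed */
} ToyState;

/* Models the SEW transform: it switches the runtime SEW but leaves the
 * compile-time tracking untouched. */
static void toy_sew_transform(ToyState *s, int target_sew) {
    s->runtime_sew = target_sew;
}

/* Models the FPU cache transform emitting one vector load. The load width
 * is chosen from the compile-time SEW; as assumed from the thread, the
 * load faults when that width differs from the runtime SEW. */
static void toy_fpu_transform(ToyState *s) {
    int load_width = s->compile_sew; /* e.g. VLE64 when compile_sew == 64 */
    if (load_width != s->runtime_sew)
        s->sigill = true;
}

/* Runs both transforms in the requested order and reports whether the
 * simulated run would have hit SIGILL. */
bool toy_run(bool sew_first, int current_sew, int target_sew) {
    assert(current_sew > 0 && target_sew > 0);
    ToyState s = { current_sew, current_sew, false };
    if (sew_first) {
        toy_sew_transform(&s, target_sew);
        toy_fpu_transform(&s);
    } else {
        toy_fpu_transform(&s);
        toy_sew_transform(&s, target_sew);
    }
    return s.sigill;
}
```

In this model, `toy_run(true, 64, 32)` reports a fault, while `toy_run(false, 64, 32)` does not, because the load is emitted while the runtime SEW still matches the compile-time one.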

ptitSeb commented 2 months ago

OK, but what happens if the SEW needs to change in fpuTransform but sewTransform is not needed?

ksco commented 2 months ago

The fpuTransform is, and always will be, SEW-agnostic, meaning any valid SEW is okay for it, so there will never be any SEW changes in fpuTransform.

ptitSeb commented 2 months ago

This seems a bit odd to me, but that's OK. A SIGILL is easy enough to catch and diagnose if something goes wrong later.

ksco commented 2 months ago

Things in the fpuCacheTransform are basically moves, loads and stores, so it’s natural that it’s not bound to a specific SEW, which is also good for performance.

ptitSeb commented 2 months ago

And yet it still needs sewTransform to be done after and not before?

ksco commented 2 months ago

I guess we can “fix” this by actually changing the compile-time SEW in sewTransform, in and only in pass3.
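A minimal sketch of that suggestion, again as a toy model rather than box64 code. `ToySew`, `toy_sew_transform_synced`, `toy_load_width` and `toy_load_faults` are hypothetical names, and the pass numbering (0 to 3) is an assumption. The thread does not spell out why the update would be limited to pass3, so the sketch only encodes the condition as stated.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: every name here is hypothetical, none is box64 API. */
typedef struct {
    int runtime_sew;  /* SEW the emitted code would leave the unit in */
    int compile_sew;  /* SEW the code generator believes is active    */
} ToySew;

/* Hypothetical variant of the SEW transform: it always switches the
 * runtime SEW, and in pass 3 only it also updates the compile-time
 * tracking so that later emitters see the new value. */
void toy_sew_transform_synced(ToySew *s, int target_sew, int pass) {
    assert(pass >= 0 && pass <= 3);
    s->runtime_sew = target_sew;
    if (pass == 3)
        s->compile_sew = target_sew;
}

/* Width of the load a later transform would emit (taken from the
 * compile-time SEW). */
int toy_load_width(const ToySew *s) {
    return s->compile_sew;
}

/* True if that load would not match the runtime SEW, which this model
 * treats as a fault. */
bool toy_load_faults(const ToySew *s) {
    return toy_load_width(s) != s->runtime_sew;
}
```

With the tracking kept in sync in pass 3, a load emitted after the SEW transform would pick the new width, so the ordering of the two transforms would no longer matter in this model.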

ptitSeb commented 2 months ago

That's not really my point. I'm just worried that this "need" is just the manifestation of an issue where some needed SEW transform is missed...

ksco commented 2 months ago

ah ok, got it.