ptitSeb / box64

Box64 - Linux Userspace x86_64 Emulator with a twist, targeted at ARM64 Linux devices
https://box86.org
MIT License
3.9k stars, 287 forks

Is there a plan to introduce a tracing jit compiler to optimize the jit code? #1358

Open jhe33 opened 8 months ago

jhe33 commented 8 months ago

Hi ptitSeb,

Does box64 plan to develop a complete tracing JIT compiler, so that it can eliminate the redundancy introduced by x86 code generation?

The current implementation could improve the code layout, but a tracing JIT compiler could focus compilation on hot code, for example as single-entry, multiple-exit traces. Optimization passes such as constant folding, constant propagation, and CSE could then be added easily to eliminate redundancy during code generation. Thank you.
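To make the kind of pass mentioned above concrete, here is a minimal sketch of constant folding over a straight-line trace. The IR (`Insn`, `OP_CONST`, `OP_ADD`, `OP_LOAD`) and the function `fold_constants` are hypothetical names invented for this illustration; none of them come from Box64. Because a single-entry trace has no merge points, one forward scan is enough to track which virtual registers hold known constants.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical three-address IR for a straight-line trace (not Box64's). */
typedef enum { OP_CONST, OP_ADD, OP_LOAD } Op;

typedef struct {
    Op      op;
    int     dst;   /* destination virtual register */
    int     a, b;  /* sources: OP_ADD uses a and b, OP_LOAD uses a as address */
    int64_t imm;   /* literal value for OP_CONST */
} Insn;

enum { NUM_VREGS = 16 };

/* Rewrite every OP_ADD whose two inputs are known constants into an
 * OP_CONST, in place. Returns the number of instructions folded. */
size_t fold_constants(Insn *trace, size_t n)
{
    bool    known[NUM_VREGS] = { false };
    int64_t value[NUM_VREGS] = { 0 };
    size_t  folded = 0;

    for (size_t i = 0; i < n; i++) {
        Insn *in = &trace[i];

        if (in->op == OP_ADD && known[in->a] && known[in->b]) {
            /* Add as unsigned so wraparound is well defined in C. */
            in->imm = (int64_t)((uint64_t)value[in->a] + (uint64_t)value[in->b]);
            in->op  = OP_CONST;
            folded++;
        }

        if (in->op == OP_CONST) {
            known[in->dst] = true;
            value[in->dst] = in->imm;
        } else {
            /* Result depends on run-time data: forget any old constant. */
            known[in->dst] = false;
        }
    }
    return folded;
}
```

For a trace like `v0 = 2; v1 = 3; v2 = v0 + v1; v3 = load [v2]`, the add becomes `v2 = 5` while the load is left alone.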

Jie He

ksco commented 8 months ago

If the optimizer rearranges all of the emitted code, accurate signal handling becomes difficult: how do we know the exact x86 register state in the guest sighandler when a signal is delivered?

But to my understanding, a peephole optimizer designed with accurate signal handling in mind is possible.
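As a rough illustration of why code motion complicates this, the sketch below recovers the guest instruction address from the host offset at which a signal arrived. It assumes one table entry per translated x86 instruction, sorted by host offset; `PcMapEntry` and `guest_ip_for_host_off` are hypothetical names for this example and do not describe how Box64 actually does it. The lookup is only valid while the host code for each guest instruction stays contiguous and in order, which is the property an aggressive optimizer would break.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: one entry per translated guest instruction. */
typedef struct {
    uint32_t host_off;  /* offset of its first host instruction in the block */
    uint64_t guest_ip;  /* address of the x86 instruction it was built from */
} PcMapEntry;

/* Return the guest address of the last entry whose host_off is <= the
 * faulting offset. Entries must be sorted by ascending host_off.
 * Returns 0 if the table is empty or the offset precedes the first entry. */
uint64_t guest_ip_for_host_off(const PcMapEntry *map, size_t n, uint32_t host_off)
{
    size_t lo = 0, hi = n;  /* find the first entry with host_off > target */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (map[mid].host_off <= host_off)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo == 0 ? 0 : map[lo - 1].guest_ip;
}
```

Once instructions from different guest addresses are merged, reordered, or deleted, a single host offset no longer corresponds to one well-defined guest state, so a table like this would need extra recovery information per entry.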

jhe33 commented 8 months ago

> If the optimizer rearranges all of the emitted code, accurate signal handling becomes difficult: how do we know the exact x86 register state in the guest sighandler when a signal is delivered?
>
> But to my understanding, a peephole optimizer designed with accurate signal handling in mind is possible.

I think there is a trade-off here: in cases that don't need precise signal handling, users could get a performance improvement from compiler optimizations; otherwise, they can disable the optimizations.

xctan commented 8 months ago

The x86 cache behavior is another tricky thing. Box64 is designed to correctly handle a guest program that itself runs a JIT inside Box64's JIT. x86 hardware automatically invalidates the (instruction) cache line when the corresponding memory location is written to, without requiring an explicit flush or fence instruction, and Box64 tries to emulate this. Therefore, Box64 write-protects the x86 code memory pages when it generates its JIT code, so a SIGSEGV is triggered when the x86 code mutates. Box64 intercepts this signal and marks the JIT cache for the entire memory page where the fault occurred as dirty. The current practice of generating one JIT code block for one contiguous x86 code block leads to a simpler integrity check.
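The protect-and-fault mechanism described above can be sketched in a few lines of C on Linux. This is a single-page, single-threaded toy under stated assumptions, not Box64's implementation: `watch_code_page`, `page_is_dirty`, and the globals are hypothetical names, and a real emulator would track many pages, re-protect a page after retranslating it, and deal with threads.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* for MAP_ANONYMOUS under strict -std modes */
#endif
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

static uint8_t *g_page;                /* the watched "guest code" page */
static size_t   g_page_size;
static volatile sig_atomic_t g_dirty;  /* set when translations are stale */

/* SIGSEGV handler: a write to the watched page marks it dirty and lifts
 * the protection, so the faulting store is retried and succeeds. */
static void on_segv(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)ctx;
    uint8_t *addr = (uint8_t *)info->si_addr;

    if (g_page && addr >= g_page && addr < g_page + g_page_size) {
        g_dirty = 1;
        /* mprotect is a thin syscall wrapper on Linux; POSIX does not
         * list it as async-signal-safe, so treat this as an assumption. */
        mprotect(g_page, g_page_size, PROT_READ | PROT_WRITE);
        return;
    }
    /* Not our page: restore the default action so the retry crashes. */
    signal(SIGSEGV, SIG_DFL);
}

/* Map one page, fill it with pretend guest code, then write-protect it
 * as if a translation had just been generated from it. */
uint8_t *watch_code_page(void)
{
    struct sigaction sa = { 0 };
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0)
        return NULL;

    g_page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, g_page_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    g_page = p;
    g_page[0] = 0x90;   /* x86 NOP, standing in for guest code */
    g_dirty = 0;
    if (mprotect(g_page, g_page_size, PROT_READ) != 0)
        return NULL;
    return g_page;
}

int page_is_dirty(void)
{
    return g_dirty;
}
```

After `watch_code_page()` returns, reads of the page proceed normally. The first store to it faults, the handler sets the dirty flag and unprotects the page, and the store then completes, which mirrors the "mark the whole page dirty" behavior described in the comment.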

jhe33 commented 8 months ago

> The x86 cache behavior is another tricky thing. Box64 is designed to correctly handle a guest program that itself runs a JIT inside Box64's JIT. x86 hardware automatically invalidates the (instruction) cache line when the corresponding memory location is written to, without requiring an explicit flush or fence instruction, and Box64 tries to emulate this. Therefore, Box64 write-protects the x86 code memory pages when it generates its JIT code, so a SIGSEGV is triggered when the x86 code mutates. Box64 intercepts this signal and marks the JIT cache for the entire memory page where the fault occurred as dirty. The current practice of generating one JIT code block for one contiguous x86 code block leads to a simpler integrity check.

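I think this is a common technique for self-modifying code (SMC). Most dynamic binary translation (DBT) solutions also use it to handle SMC unless the hardware provides some support, e.g. an MMU that could notify the binary translator when a write through the data cache touches code held in the instruction cache.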