Open roybaer opened 3 years ago
Interesting read.
In the meantime I have been able to rebase HQEMU's LLVM patches onto more recent LLVM versions with some manual intervention. The full LLVM build process succeeds for the patched versions 7, 9, 10 and 11, while version 8 fails to build for unrelated reasons. A successful build obviously does not mean that it still works, but I cannot really test it right now, because I do not have the relevant AArch64 hardware handy.
When it comes to HQEMU's additions and modifications to QEMU, it is probably easier to manually reapply them to a new QEMU.
Could you please try to apply the qemu changes onto our qemu?
I can try, but it's going to take a while. Right now, HQEMU does not even compile with the updated patched LLVM because of API changes. Even if we rely on LLVM 6 only, the changes to the QEMU code base still amount to 2454 insertions and 331 deletions, not counting newly added files. We'd have to see how much QEMU has changed from version 2.5 to version 5.
One conceptual problem with optimizing the generated ARM code is exception handling: it is difficult, if not impossible, to merge two x86 instructions into one ARM instruction (or to use any other mapping that is not 1:1) and still report exceptions correctly. If there's an exception in an ARM instruction that doesn't clearly match a single x86 instruction, qemu can't properly report the exception location.
I don't know if HQEMU attempts to do an n:m optimization or if it attempts to do anything about signal handling in this case.
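To make the problem concrete, here is a rough Python sketch of the lookup a translator has to do when a host instruction faults. It is only an illustration, not how qemu or HQEMU actually store this information: the table layout and the name `build_fault_map` are made up for the example.

```python
from bisect import bisect_right

def build_fault_map(entries):
    """Return a lookup from a host code offset to candidate guest PCs.

    entries: list of (host_offset, guest_pcs) sorted by host_offset, where
    guest_pcs is a tuple of the guest instructions that the host
    instruction starting at host_offset was generated from.
    (Hypothetical layout, for illustration only.)
    """
    offsets = [off for off, _ in entries]

    def lookup(host_fault_offset):
        # Find the last host instruction starting at or before the fault.
        i = bisect_right(offsets, host_fault_offset) - 1
        if i < 0:
            raise ValueError("fault offset precedes the translated block")
        return entries[i][1]

    return lookup

# 1:1 translation: every host instruction maps back to exactly one guest PC.
one_to_one = build_fault_map([(0, (0x1000,)), (4, (0x1002,)), (8, (0x1005,))])

# Merged translation: guest 0x1000 and 0x1002 were fused into one host
# instruction, so a fault there has two possible guest locations.
merged = build_fault_map([(0, (0x1000, 0x1002)), (4, (0x1005,))])

print([hex(pc) for pc in one_to_one(4)])
print([hex(pc) for pc in merged(0)])
```

With the 1:1 table the fault location is unambiguous. With the merged table the lookup returns two candidates, and the translator has no way to tell which guest instruction to blame.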
I somehow doubt that LLVM's optimizer is going to pay any attention to that. It's probably going to be the typical speed vs. accuracy trade-off. I get the impression, though, that the byte-exact location of an exception only really matters in combination with anti-debugger code. HQEMU is apparently at least good enough to run Windows XP in full system emulation mode and the speedup is very desirable.
This could be highly problematic. A strong driving force behind WINE these days seems to be VALVe's Proton fork and its use in gaming on Linux, which has been quite successful technically. The games on their Steam platform were produced by various publishers for Windows, often several years ago. Many of them contain a large number of DRM measures over which VALVe has no control. If the emulation of x86 isn't accurate enough, particularly against anti-debugger code, then it would block the emulation of these games on non-x86 platforms.
I could see VALVe wanting to pursue this in the future (they have supposedly been working on a Nintendo Switch competitor, but have been forced to use a less power-efficient x86 mobile chip from AMD instead of an ARM chip from NVIDIA), so some future way of mitigating this is probably worth considering.
Perhaps in the future regular checkpointing could be employed and more instruction-accurate emulation selected to roll forwards in the event of a (rare) exception?
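Something like the following toy Python sketch is what I have in mind. It is not based on anything in qemu or HQEMU: the two-instruction guest "ISA", `run_block_fast`, `step` and `GuestFault` are all invented for the example. The fast path stands in for optimized code that cannot say which guest instruction faulted; on a fault, the state is rolled back to the last checkpoint and the block is replayed one instruction at a time.

```python
import copy

class GuestFault(Exception):
    """Raised by the precise path with the exact faulting guest pc."""
    def __init__(self, pc):
        super().__init__(f"guest fault at pc={pc}")
        self.pc = pc

def step(state, program):
    """Precise path: execute exactly one guest instruction."""
    op, dst, src = program[state["pc"]]
    regs = state["regs"]
    if op == "add":
        regs[dst] += regs[src]
    elif op == "div":
        if regs[src] == 0:
            raise GuestFault(state["pc"])
        regs[dst] //= regs[src]
    state["pc"] += 1

def run_block_fast(state, program, end):
    """Fast path (stand-in for optimized code): no per-instruction pc."""
    regs = state["regs"]
    for op, dst, src in program[state["pc"]:end]:
        if op == "add":
            regs[dst] += regs[src]
        else:
            regs[dst] //= regs[src]  # may raise ZeroDivisionError
    state["pc"] = end

def run(state, program, block_size=4):
    while state["pc"] < len(program):
        checkpoint = copy.deepcopy(state)  # checkpoint before each block
        end = min(state["pc"] + block_size, len(program))
        try:
            run_block_fast(state, program, end)
        except ZeroDivisionError:
            # Roll back in place, then replay precisely; step() raises
            # GuestFault at the exact instruction.
            state.clear()
            state.update(checkpoint)
            while state["pc"] < end:
                step(state, program)
    return state
```

The cost is one checkpoint per block plus a slow replay on every fault, so this only pays off if exceptions really are rare. Side effects that cannot be undone (device I/O, for instance) are ignored here and would be the hard part in practice.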
Given that Hangover's performance is reportedly mostly limited by QEMU, I would like to ask whether you have heard of HQEMU.
To quote from HQEMU's webpage:
I have not had a chance to try it out myself but the description sounds promising.