Open vonj opened 1 year ago
See https://lekkit.github.io/test/index.html and type doom X)
Although, that WASM demo doesn't have a JIT, and overall is disgustingly slow, so 'ere are the firmware/kernel/drive images to replicate that setup locally on a proper JITed native VM - please report if anything
rvvm fw_jump.bin -k linux_6.1 -i rootfs.ext2
rvvm_demo.zip
And if you want a GIF or something in the repository, then ugh... I considered it, but it would slow down page load times, and overall I have a lot of better things to show off (Minecraft in RVVM? Plasma in RVVM?). Maybe you have a better idea, though)
Also this 👀, although the mod has stagnated since February 2022 for... well, you might have an idea why
Any update?)
Tried on Win11: no amount of tweaking -smp and -jitcache affects Doom performance. Any reason why? What could be the bottleneck?
Option -jitcache sets an upper bound on JIT cache size to prevent excess peak memory usage. When the cache reaches that limit it has to drop some of the recompiled code, and if your working set doesn't fit in the JIT cache it will thrash and interpret-trace over the same code frequently.
There are efforts to determine optimal JIT cache size at runtime and scale it dynamically. For now if you don't have strict memory limits, just raise it. QEMU for comparison sets upper JIT cache bound to 1G and calls it a day X).
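To make the thrashing effect concrete, here is a toy model in C. It is only a sketch and not RVVM's actual cache: the names (toy_jit_cache, toy_simulate), the fixed block sizes and the "flush everything when full" policy are all assumptions made for illustration. It counts how often a loop over a fixed set of code blocks has to be recompiled for a given cache bound.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_BLOCKS 64

/* Hypothetical toy model of a size-bounded JIT cache; not RVVM's data structure. */
typedef struct {
    size_t capacity;                /* upper bound in bytes, like -jitcache */
    size_t used;                    /* bytes currently holding compiled code */
    bool   present[TOY_MAX_BLOCKS]; /* is block i currently compiled? */
    size_t compiles;                /* how many times a block was (re)compiled */
} toy_jit_cache;

void toy_cache_init(toy_jit_cache *c, size_t capacity)
{
    assert(capacity > 0);
    memset(c, 0, sizeof(*c));
    c->capacity = capacity;
}

/* "Execute" block `id`, whose compiled form takes `size` bytes. */
void toy_cache_run(toy_jit_cache *c, size_t id, size_t size)
{
    assert(id < TOY_MAX_BLOCKS && size <= c->capacity);
    if (c->present[id])
        return;                         /* hit: reuse the compiled code */
    if (c->used + size > c->capacity) { /* full: drop everything (assumed policy) */
        memset(c->present, 0, sizeof(c->present));
        c->used = 0;
    }
    c->present[id] = true;              /* miss: compile and store the block */
    c->used += size;
    c->compiles++;
}

/* Loop over `blocks` equally sized blocks `passes` times; return the compile count. */
size_t toy_simulate(size_t capacity, size_t blocks, size_t block_size, size_t passes)
{
    toy_jit_cache c;
    toy_cache_init(&c, capacity);
    for (size_t p = 0; p < passes; p++)
        for (size_t b = 0; b < blocks; b++)
            toy_cache_run(&c, b, block_size);
    return c.compiles;
}
```

With eight 100-byte blocks, toy_simulate(1024, 8, 100, 10) compiles each block once, while toy_simulate(512, 8, 100, 10) ends up recompiling on every access, because the cache is always flushed before the loop comes back around to a block. Once the working set fits, raising the bound further changes nothing in this model.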
It should be noted some performance tuning might be possible for Windows. I haven't done extensive profiling there.
See src/vma_ops.c for platform-specific code which is frequently used from JIT, if you're interested.
JIT also frequently calls vma_clean() over dirty but unused memory to return it to the OS until it writes there again. Previously there was a bottleneck: setting the JIT cache size too high caused a stall in vma_clean() on Linux, so I switched to a different API.
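For anyone wondering what "returning dirty memory to the OS" looks like in practice, here is a minimal Linux-only sketch. It is not the code from src/vma_ops.c, and the thread doesn't say which API RVVM actually uses; the demo_vma_* names are hypothetical, and madvise(MADV_DONTNEED) is just one way to do it. On Linux, for a private anonymous mapping, this drops the physical pages while keeping the mapping valid, and the next access gets fresh zero-filled pages.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS and madvise() on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helpers for illustration only; not RVVM's vma_ops API. */

/* Reserve `size` bytes of private anonymous memory; returns NULL on failure. */
void *demo_vma_alloc(size_t size)
{
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Tell the kernel the contents of [addr, addr + size) are no longer needed.
 * The mapping stays usable; on Linux the backing pages are freed and are
 * re-populated with zeroes on the next touch. Returns 0 on success.
 */
int demo_vma_clean(void *addr, size_t size)
{
    assert(addr != NULL);
    return madvise(addr, size, MADV_DONTNEED);
}

/* Unmap the region entirely. */
void demo_vma_free(void *addr, size_t size)
{
    munmap(addr, size);
}
```

The caller keeps the same pointer after demo_vma_clean() and can simply write to it again, which is what makes this pattern convenient for a JIT heap that grows and shrinks.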
Tried setting -jitcache 64M, -jitcache 16M, -jitcache 4M, -jitcache 1M with that Doom demo - no perf/CPU usage difference on a Linux host. CPU use is at ~10% of one core, smooth as butter. Below 1M it starts stuttering, understandably so.
Win32-specific code might need profiling, will look at it soon.
I'm not even half kidding. :-)