Open jonashaag opened 1 year ago
Are you using the linear memory optimization? It should be enabled on most platforms by default, unless you're disabling it by passing the `-m` flag. If I'm running on Linux x86-64 then I have a near-certain chance of getting fast memory. Profiling will look like this:
But if I pass the `-m` flag to blink to disable the linear memory optimization, then profiling will look the way yours does:
Sorry, should have specified the exact command. Yes, I'm using `-m` because otherwise the program won't work.
Have you read these sections of the readme?
The reason `-m` is costly is that it does full memory virtualization: it has to indirect every memory access through a translation lookaside buffer and a four-level radix trie. It's about as optimized as it can be.
The best bet for you would probably be to find some way to get the linear memory optimization working for you. For example, we could find some other formula for mapping guest addresses onto host addresses.
You're also invited to join our Discord https://discord.gg/Hb4QHYj2
Not sure if this is expected, or if anyone is interested in optimizing this.
I have a real world workload that spends a lot of time in the memory subsystem. macOS Instruments profile:
Unfortunately I can't share the workload itself but I can do more profiling or try patching some stuff. I've already figured out that caching some of the machine-related checks (e.g. `if (m->foobar ...)`) speeds things up by 10%.