tonioni / WinUAE

WinUAE Amiga emulator
http://www.winuae.net/

64-Bit Version performs worse than 32-Bit on x64 Windows 10 XPS13 Laptop #210

Open cwsoft opened 2 years ago

cwsoft commented 2 years ago

The latest 64-bit version (4910) performs much worse than the 32-bit version on my 64-bit Dell XPS13 laptop running Windows 10. Tested on A500 Kick 1.3 and A1200 Kick 3.1 with various setups such as AmigaOS 3.2.1, AmiKit 11.5, Scalos WB etc. Boot-up time and screen updates are much slower than with an identical setup on 32-bit WinUAE.

Tested without any controller, external mouse etc. No clue what's going on here.

tonioni commented 2 years ago

Did older versions have similar slower speed? There shouldn't be any 64-bit vs 32-bit differences except JIT.

cwsoft commented 2 years ago

All my previous versions were 32-bit only. So I can only speak for the latest 4910, where I switched to 32-bit because of its much faster boot-up time, e.g. in AmiKit, Pimiga2 (on WinUAE) or a plain AmigaOS 1.3 or 3.2 Workbench. I can provide log files or other information if needed to check this out.

tonioni commented 2 years ago

Could you confirm whether 4.9.0 and 4.4 have the same problem? Download the zip package and temporarily overwrite winuae.exe; there is no need to install anything.

cwsoft commented 2 years ago

Hello Toni. Sorry for the late reply, I missed the notifications. Yes, I can confirm that all tested 64-bit WinUAE versions (v4.4.0, 4.9.0, 4.9.1) are much slower when it comes to loading the Workbench files. Some measurements of booting AmiKitXE 11.7.0 (WB 45.3, KS: 40.68, Amiga 1200) give the following times until the WB is fully loaded:

So basically all setups on my Win64 / XPS13 Laptop took about 20s longer when booting up distros like AmiKit, Pimiga, or my custom Workbench setups.

tonioni commented 2 years ago

OK, so it isn't a new problem. Does the same 32-bit vs 64-bit difference also happen without JIT (if you used JIT originally)? Is the difference the same in windowed vs fullscreen? (Make sure vsync is not enabled.)

cwsoft commented 2 years ago

Hello Toni, have done some more tests. Previous tests were all with JIT enabled and booting into Full-Window. Timings below are for booting-up AmiKit 11.7.0 (WB 45.3, KS: 40.68, Amiga 1200) as in previous tests. Added

WinUAE 4.9.1 (both 32/64 bit variants):

For reference only (Amiberry v5.2 64-bit on a Pi400)

So disabling JIT shows about the same boot-up time for 32/64 bit regardless of the screen mode. With JIT enabled, the 64-bit version is about 20s slower than 32-bit regardless of the screen mode used.

Vsync settings (used for all posted tests with timings): gfx_vsync=false gfx_vsyncmode=normal gfx_vsync_picasso=false gfx_vsyncmode_picasso=norma
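To double-check that vsync really is off in a given configuration, the settings can be read back from the .uae file. The following is only a minimal sketch: it assumes the file is plain `key=value` text with one setting per line (as the values quoted above suggest), and the helper names `parse_uae_settings` and `vsync_disabled` are hypothetical, not part of WinUAE.

```python
def parse_uae_settings(text):
    """Parse key=value pairs, one per line.

    Blank lines and lines without '=' are skipped.
    Assumption: the .uae file is plain text in this simple format.
    """
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def vsync_disabled(settings):
    """Return True only if both vsync keys are present and set to 'false'.

    A missing key counts as "not confirmed" and yields False.
    """
    keys = ("gfx_vsync", "gfx_vsync_picasso")
    return all(settings.get(k, "").lower() == "false" for k in keys)


if __name__ == "__main__":
    sample = "gfx_vsync=false\ngfx_vsync_picasso=false\n"
    print(vsync_disabled(parse_uae_settings(sample)))
```

For a real file, the text would come from reading the .uae file from disk instead of the inline sample.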

P.S.: I found the same performance issues when booting e.g. Pimiga2 or my custom AmigaOS 3.2.1 Workbench, so they are not related to AmiKit but seem to be "generic" for all the Workbench setups I have tested so far. All my setups use an A1200 with JIT enabled by default and Full-Window screen mode.

Attached my regular AmiKitXE.uae file (JIT enabled, Full-Window) too. AmiKitXE.uae.txt

tonioni commented 2 years ago

Thanks. It probably means 64-bit has always been "slow" in this situation. I am not sure if I can fix it because I am not that familiar with JIT (also, the 64-bit variant was originally implemented by the Aranym developers).

What if you change the JIT cache size? (Try both smaller and larger, if it isn't already set to the largest.) 64-bit generated code is usually larger than 32-bit code, so it could overflow the JIT cache more easily, and/or it might no longer fit in the host CPU caches, which would also cause a noticeable slowdown if the 32-bit generated code still fits.
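The "overflows more easily" part can be illustrated with a toy model. This is only a sketch under stated assumptions: the flush-everything-when-full policy and the block sizes (100 vs 150 bytes) are hypothetical and are not taken from the WinUAE JIT source. The function `count_flushes` is likewise made up for this illustration.

```python
def count_flushes(block_sizes, cache_size):
    """Toy translation cache.

    Translated blocks are appended to the cache in order. When the next
    block does not fit, the whole cache is flushed (assumed policy) and
    filling starts again. Returns the number of flushes.
    """
    used = 0
    flushes = 0
    for size in block_sizes:
        if used + size > cache_size:
            flushes += 1
            used = 0
        used += size
    return flushes


if __name__ == "__main__":
    cache_size = 64 * 1024      # hypothetical cache size in bytes
    blocks = 1000               # same number of translated blocks
    small = [100] * blocks      # hypothetical "32-bit" block size
    large = [150] * blocks      # hypothetical "64-bit" block size
    print("smaller blocks:", count_flushes(small, cache_size), "flushes")
    print("larger blocks: ", count_flushes(large, cache_size), "flushes")
```

With the same cache size and the same number of blocks, the larger blocks fill the cache sooner and force more flushes, each of which means retranslating code. The model says nothing about host CPU cache effects, which would have to be measured on real hardware.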

cwsoft commented 2 years ago

The JIT cache size was already set to max. for all my tests. Funnily enough, Amiberry v5.2 64-bit (which uses quite a lot of WinUAE code) on my Pi400 with JIT enabled boots about as fast as the WinUAE 32-bit version on my Win10 Dell XPS laptop.

Do you know if Midwan (author of Amiberry) implemented JIT differently?

midwan commented 2 years ago

Amiberry uses a different implementation of JIT, since it's ARM-based. The JIT code comes from TomB originally (https://github.com/PandTomB/uae4arm), though I might make some minor changes here and there based on the updates coming from WinUAE.

cwsoft commented 2 years ago

Midwan, thanks for the clarification. So I guess there won't be a quick solution for WinUAE with JIT on 64-bit Windows machines. For most emulation scenarios it probably isn't a big deal whether you run the emulator in 32-bit or 64-bit, so I will stick with the WinUAE 32-bit version on Windows.

tonioni commented 2 years ago

Did you try a smaller JIT cache? It can be faster in some situations (as I said in my previous post). I assume CPU benchmarks return similar results? (At least I get faster results in AIBB or sysinfo.) That would confirm that JIT works as expected, at least in situations where there is nothing else to do (like HD accesses).

cwsoft commented 2 years ago

Hi Toni, I tried min/max previously and the timing changed only marginally. I have now tried a cache size of 4 MB instead of 16 MB (max), which made a huge difference. Startup time for 32-bit is now 12 s (before 14.7 s), and the 64-bit time dropped to 13.6 s (before 37.1 s with the max. cache of 16 MB). So the JIT cache size does indeed influence the performance difference I saw between the 32-bit and 64-bit versions of WinUAE. Thanks for pointing me in the right direction.
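For reference, the arithmetic behind these numbers can be reproduced with a short script. It is only a sketch that uses the four boot times quoted above; the names `timings` and `speedup` are chosen just for this illustration.

```python
# Boot times in seconds, as quoted above (JIT enabled).
timings = {
    "32-bit": {"16 MB": 14.7, "4 MB": 12.0},
    "64-bit": {"16 MB": 37.1, "4 MB": 13.6},
}


def speedup(before, after):
    """Ratio of the old boot time to the new one (values > 1 mean faster)."""
    return before / after


if __name__ == "__main__":
    for build, t in timings.items():
        saved = t["16 MB"] - t["4 MB"]
        ratio = speedup(t["16 MB"], t["4 MB"])
        print(f"{build}: {saved:.1f} s saved, {ratio:.2f}x faster")

    # Gap between the 64-bit and 32-bit builds at each cache size.
    for size in ("16 MB", "4 MB"):
        gap = timings["64-bit"][size] - timings["32-bit"][size]
        print(f"{size} cache: 64-bit is {gap:.1f} s slower than 32-bit")
```

The gap between the two builds shrinks from 22.4 s with the 16 MB cache to 1.6 s with the 4 MB cache.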

tonioni commented 2 years ago

This is quite interesting. What is your laptop CPU model? (full model information). I suspect it is CPU cache related.

cwsoft commented 2 years ago

Hi Toni. My laptop CPU model is Intel(R) Core(TM) i5-6200U CPU @ 2.30GHz 2.40 GHz, 2 cores, 4 logical CPUs. Further details can be found in the attached CPU-Z dump file. Just let me know if you need additional information in case you want to track down the issue any further.

dellxps13.txt