pizuz closed this issue 2 years ago
Unfortunately, the CI artifacts from back then have expired already. Do you keep them around somewhere?
Unfortunately, we don't separately archive the CI artifacts from each PR. 🙁
That's a shame. I will try to bisect it myself, but it will probably take me a while because of time constraints on my end. The regression was definitely introduced somewhere between those versions.
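For anyone bisecting by hand, the search itself can be sketched as below. This is a minimal illustration, not a MoltenVK tool: `first_bad`, the commit list, and the `is_bad` check are hypothetical names, and it assumes the first commit is known good, the last is known bad, and performance never recovers once it has regressed. In practice `is_bad` would mean building that commit and measuring the frame rate.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad too.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Hypothetical usage: eight placeholder commits, regression introduced at "c5".
placeholder_commits = [f"c{i}" for i in range(8)]
culprit = first_bad(placeholder_commits, lambda c: int(c[1:]) >= 5)
```

Each step halves the remaining range, so roughly log2(N) builds are needed instead of N. If performance goes up and down across the range, the "always bad afterwards" assumption does not hold and the result only identifies one of several changes.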
There's a good chance this is the same issue reported in #1628, which looks like it was introduced in 4371ef4d2b706d761acac49c1b7f9d413d0d15db, conveniently between 1.1.2 and 1.1.3
PR #1676, which fixes #1628, may improve this. Please retest with latest MoltenVK and close this issue if performance is improved.
I'm seeing some performance benefit, but it is still quite a bit slower than MVK 1.1.2. Guess I have to live with that. Thanks for the work.
still quite a bit slower than MVK 1.1.2
If you can get further info on where it is slowing down, we can try to address it.
I'll try to profile it over the weekend.
I made a few performance traces over at the Dolphin thread, if that helps narrow it down. Looks like the remaining performance hit is less than I initially thought.
https://github.com/dolphin-emu/dolphin/pull/9981#issuecomment-1214378319
The traces posted look very similar to an issue I was having with MVK on Nvidia with PCSX2
Looks like the issue was related to the semaphore emulation that's used by default on Nvidia now (which AIUI is due to an Nvidia driver bug and required for correctness). Setting MVK_ALLOW_METAL_FENCES brings the speed back to normal, though a number of previous MVK commits have changed the speed through the history since 1.1.2.
Test: Run NFSCarbon.gs.xz.zip with PCSX2 v1.7.3212. Set blending emulation to minimum in the graphics settings to ensure a CPU bottleneck. (You'll have to unzip the file, but PCSX2 can read .gs.xz directly. For Qt, drag the file onto the main window. For wx, place it in ~/Library/Application Support/PCSX2/snaps and open the GS debugger from the debug menu.)
1.1.2: 65fps
4371ef4d2b706d761acac49c1b7f9d413d0d15db: 75fps
2ef21c65bf940d82577453ab24d08ddaef49cfe9: 85fps
Somewhere between f8280bca8933ecc839a4e35ba904d35a70430962 and aa89f845a994f71491f2713e1a0137317d465fbc: 75fps
After the commit that switched on semaphore emulation on Nvidia: 60fps, but 75fps with MVK_ALLOW_METAL_FENCES enabled
We're still faster than 1.1.2 (and than any single release for that matter, as the improvement and the drop both fall between 1.1.2 and 1.1.3), so it's not a huge deal, but there was definitely a point when it was faster than it is now.
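To try the MVK_ALLOW_METAL_FENCES comparison from the measurements above, the variable has to be present in the application's environment at launch. A minimal sketch follows, assuming the value "1" is what enables it (the thread only says "enabled") and using hypothetical helper names; the application path in the comment is a placeholder.

```python
import os
import subprocess
from typing import Mapping, Optional, Sequence


def env_with_metal_fences(base: Optional[Mapping[str, str]] = None) -> dict:
    """Copy an environment and add MVK_ALLOW_METAL_FENCES to it."""
    env = dict(os.environ if base is None else base)
    # Assumption: "1" turns the option on; the thread does not state the value.
    env["MVK_ALLOW_METAL_FENCES"] = "1"
    return env


def launch(command: Sequence[str]) -> subprocess.Popen:
    """Start the given program with the variable set for that process only."""
    return subprocess.Popen(list(command), env=env_with_metal_fences())


# Hypothetical usage (placeholder path, not taken from the thread):
# launch(["/path/to/PCSX2.app/Contents/MacOS/PCSX2"])
```

Setting the variable per process keeps the rest of the system on the default behaviour, which makes a before/after frame rate comparison on the same build straightforward.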
Hi,
I noticed quite a substantial performance regression starting in either 1.1.3 or 1.1.4 when using MVK in the Dolphin emulator. v1.1.3 refuses to load, so there's a lot more to bisect. Unfortunately, the CI artifacts from back then have expired already. Do you keep them around somewhere? Building every single PR myself is possible, but quite a pain.
My setup: macOS 10.15.7 on a Late 2013 iMac (i7 with an Nvidia GeForce GT 750M)
Regards, Pizuz