iXit / wine-nine-standalone

Build Gallium Nine support on top of an existing WINE installation
GNU Lesser General Public License v2.1

Higher memory usage (reported by pmap) compared to wined3d in Mass Effect 2 #60

Closed Lahvuun closed 3 years ago

Lahvuun commented 4 years ago

wine-staging-4.18, mesa-19.2.1, llvm-9.0.0, gallium-nine-standalone-v0.5.0.356-release

32G RAM, Ryzen 1700, Vega 56

The game itself has high resolution texture mods installed, the binary is patched with LAA.

Upon exceeding 4G memory usage the game does one of the following:

I have a save game which I load to test this. Upon loading the memory usage is around 3.9G-4G with Nine. Sometimes it hangs right away, sometimes I need to run around for it to go over the limit. With wined3d the memory usage upon load and in that area is around 2.7G.

Interestingly, this issue is also somewhat reproducible with an apitrace: MassEffect2.trace

For me, when running d3dretrace.exe with Nine, it uses around 3.7G memory in game, while with wined3d it's around 2.1G.

I believe this may be related to #24, although there are multiple explicit mentions of video memory, which is not what this issue is about (I think), so I decided to create a new ticket. But it's possible OP of #24 is also running into the 4G limit, except the message is misleading. ME2 and the Borderlands games both use UE3, we both have installed higher resolution textures, and our issues are apparently absent when using wined3d.

axeldavy commented 4 years ago

I haven't had issues with Mass Effect 2 (only the 1), but it's interesting if an apitrace can show such a huge difference. Nine will always use a bit more than wined3d, because the code of wined3d is still mapped in memory in addition to the code of nine, but it doesn't take that much space. We'll take a look.

Lahvuun commented 4 years ago

Indeed, I also had virtual function errors with ME1 (which apparently happen because the game runs out of memory), but these were fixed by patching the binary with LAA. I also used Proton's wine-4.11, which included patches for forcing LAA.

I think the reason you haven't had issues with ME2 is because you did not install high resolution texture mods. Installing the ALOT mod is a PITA under Wine (at least it was until wine-4.18), so I suspect you never bothered with that. I tried using the same save game without any texture mods, and the memory usage with Nine is around 3G with LAA. With wined3d it's 2.3G, both well under the 4G limit which for whatever reason causes these issues.

axeldavy commented 4 years ago

When playing with d3dretrace, I get about 1.6GB of RAM used with nine, and 1.3GB with wined3d.

I looked at GTT and VRAM usage, and they don't explode either. requested-VRAM is 500MB with nine and 400MB with wined3d.

axeldavy commented 4 years ago

I ran with GALLIUM_HUD=fps,requested-VRAM+requested-GTT+mapped-VRAM+mapped-GTT,VRAM-usage+GTT-usage wine d3dretrace.exe

axeldavy commented 4 years ago

I use a 32-bit wineprefix, if that is related.

Lahvuun commented 4 years ago

Sorry it took me so long to answer, but I had to triple-check everything.

My bad, I should've been more explicit.

I'm not talking about RAM (that's the RES column if you use top, I believe?) or VRAM or GTT, I'm talking about virtual memory (the VIRT column). I'm sorry if I got the terminology wrong, but I'm fairly certain the crashes/hangs happen when VIRT nears 4G. This is why d3dretrace.exe doesn't crash for me either:

> For me, when running d3dretrace.exe with Nine, it uses around 3.7G memory in game, while with wined3d it's around 2.1G.

By "memory" I meant the VIRT column, I believe pmap's total memory usage is the same value as VIRT, that's why I put pmap in the title.

When I run d3dretrace.exe with Nine, it doesn't near the 4G VIRT limit and so doesn't crash, but the game nears this limit in some areas and thus crashes/hangs/artifacts.
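As an illustration of the VIRT figure discussed above, here is a minimal Python sketch that sums the address ranges listed in a process's `/proc/<pid>/maps` on Linux. That sum is roughly what pmap reports as its total and what top shows as VIRT, though small differences are possible. The function names are hypothetical and not part of any tool mentioned in this thread.

```python
def total_mapped_bytes(maps_text):
    """Sum the sizes of all address ranges in /proc/<pid>/maps-style text."""
    total = 0
    for line in maps_text.splitlines():
        if not line.strip():
            continue
        # The first field looks like "00400000-00452000" (hexadecimal start-end).
        start, end = (int(part, 16) for part in line.split()[0].split("-"))
        total += end - start
    return total


def virt_bytes_of(pid):
    """Approximate the VIRT of a running process (Linux only)."""
    with open(f"/proc/{pid}/maps") as maps_file:
        return total_mapped_bytes(maps_file.read())
```

Calling `virt_bytes_of(pid)` with the PID of the game or of d3dretrace.exe and dividing by `1024 ** 3` gives a value in GiB that can be compared against the 4G limit of a 32-bit process.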

I actually tested this idea and bloated d3d9-nine.dll to ~500M:

diff --git a/d3d9-nine/d3d9_main.c b/d3d9-nine/d3d9_main.c
index 4be0e0c..9de7003 100644
--- a/d3d9-nine/d3d9_main.c
+++ b/d3d9-nine/d3d9_main.c
@@ -19,6 +19,8 @@
 static int D3DPERF_event_level = 0;
 static Display *gdi_display;

+char dummy[500000000] = {'a'};
+
 void WINAPI DebugSetMute(void)
 {
     /* nothing to do */

The d3dretrace.exe process crashed as soon as it got to 4G VIRT usage.
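The effect that the experiment above relies on is that address space counts toward VIRT whether or not the pages are ever touched. The Python sketch below is only a Linux illustration of that effect and not a reproduction of the d3d9-nine.dll patch: it maps untouched anonymous memory and reports how much VmSize (VIRT) and VmRSS (RES) grow. The helper names are made up for this example.

```python
import mmap


def status_kb(field):
    """Return a field such as VmSize or VmRSS from /proc/self/status, in kB."""
    with open("/proc/self/status") as status_file:
        for line in status_file:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)


def measure_mapping(size_bytes):
    """Map size_bytes of untouched memory; return (VIRT growth, RES growth) in kB."""
    virt_before, rss_before = status_kb("VmSize"), status_kb("VmRSS")
    region = mmap.mmap(-1, size_bytes, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        virt_after, rss_after = status_kb("VmSize"), status_kb("VmRSS")
    finally:
        region.close()
    return virt_after - virt_before, rss_after - rss_before
```

On a typical Linux system, `measure_mapping(500 * 1024 * 1024)` should show VIRT growing by about the mapped size while RES barely moves, which is why a 32-bit process can hit the 4G address space limit long before it uses that much RAM.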

So, this is basically the problem: with Nine the game sometimes goes over the 4G VIRT limit and crashes/hangs/artifacts, with wined3d VIRT usage is much lower, usually under 3G, so it works normally.

I'm very sorry for the confusion.

axeldavy commented 4 years ago

For virtual memory we have to be careful about LAA.

Indeed without LAA, wine blocks allocations above 2GB by using that virtual space. So there should always be more than 2GB virtual space allocated. That shouldn't happen with LAA.

I assume you ran d3dretrace with LAA to get that 2.1GB usage with wined3d.

I'll investigate whether this 1.6GB difference could be due to a different behaviour of the wine allocator when nine is used, or if it is a problem in nine's management of textures.

However if it were in Nine, then I would expect the VIRT usage difference to be a maximum of 500MB (requested VRAM+GTT) as I don't see what else it could be than mapped textures... There are not that many things that allocate virtual address space.

Lahvuun commented 4 years ago

d3dretrace.exe is compiled with LAA: https://github.com/apitrace/apitrace/blob/master/CMakeLists.txt#L381

I tried using wined3d and manually disabling the flag, to my surprise the retrace didn't crash in the menu, even though VIRT was over 3G. However it immediately crashed after the loading screen.

Seems like disabling the LAA flag causes VIRT usage to jump much higher, and it still crashes after 4G.

edit: yes, 2.1GB is with LAA, 3G+ without.
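For readers who want to check whether a binary carries the LAA flag discussed above, here is a minimal Python sketch. It relies on the documented PE layout: the offset of the PE signature is stored at 0x3C, and the COFF Characteristics field sits 22 bytes after that signature, with `IMAGE_FILE_LARGE_ADDRESS_AWARE` being bit 0x0020. The function name is hypothetical, and the sketch only reads the flag; it does not patch it.

```python
import struct

IMAGE_FILE_LARGE_ADDRESS_AWARE = 0x0020


def is_large_address_aware(data):
    """Return True if the PE image in `data` (bytes) has the LAA flag set."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew: 32-bit little-endian offset of the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # Characteristics is the last 2-byte field of the 20-byte COFF header.
    (characteristics,) = struct.unpack_from("<H", data, pe_offset + 22)
    return bool(characteristics & IMAGE_FILE_LARGE_ADDRESS_AWARE)
```

Reading the first few kilobytes of the executable, for example with `open(path, "rb").read(4096)`, is normally enough input for this check.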

Lahvuun commented 4 years ago

Deleting (or renaming) the wine-preloader binary seems to lower VIRT usage. Goes from 3.7G with d3dretrace.exe to 3G. In-game goes from 4G+ to 3.5G. wined3d VIRT usage doesn't seem to change much from getting rid of wine-preloader.

mirh commented 4 years ago

So.. wined3d allegedly should have improved with https://github.com/wine-mirror/wine/commit/06909230c42e5e9533a2d75ef54cb91b0efc1ff4 (maybe also https://github.com/wine-mirror/wine/commit/a73a892f46c027e1fedab513795eac3b12ba568a), once you also fix pulseaudio I guess. Does Nine support per-slice compressed textures?

axeldavy commented 4 years ago

Sorry, but I fail to see how these should affect the bug reported here.

The handling of compressed textures is hidden in gallium, and I think it is per-slice.

axeldavy commented 4 years ago

Virtual memory usage should be reduced in mesa git. There was a leak with the nir path (which would explain why I had no issue with Mass Effect 2 years ago, TGSI was used by all drivers back then).

Lahvuun commented 4 years ago

I tested with https://gitlab.freedesktop.org/mesa/mesa/-/commit/0f2e44d55b01b3637fb96ce18840b8ab9250d508

I mentioned in the original message that I have a save game I use to test this.

Looks like memory usage upon load has gone down from 3.9G to 3.7G. I can progress through the area without issues with Gallium Nine. However, the memory usage still seems to increase very slowly. I haven't had issues or crashes so far, but I'm unable to play for an extended period of time at the moment.

There's still a pretty big gap between Gallium Nine and wined3d (with wined3d in the same situation using 2.4G). I understand this might simply be a limitation of Gallium Nine, so if you don't think there's anything else you can do about this, please feel free to close the issue.

> which would explain why I had no issue with Mass Effect 2 years ago

I feel like I should again mention that I'm using a mod for higher resolution textures (https://www.nexusmods.com/masseffect2/mods/68). Without it, Gallium Nine doesn't really get near 4G usage in a reasonable amount of time. You playing without texture mods could also have contributed to having no issues in the past.

axeldavy commented 4 years ago

There are other areas where I have spotted possible gains.

I'm a bit surprised though by the gap with wined3d. Are you using another mod that compiles shaders? Wined3d doesn't compile shaders before their first use (because GL) while we do, which can increase memory usage quite a lot (but is probably the correct thing to do). An option to opt out of that feature is on my todo list.

mirh commented 4 years ago

> (but is probably the correct thing to do)

Having some new way to speed up performance is nice and all, but is it actually the same behavior you get in Windows? I think that should make a right, not the might.

axeldavy commented 4 years ago

Well, for Mass Effect 1 & 2 the shaders take almost no space. There aren't many of them.

However it seems that - the code is hard to read and undocumented so I may be wrong - Wine does unload from RAM MANAGED textures when they are uploaded in the VRAM.

That should indeed release some space (though it seems against the spirit of MANAGED textures) and will induce some slowdowns if the texture is read (which is rare though), as it has to copy back the texture.

Lahvuun commented 4 years ago

> I'm a bit surprised though by the gap with wined3d. Are you using another mod that compiles shaders?

Not as far as I'm aware. My current installation is only using ALOT and ME2Recalibrated which only provides a couple of textures AFAIK. I also think I opted out of ENBs/SweetFX when installing ALOT, but it has been a while and I might be wrong.

Hopefully I'm able to reinstall soon, to have both the unmodded and modded versions so I can compare them better in regards to Gallium Nine vs wined3d.

dhewg commented 4 years ago

@Lahvuun did you manage to reinstall & recompare?

Lahvuun commented 3 years ago

Here are my results with https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9377, using the same save I used for the original post:

mesa-21.0.0_rc5:

              ninewinecfg -d   ninewinecfg -e
without ALOT  2.25G            3.1G
with ALOT     2.7G             4G, hangs on load

mesa-f7dc0520d9045b424748855be103ee5affc53235:

              ninewinecfg -d   ninewinecfg -e
without ALOT  2.25G            3G
with ALOT     2.7G             3.35G

There is one odd thing I noticed: when using wined3d with ALOT the VIRT usage is ~2.7G upon loading, but if I move further into the location it drops to ~2.3G and stays there, even if I return to the original spot and orientate the camera the same way. With nine enabled the usage is ~3.3G upon load and stays there even if I repeat the actions that reduce VIRT usage with wined3d.

edit: doing the same without ALOT the VIRT usage with wined3d does not change much from the initial ~2.3G

axeldavy commented 3 years ago

wine deletes the backing of MANAGED textures. Nine doesn't delete them, but instead stores them in memfd and unmaps the memfd files if the memfd usage is above a threshold (set to 512MB). The advantage is that the data is fast to access if need be. And Mass Effect 2 is one example of a game reading the data (thus it should feel smoother with Nine than wined3d when triggering high res texture uploads with ALOT).

You can unmap more aggressively (accessing the data will be 10 times slower, but it's 10 times slower than something extremely fast, so still much faster than reading the content in the GPU memory like wined3d). The env var/drirc option "texture_memory_limit" controls that. By default texture_memory_limit=512, that is, when you have more than 512MB of textures in memfd you start unmapping. With texture_memory_limit=0 you should unmap right away when the data is not needed. I don't recommend that value, as newly allocated data is often needed right away. So maybe texture_memory_limit=64 would give what you want.
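A minimal Python sketch of setting this option through the environment for a single launch is shown below. Only the variable name `texture_memory_limit` and the values come from this thread; the helper name and the executable name in the commented launch line are assumptions for illustration.

```python
import os


def nine_env(texture_memory_limit_mb, base_env=None):
    """Return a copy of the environment with texture_memory_limit set (in MB)."""
    env = dict(os.environ if base_env is None else base_env)
    env["texture_memory_limit"] = str(texture_memory_limit_mb)
    return env


# Hypothetical launch, left commented out so the sketch has no side effects:
# import subprocess
# subprocess.run(["wine", "MassEffect2.exe"], env=nine_env(64))
```

Setting the variable per launch rather than globally keeps the default threshold of 512 for other games, where the more aggressive unmapping may not be wanted.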

Lahvuun commented 3 years ago

Indeed, with texture_memory_limit=64 my usage is down to 2.9G. Amazing!

Thank you for taking the time to fix this!

axeldavy commented 3 years ago

Thanks, I hope you will enjoy ME2 with ALOT and nine.

Also a few other env vars/drirc options you can try:

tearfree_discard=true: if your window manager doesn't do compositing, this will remove tearing
thread_submit=true: can reduce input lag