doitsujin commented 5 years ago

For some reason it looks like DXVK's device memory allocation strategy does not work reliably on Nvidia GPUs. This leads to game crashes with the characteristic DxvkMemoryAllocator: Memory allocation failed error in the log files.

This issue has been reported in the following games:

#1099 (Bloodstained: Ritual of the Night)

#1087 (World of Warcraft)

If you run into this problem, please do not open a new issue. Instead, post a comment here, including the full DXVK logs, your hardware and driver information, and information about the game you're having problems with.

Update: Please check https://github.com/doitsujin/dxvk/issues/1100#issuecomment-509484527 for further information on how to get useful debugging info.

Update 2: Please also see https://github.com/doitsujin/dxvk/issues/1100#issuecomment-515083534.

Update 3: Please update to driver version 440.59.

Sandok4n commented 5 years ago

The same error occurs with World of Tanks. Hardware: 6700K, 16GB RAM, GTX 780 3GB. With or without __GL_SHADER_DISK_CACHE_PATH=~/.nv set, Nvidia doesn't allocate a cache. Maybe the issue is here? GPU drivers: 418.52.10 or 430.26

doitsujin commented 5 years ago

@Sandok4n

post a comment here, including the full DXVK logs

The shader cache should have nothing to do with memory allocation issues.

Sandok4n commented 5 years ago

Ok. I'll try to reproduce this error. When it first appeared I downgraded the kernel, dxvk and wine, but the problem stayed the same. The only thing I did not change was the NV driver (only the new version was available in the AUR). The problem has been around for about two weeks.

SveSop commented 5 years ago

The shader cache should have nothing to do with memory allocation issues.

Maybe not, but as i have pointed out in another thread ("dirty shader cache"), it seems to me i have fewer crashes with a fresh .nv cache (delete the GLCache folder AND the WoW/Retail/Cache folder). If i keep clearing it regularly there are fewer crashes, but more stuttering at the start. Crashing while zoning COULD perhaps mean something weird happens when DXVK shader compilation is done?

I assume that the shader compilation business with WoW goes something in the lines of: WoW (Cache .WDB) -> DXVK -> .nv (driver cache)? Could the WoW cache folder contain some weird shaders that DXVK uses too much memory to compile/read somehow?

pchome commented 5 years ago

https://github.com/Joshua-Ashton/d9vk/issues/170 - possibly connected.

From my observations, crashes are more often if "free" host memory is low. But IMHO app should use "available" memory. https://gist.github.com/pchome/fb43b3752b878501757bdad571473a4e - mem data during such crash (from D9VK issue 170).

#103 - I was happy with this fix, some "heavy" games was able to use my whole VRAM, then RAM, swap, ... and still be alive :smile: . Or REISUB sysrq sometimes. Because of current issue I definitely want more "magically created RAM".

Test cache behaviour:

drop whole caches (not recommended): sync && echo 3 | sudo tee /proc/sys/vm/drop_caches (result: more free RAM, longer game sessions).
fill caches: search/copy/... a large number of files (result: less free RAM, shorter game sessions).

p.s. 418.52.10

h1z1 commented 5 years ago

If you can grab /proc/slabinfo or slabtop output that would be helpful. As is the output from grep . /proc/sys/vm/* preferably before and after though that understandably might be hard.

You could for example bump /proc/sys/vm/swappiness as a test, it would tell the kernel to be more active in freeing memory. Your gist doesn't show any swap at all which is odd.

doitsujin commented 5 years ago

From my observations, crashes are more often if "free" host memory is low. But IMHO app should use "available" memory.

The average application doesn't even know or care about how much RAM you have at all.

Someone on the VKx discord found that if VRAM is full, vkAllocateMemory fails even on a memory type that is not device local. This would also explain why #1099 crashes even though memory utilization is very low. This does include VRAM allocated by other applications (window manager, browser, ...), which DXVK has no control over.

pchome commented 5 years ago

@doitsujin

From my observations, crashes are more often if "free" host memory is low. But IMHO app should use "available" memory.

The average application doesn't even know or care about how much RAM you have at all.

Yes, it was not a technical description.

@h1z1

Your gist doesn't show any swap at all which is odd.

swap:512MiB, swappiness:10, swap in my system used only as "fallback", it rarely filled and used as indicator "be ready". Also it's zram.

Well, the Superposition test is still the thing: I am able to reproduce the issue running the "1080p" profile. It quits immediately when VRAM gets filled. The "720p" profile is fine with ~1200/1300MB used/allocated.

I installed 418.49.04, the lowest (IIRC) driver version for my current kernel (5.0.21), and was able to fill the whole VRAM (1900+) and have ~2700/2800MB used/allocated during the benchmark. Well, it's a freshly booted system, so I am going to stay on the 418.49.04 driver for a while and perform more tests later, to be sure.

telans commented 5 years ago

This is also an issue with Borderlands GOTY Enhanced. It seems to occur when loading new map areas or the title sequence. It seems that this does not happen right after loading successfully into a map, only once I have been playing for around 15-20 minutes. For example, after loading in, traveling between separate map areas (loading sequence) does not produce a crash no matter how many times you travel. But trying to load a new area after ~10 minutes crashes the game.

Regarding https://github.com/doitsujin/dxvk/issues/1100#issuecomment-503645510, Clearing an already built cache makes the game crash on launch with the same errors nearly every single time until the 3rd or 4th launch. Very strange.

At first I thought this was an issue with Reshade, however it appears that this happens less often with Reshade active. Perhaps this is just placebo.

d3d11.log (note: I removed a few thousand lines of compiling shader outputs, above paste limit)

dxgi.log

lutris/wine/dxvk_debug.log

Specs: i7-4770 GTX 980 Ti Kernel: 5.1.11-arch Driver: 430.26.0 DXVK: 1.2.2 Wine: ge-protonified-4.10 (tested Proton 4.2-7 & Wine 4.9 Staging)

Cheers (side question: Is this a recent development? I've never noticed this with any other games before, although previous DXVK versions have the same error)

doitsujin commented 5 years ago

@telans @Rugaliz Can you test setting the environment variable __GL_AllowPTEFallbackToSysmem=1?

Note that performance will most likely be poor, but this should hopefully work around the crashes.

telans commented 5 years ago

Still crashing, and performance appears to remain the same.

Screenshot_20190620_194947

lutris.log

doitsujin commented 5 years ago

Is Borderlands a 32-bit game? In that case your issue is most likely something else, on Proton you can try PROTON_FORCE_LARGE_ADDRESS_AWARE=1. Some wine builds in Lutris may also support this (it would be WINE_LARGE_ADDRESS_AWARE=1 there).

telans commented 5 years ago

The Enhanced version (remastered/released a couple months ago) I'm playing is 64bit, the remastered versions are also updated to DX11 from DX9.

update: ge-wine does support WINE_LARGE_ADDRESS_AWARE, but this didn't change anything.

edwin-v commented 5 years ago

From my observations, crashes are more often if "free" host memory is low. But IMHO app should use "available" memory.

The average application doesn't even know or care about how much RAM you have at all.

I had the error in D9VK on my system with 32 GB on a 2080 Ti. Both RAM and VRAM were barely 25% used when I got this error. It has nothing to do with availability.

Also interesting is that I can hit the error with BL2 in a couple of minutes, but I've been playing Bloodstained Ritual of the Night for much longer without a problem. Could it be something new that is not included in Proton yet? The errors are also relatively new to D9VK (as in, builds older than Monday 10 June were fine).

SveSop commented 5 years ago

From my observations, crashes are more often if "free" host memory is low. But IMHO app should use "available" memory.

The average application doesn't even know or care about how much RAM you have at all.

I had the error in D9VK on my system with 32 GB on a 2080 Ti. Both RAM and VRAM were barely 25% used when I got this error. It has nothing to do with availability.

The couple of times i have actually had any monitoring up while this crash happened with World of Warcraft and DXVK, the dxvk HUD had a bump in allocated up around 3.6GB-4GB, and nVidia SMI was barely 2GB'ish. This is with RTX2070 8GB card. So yeah, it does not really seem to be ACTUAL resource starvation, but some imaginary problem possibly from the driver perhaps.

Rugaliz commented 5 years ago

I placed __GL_AllowPTEFallbackToSysmem=1 in Lutris as @telans did. Bloodstained still crashes after moving a few screens. Performance is pretty much the same though.

BloodstainedRotN-Win64-Shipping_d3d11.log

BloodstainedRotN-Win64-Shipping_dxgi.log

doitsujin commented 5 years ago

Could it be something new that is not included in Proton yet? The errors are also relatively new to D9VK (as in, builds older than Monday 10 June were fine).

There have been no memory allocation changes at all for several months. Only 138dde6c3d4458a1d262093b93773b6a90090c40 (from today) changes things a bit, but most likely won't affect this issue at all.

I also somehow doubt that this can be fixed within DXVK since it's the vkAllocateMemory calls that are failing for no apparent reason, no matter which memory type we're trying to allocate from.

telans commented 5 years ago

There have been no memory allocation changes at all for several months. Only 138dde6 (from today) changes things a bit, but most likely won't affect this issue at all.

Yeah, just tried it and still crashed unfortunately.

New lines in log:

err: DxvkMemoryAllocator: Memory allocation failed
  Size: 53660160
  Alignment: 256
  Mem flags: 0x7
  Mem types: 0x681
err: Heap 0: 1319 MB allocated, 1181 MB used, 6144 MB available
err: Heap 1: 857 MB allocated, 766 MB used, 5935 MB available
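The two hex fields in the failed-allocation report above can be decoded by hand. A minimal Python sketch, assuming that `Mem flags` is a Vulkan `VkMemoryPropertyFlags` bitmask and that `Mem types` is a bitmask of allowed memory type indices (the helper names are hypothetical and not part of DXVK):

```python
# Bit values of VkMemoryPropertyFlagBits from the Vulkan specification.
MEMORY_PROPERTY_BITS = {
    0x1: "DEVICE_LOCAL",
    0x2: "HOST_VISIBLE",
    0x4: "HOST_COHERENT",
    0x8: "HOST_CACHED",
    0x10: "LAZILY_ALLOCATED",
}


def decode_mem_flags(flags):
    """Return the names of the property bits set in `flags`."""
    return [name for bit, name in MEMORY_PROPERTY_BITS.items() if flags & bit]


def decode_mem_types(mask):
    """Return the memory type indices whose bit is set in `mask`."""
    return [index for index in range(32) if mask & (1 << index)]


# Values taken from the log lines above.
print(decode_mem_flags(0x7))
print(decode_mem_types(0x681))
print(round(53660160 / (1024 * 1024), 2), "MiB requested")
```

Under these assumptions, 0x7 asks for memory that is device-local, host-visible and host-coherent at the same time, 0x681 allows memory types 0, 7, 9 and 10, and the request is about 51 MiB while both heaps report several GB available.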

edwin-v commented 5 years ago

I also somehow doubt that this can be fixed within DXVK since it's the vkAllocateMemory calls that are failing for no apparent reason, no matter which memory type we're trying to allocate from.

As mentioned, I haven't actually run into this one myself with DXVK in Proton 4.2-7. But assuming that D9VK still shares the same memory allocation code, something changed in the last 10 days that made it highly sensitive. Maybe there is a hint there.

telans commented 5 years ago

There are a few people with Proton having similar crashing issues: https://www.protondb.com/app/729040

SveSop commented 5 years ago

Well, found a little snippet to allocate VRAM via CUDA. https://devtalk.nvidia.com/default/topic/726765/need-a-little-tool-to-adjust-the-vram-size/

#include <stdio.h>
#include <cuda_runtime.h>

int main(int argc, char *argv[])
{
     unsigned long long mem_size = 0;
     void *gpu_mem = NULL;
     cudaError_t err;

     // get amount of memory to allocate in MB, default to 256
     if(argc < 2 || sscanf(argv[1], " %llu", &mem_size) != 1) {
        mem_size = 256;
     }
     mem_size *= 1024*1024; // convert MB to bytes

     // allocate GPU memory
     err = cudaMalloc(&gpu_mem, mem_size);
     if(err != cudaSuccess) {
        printf("Error, could not allocate %llu bytes.\n", mem_size);
        return 1;
     }

     // wait for a key press
     printf("Press return to exit...\n");
     getchar();

     // free GPU memory and exit
     cudaFree(gpu_mem);
     return 0;
}

Needs cuda-dev-kit from nVidia (or distro). Compile with: nvcc gpufill.cu -o gpufill

That way you can allocate and "spend" vram without actually spending it.. What happened when i spent 6GB vram was that WoW started as normal and did not crash, even though after running around a bit and zoning++ vram was topped out at 7.9GB+ on my 8GB card. I did not notice any huge issues, but did not test for more than maybe 10-15 minutes.

However, when i ran ./gpufill 7000 to take 7GB of vram BEFORE starting WoW, something was clearly pushed to system ram instead, because the performance was horrible. But it still did not crash. Screenshot: WoWmem. Closing "gpufill" by pressing enter did release 7GB of vram according to nVidia-smi, but there was no change in WoW performance. This at least indicates that data placed in system ram does not "transfer" back to actual vram even if vram is freed later. That may well be intended, but even this experiment did not immediately crash WoW, so the crashes might not REALLY be caused by actual memory starvation.

The "shared memory" thing between vram<->sysram probably does not work the same way that swap does i guess? Ie. in a memory starving situation things gets put to swap on disk, but once memory gets freed, it does not continue to be used from swap. I have no clue what is supposed to happen in a situation like that tho?

Will do some more testing with this, and with the latest https://github.com/doitsujin/dxvk/commit/138dde6c3d4458a1d262093b93773b6a90090c40

SveSop commented 5 years ago

https://github.com/doitsujin/dxvk/commit/138dde6c3d4458a1d262093b93773b6a90090c40 seems an improvement so far.

Doing the same test as above with 7GB memory allocated with "gpufill", WoW loaded and had a lot higher fps, although some stuttering and framespikes.. closing "gpufill" to release 7GB vram brought the frametimes down, and fps up. Fairly playable, but i noticed GPU load was still 90%+ vs if normally where i was standing it usually is 45-50% with 30+ more fps.

~~So for the little testing i did, https://github.com/doitsujin/dxvk/commit/138dde6c3d4458a1d262093b93773b6a90090c40 did help on performance when in a out of vram situation.~~ EDIT: Clearing the .nv/GLCache folder and WoW/retail/Cache folder brought back the same "issues" as https://github.com/doitsujin/dxvk/issues/1100#issuecomment-504068676 it seems..

One other thing i noticed was nVidia-smi seemed to indicate less vram usage from WoW. Is this due to "reusing chunks" so that "actual" vram is not so much?

SveSop commented 5 years ago

Since i am an incredibly slow learner, and a n00b.. Let me just ask this to TRY to get my head around this "allocated" thing. The Cuda app i posted above "allocates" vram from "actual" vram. If i have 7800MB free vram, i can allocate 7800MB, but if i try to allocate 7900MB i get "Error, could not.." So, when i open eg. firefox, it uses (according to nVidia SMI) 79MB. When i play WoW at my current resolution/settings, the app uses 1880'ish MB. This does not vary much, but may vary with spell effects, and possibly when changing "worlds" (ref. expansions and different texture details and whatnot). Simple math according again to nVidia SMI, 1880 (wow) + 79 (firefox) = 1959mb. This means i can allocate 6GB (well.. i could allocate 5960MB with the cuda app).

Reading from DXVK HUD, the "allocation" is 4500+ MB. What is this "allocation", and is this "unlimited"? Is the allocation limited by vram + system ram? (in my case 8 + 16 = 24GB) From the little tests i have done, it is atleast clear that the "allocated" and "used" listed on dxvk hud does not in any way limit me allocating vram with the cuda app, or starting chrome or whatnot. The only thing that actually spew an error message is if i try to use the cuda app to allocate > available vram.

What i don't know is supposed to happen with this "dxvk allocation" is what happens if physical vram is full. From the tests it SEEMS as it will happily use system ram (as i guess this is the intended function). The "allocation" and "used" does not change, but WoW (according to nVidia SMI) uses less physical vram if the game is started in a vram starved situation vs. not. What was rather clear tho, is that it can seem as if once any actual data (textures and whatnot) is put in the system ram, it stays there for some reason. The tests with really starved vram makes the GPU usage 99%, and fps.. a LOT less even after i kill the cuda app, even if i then get 5GB free physical vram. Would it not be ideal if allocation blocks could be freed or moved to vram once vram is free? Or is that not a feature available to vulkan.. or perhaps a driver thing that things dont get "transfered"?

doitsujin commented 5 years ago

Would it not be ideal if allocation blocks could be freed or moved to vram once vram is free?

Indeed, but that would require recreating all Vulkan resources that are in system memory, as well as all views for those resources. This is an absolute nightmare, and I have no plans to do that.

DXVK can let the driver do the paging so that it doesn't have to recreate any resources, however that only works on drivers which support VK_EXT_memory_priority and allow over-subscribing the device-local memory heap. On Linux, this currently only works on AMD and possibly Intel drivers.

IngeniousDox commented 5 years ago

SveSop, have you tried completely disabling GLCache with __GL_SHADER_DISK_CACHE=0?

SveSop commented 5 years ago

DXVK can let the driver do the paging so that it doesn't have to recreate any resources, however that only works on drivers which support VK_EXT_memory_priority and allow over-subscribing the device-local memory heap. On Linux, this currently only works on AMD and possibly Intel drivers.

Since this extension IS available for Windows and nVidia, hopefully this COULD be a thing for Linux aswell. IF this happens, would this help in situations like this? Cos to me it kinda seems like somewhat of a drawback if resources ever get put in system ram and never moved back. I wonder if this is somewhat related to what i have tried to describe before - After playing a while (2-3hours +), the performance is worse (less fps) standing at the same spot, but restarting the game will gain back the same performance i had earlier. Maybe over time some stuff gets bumped to sysmem due to the "allocated memory" actually allocating memory outside of vram and decides to put some shit there? Cos as i have kinda proven above - allocation does not seem to have anything AT ALL to do with available vram.

Is it up to the driver not to mess this up? If i have 2GB physical vram, and DXVK allocates 4.5GB, it is feasible to think 2.5GB of that is allocated in system ram, but if i have 8GB vram, it "should" be allocated in vram... but that does not seem to be the way things actually works i guess. Can one blame the driver for putting stuff "where it seems fit", assuming VK_EXT_memory_priority extension is not available?

telans commented 5 years ago

So I assume there aren't any possible ways to temporarily fix this? (aside from reinstalling Windows...) I'm at a point in the game where I can't progress because it always crashes when loading a section of the last available mission, which is a bummer.

Let me know if there are any settings you'd like me to try, logs etc. I'd like to help resolve this if possible, but I'm not very familiar with code.

ghost commented 5 years ago

Total number of allocations can be limited, not only their size.

doitsujin commented 5 years ago

The limit is something like 4 billion on Nvidia's desktop driver.

That said, even if it was 4096 I'd be surprised if DXVK ran into the issue, the memory allocator is designed to only do a few hundred allocations at most.

7AndreyPetrov commented 5 years ago

I had the same problem with Fallout4 + Proton 4.2-7 + GTX960. The PC froze every 30-40 min; however, the problem got fixed after disabling TRANSPARENT_HUGEPAGES. Try putting transparent_hugepage=never into the Linux kernel options (grub.cfg).

telans commented 5 years ago

PC crash or game crash with memory allocation errors?

Doesn't turning it off slightly decrease performance?

ionenwks commented 5 years ago

I had same problem with Fallout4 + Proton 4.2-7 + GTX960. PC freezed each 30-40 min, however problem got fixed after disabling TRANSPARENT_HUGEPAGES. Try put transparent_hugepage=never into linux kernel options (grub.cfg).

Can't say I'm surprised, I used to use automatic huge page for tmpfs (with huge=within_size) and it frequently led to my PC fully freezing (randomly) when doing things like building software on tmpfs (took me a while to realize it was the problem). That made me lose faith in the thing and I disabled huge pages completely. The idea behind it isn't bad though, but I'd rather stay away for a while (could be fixed though, I know huge pages are actively being worked on). Transparent huge pages is however a default on a lot of distributions, I'd assume it "usually" works fine, but wine and games perhaps lead to more unusual use-cases.

This doesn't sound like it's related to this issue though.

pchome commented 5 years ago

@ionenwks

No problems for me w/ transparent huge pages enabled.

`$ zgrep TRANSPARENT_HUGE /proc/config.gz`
```
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE=y
CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD=y
CONFIG_TRANSPARENT_HUGEPAGE=y
CONFIG_TRANSPARENT_HUGEPAGE_ALWAYS=y
# CONFIG_TRANSPARENT_HUGEPAGE_MADVISE is not set
CONFIG_TRANSPARENT_HUGE_PAGECACHE=y
```

Also, I prefer "huge" ccache on regular hard drive, rather than tmpfs for building software.

Off-topic TL;DR

Even on my ancient PC everything builds very quickly (updates and subsequent rebuilds). I also use `emerge --exclude="package/name" ...` to control build times, and I usually do (re)builds during my rest/sleep times. Well, this PC got configured over time, I can even build/load/etc. when doing other things -- no freezing, no glitches, even w/o any kernel "interactivity" patches. If there is still any free RAM/Swap, and even then I can do SysRqs. No hard locks ever happen. So, my usual workflow is switching between workspaces where different things are running and the only limitation is the free RAM (my swap in tmpfs, 16 times smaller than the whole RAM size :) ). Sometimes there's a game running in background, utilizing ~100% CPU/GPU, and I don't lose DE interactivity while doing other things in the meantime. ~FYI

Maybe I'll check if transparent_hugepage=never changes anything, on next reboot.

telans commented 5 years ago

Off-topic TL;DR Maybe I'll check if transparent_hugepage=never changes anything, on next reboot.

You can just echo never >/sys/kernel/mm/transparent_hugepage/enabled

However, I did that + kernel param and it seemed to help for 20-30 minutes, but then I crashed again. Not sure if that's luck or not.
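When testing this, it is worth confirming which transparent huge page mode is actually active, since the sysfs file lists all modes and marks the current one with brackets. A minimal Python sketch (the function name is hypothetical; the path is the one used in the command above):

```python
import re
from pathlib import Path

THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(text):
    """Return the bracketed (active) mode from text like 'always madvise [never]'."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


try:
    print("active THP mode:", active_thp_mode(THP_ENABLED.read_text()))
except OSError:
    # The file is missing on non-Linux systems or kernels built without THP.
    print("transparent huge page setting is not exposed on this system")
```

After `echo never > ...` the script should report `never`; note that a value set this way does not survive a reboot, which is why the kernel parameter is also used.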

Rugaliz commented 5 years ago

I have been able to mitigate the issue as @doitsujin suggested by having everything i don't need closed. So basically i just have desktop environment running (Gnome here) plus lutris and the game. Anything else opened and sooner or later Bloodstained eventually crashes.

liam-middlebrook commented 5 years ago

I placed __GL_AllowPTEFallbackToSysmem=1 in Lutris as @telans did. Bloodstained still crashes after moving a few screens. Performance is pretty much the same though.

@Rugaliz you tried that out with the 418.74 driver, correct? Would you be able to try the Vulkan Developer Beta 418.52.10 driver?

You won't need to use the __GL_AllowPTEFallbackToSysmem environment variable with that driver. Let me know if that works without you needing to close your other applications.

Joshua-Ashton commented 5 years ago

I placed __GL_AllowPTEFallbackToSysmem=1 in Lutris as @telans did. Bloodstained still crashes after moving a few screens. Performance is pretty much the same though.

@Rugaliz you tried that out with the 418.74 driver, correct? Would you be able to try the Vulkan Developer Beta 418.52.10 driver?

You won't need to use the __GL_AllowPTEFallbackToSysmem environment variable with that driver. Let me know if that works without you needing to close your other applications.

Unrelated, but do you know when changes in that branch will make it to the mainstream drivers? There are a few other changes/fixes there that are quite useful for DXVK/D9VK.

h1z1 commented 5 years ago

I've been playing around with the simple code from SveSop above and can replicate crashing of it, problem is it's been random. What window managers are being used? One thing I've noticed with newer drivers is kwin randomly causing corruption in anything GPU related like mpv while X swallows a lot of GPU memory. Example

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.14       Driver Version: 430.14       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 108...  Off  | 00000000:43:00.0  On |                  N/A |
|  9%   51C    P0    87W / 280W |   9859MiB / 11178MiB |      4%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     11517      G   X                                           8217MiB |
|    0     38149      G   kwin                                         467MiB |
|    0     68423      G   mpv                                           10MiB |
|    0     89658      G   ...quest-channel-token=4620640961200869647    65MiB |
|    0     98390      G   ...quest-channel-token=4771647170898914487   963MiB |
+-----------------------------------------------------------------------------+

89658 and 98390 are Discord, doing what I have no idea. Point is it's quite possible to have rather large resource swings and quickly. Kwin
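To see how much of the reported usage is attributed to individual processes, the per-process rows can be summed and compared with the total. A minimal Python sketch, assuming the table layout shown above (the regular expression is an assumption tied to this driver version's output and the helper name is hypothetical):

```python
import re

# Process rows copied from the nvidia-smi output above.
PROCESS_TABLE = """\
|    0     11517      G   X                                           8217MiB |
|    0     38149      G   kwin                                         467MiB |
|    0     68423      G   mpv                                           10MiB |
|    0     89658      G   ...quest-channel-token=4620640961200869647    65MiB |
|    0     98390      G   ...quest-channel-token=4771647170898914487   963MiB |
"""

# GPU index, PID, type, process name, usage in MiB.
ROW = re.compile(r"^\|\s+\d+\s+(\d+)\s+\S+\s+(.*?)\s+(\d+)MiB \|$")


def per_process_usage(table):
    """Return {pid: MiB} for each process row in the table text."""
    usage = {}
    for line in table.splitlines():
        match = ROW.match(line)
        if match:
            usage[int(match.group(1))] = int(match.group(3))
    return usage


usage = per_process_usage(PROCESS_TABLE)
total_used = 9859  # "Memory-Usage" value from the summary table, in MiB
attributed = sum(usage.values())
print("attributed to processes:", attributed, "MiB")
print("not attributed:", total_used - attributed, "MiB")
```

In this snapshot X alone accounts for most of the used memory, which leaves little headroom for a game on an 11GB card.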

telans commented 5 years ago

I placed __GL_AllowPTEFallbackToSysmem=1 in Lutris as @telans did. Bloodstained still crashes after moving a few screens. Performance is pretty much the same though.

@Rugaliz you tried that out with the 418.74 driver, correct? Would you be able to try the Vulkan Developer Beta 418.52.10 driver?

You won't need to use the __GL_AllowPTEFallbackToSysmem environment variable with that driver. Let me know if that works without you needing to close your other applications.

Doesn't change anything for me going from 430.26 to 418.52.10

Sandok4n commented 5 years ago

I have reproduced errors: WorldOfTanks_d3d11.log WorldOfTanks_dxgi.log

DE: XFCE, memory allocation during start game ~95MB, unfortunately i didn't catch memory use during crash. And here is full run log with all start params: run.log

GitArUs commented 5 years ago

I was experiencing exactly the same memory allocation errors with Final Fantasy XIII and D9VK (it was impossible to load a savegame). The setting which helped was d3d9.evictManagedOnUnlock = True in dxvk.conf. Maybe DXVK needs something similar ?

doitsujin commented 5 years ago

D3D11 has no concept of managed memory, so the D9VK option does not apply here.

Also it's quite likely that you are simply running out of 32-bit address space in Final Fantasy XIII, I sometimes have that problem even with wined3d and d9vk makes it even worse.

liam-middlebrook commented 5 years ago

I'm running into some issues reproducing this locally with 430.26. Could someone who has a fairly consistent repro try today's DXVK release? It looks like @doitsujin added some extra logging for failed allocations that might help provide a better picture of what's going on.

I've tried with Bloodstained: Ritual of the Night and World of Tanks and I'm running latest DXVK (from Git) against Proton 4.2-8

telans commented 5 years ago

@liam-middlebrook Borderlands GOTY Enhanced debug logs with dxvk built from afe2b487a62cc62246926e11723a0277ecc42aca:

Launcher_d3d11.log

Launcher_dxgi.log

I can't see anything different from my previous logs unfortunately

pchome commented 5 years ago

@liam-middlebrook I'm not sure my logs are relevant, but just in case: dxvk-superposition-1280x720-crash.zip All data collected while the crash dialog window is up (while application still running).

Unigine Superposition benchmark with higher quality textures (-textures_quality 2), just to fill the whole VRAM (2GB). RAM: ~300MB free (~5GB in caches/buffers) before the test.

After cleaning caches I was able to launch the benchmark with the same params, DXVK HUD reports ~2700/2800MB used/allocated memory. RAM: ~6GB free before the test / ~2GB free during the test

I kind of understand and can accept such behaviour, but ... it looks like "something" just checks free mem, but not trying to allocate it. In opposite case more free RAM should be pushed out of caches by the system ... Well, I'm not good in technical details (and English).

SveSop commented 5 years ago

@pchome

In opposite case more free RAM should be pushed out of caches by the system ... Well, I'm not good in technical details (and English).

This is kind of what i have been trying to ask aswell. I do not really know how vram<->systemram (shared system ram) kinda interlink when it comes to this. I kinda have a idea how this works with system memory and swap tho, and what i believe there is that the system will move "less used shit" to swap when you get in a low system memory state (kernel tunable, but even so). If you THEN close whatever ram hungry app and start accessing apps that have memory on swap, this will be moved back to free system memory again. Might not happen immediately (and probably kernel tunable too), but you wont end up in a situation where everything is running like a -386 cos its constantly reading from swapfile, with 12GB unused physical memory.

As i said a few posts up, it might NOT be intended to work this way when it comes to vram and shared-system ram and whatnot for graphics related apps... but if it IS intended that free vram means "move data from system ram -> vram" (like it happens with swap), this does clearly not happen with DXVK.

Is it a "system" thing? "Driver" thing? "Vulkan" thing? "Intended behavior" thing? :)

pchome commented 5 years ago

Googled an example how to quickly fill cache : find . -type f -exec cat {} + > /dev/null

So, just run watch free -m in the one terminal and the "find" command in another, then stop the "find" command when "free" mem will be small enough. Then try to use DXVK. This should help systems w/ "a lot of" RAM quicker reach the behaviour described above.

ref: Experiments and fun with the Linux disk cache

p.s. the command just "reads" files from current directory into /dev/null, even if it looks scary.
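The distinction between "free" and "available" memory used in this experiment can be read directly from /proc/meminfo instead of watching `free -m`. A minimal Python sketch (function names are hypothetical, and the sample values are made up for illustration, not taken from any log in this thread):

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value in kB}."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def free_and_available_mib(text):
    """Return (MemFree, MemAvailable) in MiB."""
    info = parse_meminfo(text)
    return info["MemFree"] // 1024, info["MemAvailable"] // 1024


# Hypothetical sample: little free memory, but most of the cache is reclaimable.
SAMPLE = """\
MemTotal:        8052360 kB
MemFree:          307200 kB
MemAvailable:    5427200 kB
Cached:          4915200 kB
"""

print(free_and_available_mib(SAMPLE))

try:
    with open("/proc/meminfo") as handle:
        print(free_and_available_mib(handle.read()))
except (OSError, KeyError):
    pass  # not on Linux, or the kernel does not report MemAvailable
```

After filling the page cache with the `find` command, MemFree should drop sharply while MemAvailable stays comparatively high, which is the state in which the crashes were reported to be more frequent.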

mozo78 commented 5 years ago

Here are my logs from Frostpunk which is constantly crashing in ~20 minutes: Frostpunk_dxgi.log Frostpunk_d3d11.log

Dvakote commented 5 years ago

Kernel: 4.19.49-1-MANJARO Proton 4.2-9 CPU: AMD Ryzen 5 1600 Six-Core Processor GPGPU: NVIDIA Corporation GeForce GTX 960 (2GB) Driver: 4.6.0 NVIDIA 430.14 RAM: 16053 MB

Game: VALKYRIE DRIVE -BHIKKHUNI- (steam id 550080) Game randomly crashes during loading with (err: DxvkMemoryAllocator: Memory allocation failed) I looked into logs and decided that it's better to go with this problem directly to dxvk issues. Proton log: steam-550080.log VD_BHIKKHUNI_dxgi.log VD_BHIKKHUNI_d3d11.log Tell me if I missed something.

JonasKnarbakk commented 5 years ago

I get a pretty guaranteed crash now when loading into World of Warcraft. Saw you added more debugging info in the latest commits so I did a build from master in hopes that I could be helpful in locating the issue.

Kernel: 5.1.15-arch1-1-ARCH CPU: AMD Ryzen 7 2700X GPU: GeForce GTX 980 Driver: 430.26.0 Vulkan: 1.1.99 Wine version: ge-protonified-4.10-x86_64

Wow_d3d11.log Wow_dxgi.log

I set all the debugging environment variables I could see in the readme. Tell me if I'm missing anything, I'm fairly certain that I can reproduce the crash by logging in to the same character.

doitsujin / dxvk

Games crash on Nvidia due to memory allocation failures #1100
