IGCIT / Intel-GPU-Community-Issue-Tracker-IGCIT

IGCIT is a Community-driven issue tracker for Intel GPUs.
GNU General Public License v3.0

Windows system and virtual memory leak #616

Closed Pleune closed 4 months ago

Pleune commented 9 months ago

Checklist [README]

Application [Required]

Occurs eventually after long-term use

Processor / Processor Number [Required]

Intel 12600k

Graphic Card [Required]

A770LE

GPU Driver Version [Required]

31.0.101.4953

Rendering API [Required]

Windows Build Number [Required]

Other Windows build number

No response

Intel System Support Utility report

ssulog.txt

Description and steps to reproduce [Required]

Leave the computer on for a long time, using it occasionally. Over time, both the physical memory and paged pool numbers grow dramatically, although Task Manager does not attribute this to any application.

If I open the Windows Performance Monitor and add the "GPU Process Memory - Total Committed" counter, I can see quite a bit of memory tied up across a handful of processes. The graph below shows all GPU Process Memory counters on a 0-1 GB scale.

image

and raw text data: perfmon gpu process memory total committed props.txt
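For readers who want to turn raw counter data like this into a per-process ranking, here is a minimal Python sketch. It assumes the counter instances are named in the form `pid_<PID>_luid_<...>_phys_<N>` and that the values are in bytes; both are assumptions to check against your own export. The function name and the sample numbers are hypothetical and are not taken from the attached file.

```python
import re
from collections import defaultdict

# Assumed instance-name format, e.g. "pid_4520_luid_0x00000000_0x0000C8A3_phys_0".
_PID_RE = re.compile(r"^pid_(\d+)_")


def committed_by_pid(samples):
    """Sum 'Total Committed' bytes per PID from a {instance_name: bytes} mapping."""
    totals = defaultdict(int)
    for instance, value in samples.items():
        match = _PID_RE.match(instance)
        if match is None:
            continue  # skip instances that do not follow the assumed format
        totals[int(match.group(1))] += int(value)
    # Largest consumers first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    example = {
        "pid_4520_luid_0x00000000_0x0000C8A3_phys_0": 700 * 1024**2,
        "pid_4520_luid_0x00000000_0x0000D1F0_phys_0": 100 * 1024**2,
        "pid_980_luid_0x00000000_0x0000C8A3_phys_0": 250 * 1024**2,
    }
    for pid, total in committed_by_pid(example):
        print(f"PID {pid}: {total / 1024**2:.0f} MiB")
```

The PIDs can then be matched against the PID column in Task Manager's Details tab to see which processes hold the committed memory.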

It usually takes about 2 weeks to become an issue for me, by which point around 16 GB of system memory has leaked and the paged pool is usually around 50 GB. After some searching, I found this thread, which may be related: https://community.intel.com/t5/Graphics/arc-a380-a750-qsv-enc-sys-paged-pool-memory-leak/td-p/1465974. I do run Sunshine, a remote control server application, as well as Parsec, both of which stay running in the background. However, the leak still builds up even when I do not connect to the computer during that period, that is, even when no encoding is being done.

[5 images attached]

I don't really know how to find more log information that could be helpful. If there is anything to add, I would be happy to capture it before I have to reboot the next time.
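One low-effort way to gather more evidence before a reboot is to note the paged pool size at intervals and estimate a growth rate from those readings. The Python sketch below fits a least-squares line through (hours since boot, paged pool in GB) samples. The function name is hypothetical and the sample values are made up, chosen only to resemble the roughly 50 GB in two weeks described above.

```python
def leak_rate_gb_per_day(samples):
    """Least-squares slope of (hours_since_boot, paged_pool_gb) samples, in GB/day."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        raise ValueError("samples must span more than one point in time")
    cov = sum((t - mean_t) * (y - mean_y) for t, y in samples)
    return cov / var_t * 24.0  # slope is in GB/hour; convert to GB/day


if __name__ == "__main__":
    # Hypothetical readings: (hours since boot, paged pool size in GB).
    readings = [(0, 0.5), (24, 4.0), (72, 11.2), (168, 25.5), (336, 50.3)]
    print(f"Estimated growth: {leak_rate_gb_per_day(readings):.2f} GB/day")
```

A roughly constant positive slope across several days would point to a steady leak, while a slope that flattens out would suggest ordinary caching instead.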

Also note I am one major driver release behind, simply because the computer needs to be running for such a long time for the issue to be noticeable.

Device / Platform

No response

Crash dumps [Required, if applicable]

No response

Application / Windows logs

No response

Arturo-Intel commented 9 months ago

@Pleune thank you for the report. How long is a "long time"? Hours, days, weeks?

--r2

Pleune commented 9 months ago

It becomes a problem on my system after a couple of weeks.

abrfilho commented 9 months ago

I'll try to monitor the same thing here. When I play Warzone 2, the game crashes after some (random) amount of time. Event Viewer says the process was closed due to lack of memory. I have 32 GB of RAM and a 12800 MB page file. I never thought that this could be caused by the GPU.

Arturo-Intel commented 9 months ago

@abrfilho normally we would ask you to open a new case for Warzone 2 + game crash, but it is plausible that the crash you are seeing is related to this thread.

What GPU model do you have? What CPU? Can you share your SSU log?

abrfilho commented 9 months ago

@Arturo-Intel igcit_ssu.txt

I have an Arc A770 from Acer paired with a Ryzen 5 5600.

Karen-Intel commented 9 months ago

@abrfilho TY for the info. We're already setting up a system, and our testing will run for some weeks, as you indicated.

We will let you know our results here once it is done.

Karen

Ilya-intel commented 8 months ago

@abrfilho FYI, I'm still testing the system with the reported software. No problems so far, but I'll keep looking.

abrfilho commented 8 months ago

Thanks for the feedback. I usually play with Discord open, but this is the only game where this happens. I'll try to get a report from Performance Monitor like the one the OP posted.

abrfilho commented 8 months ago

I tried to play yesterday with a friend; it was almost impossible.

image image

English: Windows successfully diagnosed an insufficient virtual memory condition.

I have 32 GB of physical memory and my virtual memory is managed by the system. Changing the in-game setting for VRAM usage does nothing. Perhaps I'm suffering from a different problem? I don't know...

Ilya-intel commented 8 months ago

@abrfilho I have a strong feeling that this issue is caused by the page file size. I'd advise you to check it and see if the value is set to manual. Setting it to Auto usually resolves such issues. Settings -> System -> About -> Advanced System Settings -> Advanced Tab -> Performance Settings -> Settings -> Advanced Tab (again lol) -> Virtual Memory -> Change

abrfilho commented 7 months ago

My virtual memory was set to auto. I changed it to a minimum of 16 GB and a maximum of 32 GB, which mitigated the problem but didn't solve it. I noticed that the game starts out reporting VRAM consumption around 50% and it keeps increasing; when it's almost at the limit, the game textures glitch.

Ilya-intel commented 6 months ago

@Pleune, I've been testing this behavior for a couple of months on three different systems with Arc Graphics, and none of them showed memory leaks; each was stable for ~2 weeks. That makes me think there is something else behind your issue. Have you tried a Windows reinstall, or at least checking system file integrity? It is also worth trying the latest 101.5333 driver: https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html

Pleune commented 6 months ago

Thank you for testing this. It is still happening on my computer, but it has slowed significantly recently, possibly because of newer drivers. I have tried, without success, to figure out what on my PC triggers it, although it does not occur if I remove the card and use only integrated graphics. My best guess is still that Sunshine, Parsec, or Jellyfin is causing the leak while creating encoding/decoding contexts, as this is the only thing going on in the background that is using the card, afaik.

My workaround has been to significantly increase my pagefile to allow Windows to absorb ~100 GB of buildup, and to make sure I reboot every now and then.
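As a rough sanity check on how often such a reboot is needed, the arithmetic can be sketched in Python. This assumes the growth is roughly linear, which the thread does not confirm, and plugs in the approximate figures reported earlier (about 50 GB of paged pool after about 14 days, and about 100 GB of headroom). The function name is hypothetical.

```python
def days_until_exhausted(headroom_gb, leak_gb_per_day):
    """Days until a steady leak consumes the available headroom."""
    if leak_gb_per_day <= 0:
        return float("inf")  # no growth, so the headroom is never consumed
    return headroom_gb / leak_gb_per_day


if __name__ == "__main__":
    # Approximate figures from the thread; linear growth is an assumption.
    rate = 50.0 / 14.0  # GB per day
    days = days_until_exhausted(100.0, rate)
    print(f"At {rate:.1f} GB/day, 100 GB lasts about {days:.0f} days")
```

Under those assumptions the headroom lasts about four weeks, so a reboot interval comfortably below that would keep the system out of trouble.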

I will try a Windows reinstall soon. Even though I have run DDU and have tried to keep things tidy, this is still an install that has had Nvidia drivers in the past, so it is possible that something is lingering.

abrfilho commented 6 months ago

And I still have problems with Warzone 2. I increased the page file, but this only made the problem take longer to appear. It starts with some texture corruption, then the game crashes; the VRAM usage starts around 50% and keeps increasing.

Pleune commented 6 months ago

I think this may be a different issue. I have never had VRAM OOM issues, although I don't play Warzone. My problem is only the usage of virtual memory space on the Windows side.

Ilya-intel commented 6 months ago

VRAM usage is a different story; you can create a separate thread if you want us to dive deeper into it. But AFAIK we haven't seen such issues with Warzone 2, so it might be OS related.

abrfilho commented 6 months ago

The system dumps the occupied but unused VRAM into the virtual memory, leading to the error. I'll try a system repair to see if anything changes.

Ilya-intel commented 5 months ago

Hey @abrfilho, any good news?

abrfilho commented 5 months ago

Hello, sorry for the delay!

I played Warzone this week. Once, using the lowest settings, there was no problem and VRAM usage was stable. When I set the textures to a higher setting, VRAM usage increased during matches until it reached the maximum allowed in the settings, but this time the game didn't crash. I played using the latest driver (5382). I didn't check whether the game was dumping into virtual memory, which was the behavior before.

Ilya-intel commented 5 months ago

Hi @Pleune, any updates? Have you tried an OS reinstall?

Pleune commented 4 months ago

I have not changed anything in my system other than automatic Windows updates and occasionally updating the Intel graphics drivers. It's hard to say when it stopped happening, but I currently do not see any memory issue. My paged pool in Task Manager has stayed near zero since I last rebooted and updated the driver about a week ago. Previous to that, the system had been running for, I believe, two months. During that time I did not specifically check whether memory was growing, but it never presented as a problem.

It seems to me that my original issue has either been fixed or significantly reduced! I know others have chimed in here, so I won't close this immediately in case those issues still exist. Otherwise, I would recommend closing 👍🏼

Ilya-intel commented 4 months ago

@IGCIT Seems like the issue is fixed with the latest driver. Kindly close this thread.