giampaolo / psutil

Cross-platform lib for process and system monitoring in Python
BSD 3-Clause "New" or "Revised" License

[Windows] Incorrect values in swap_memory - percent and used #2431

Open Andrej730 opened 3 months ago

Andrej730 commented 3 months ago

Summary

Description

I used this snippet for tests:

import psutil
from psutil._pswindows import cext

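# Convert a byte count to GiB.
to_gb = lambda b: b/(2**30)

# The first four fields are read here as total/free physical memory and
# total/free "system" memory (physical + page file), all in bytes.
mem = cext.virtual_mem()
total_phys = mem[0]
free_phys = mem[1]
total_system = mem[2]
free_system = mem[3]

# Subtracting the physical part from the system values leaves the swap share.
total_swap = total_system - total_phys
free_swap = free_system - free_phys
used_swap = total_swap - free_swap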
print("used calculated manually", to_gb(used_swap))

print("\ncalculating using percent")
print("used", to_gb(cext.swap_percent() * total_swap * 0.01))
print("%", cext.swap_percent())

print("\nswap memory data")
print("used", to_gb(psutil.swap_memory().used))
print("%", psutil.swap_memory().percent)

And here are the results I got on Python 3.11 + Windows 11:

used calculated manually 5.351310729980469

calculating using percent
used 0.8192634582519531
% 1.6778515625000001

swap memory data
used 0.8192634582519531
% 1.7

From what I see in Task Manager, 5.35 GB is much closer to the truth: Task Manager shows 13.0 GB of physical memory in use and 18.4 GB of total system memory in use, which gives 18.4 - 13.0 = 5.4 GB. So the manual calculation works better in this case than the current approach, which derives "used" from the percent.

image


Results from Windows 11 + Python 3.12:

used calculated manually 5.321750640869141

calculating using percent
used 0.79473876953125
% 1.6276249999999999

swap memory data
used 0.79473876953125
% 1.6

Manual calculations are still much more accurate.


I've also tried the same script on Windows 10 + Python 3.12:

used calculated manually 53.104225158691406

calculating using percent
used 0.5125198364257812
% 0.17494010416666667

swap memory data
used 0.5125198364257812
% 0.2

There seems to be some other issue at play here. The manual and percent-based calculations give similar figures apart from a factor of 100, and the screenshot below suggests the percent-based one is actually more accurate (167 - 116 = 51 GB). BUT swap_percent seems to be divided by 100 by accident: a correct swap_percent() would be 17.494%, giving 51.252 GB; instead we get 0.2% and 0.5125 GB.

image

As a workaround on this machine, I have replaced percentswap = cext.swap_percent() with percentswap = cext.swap_percent() * 100.
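The factor-of-100 observation can be checked from the printed numbers alone. The sketch below is only arithmetic on the Windows 10 output above: it backs out the swap total implied by the reported percent and "used" values (assuming used = percent * total * 0.01, as in the test snippet) and recomputes "used" with the percent scaled by 100. The variable names are made up for illustration, and the result does not settle which interpretation is correct.

```python
# Values copied from the Windows 10 + Python 3.12 run above.
reported_percent = 0.17494010416666667   # cext.swap_percent()
reported_used_gb = 0.5125198364257812    # psutil.swap_memory().used, in GiB

# Back out the swap total implied by used = percent * total * 0.01.
implied_total_swap_gb = reported_used_gb / (reported_percent * 0.01)

# Recompute "used" as if the percent were 100 times larger (17.494%).
rescaled_percent = reported_percent * 100
rescaled_used_gb = rescaled_percent * implied_total_swap_gb * 0.01

print(f"implied swap total: {implied_total_swap_gb:.1f} GiB")
print(f"rescaled used:      {rescaled_used_gb:.3f} GiB")
```

The implied total comes out a little under 293 GiB, which is in line with the 292 GB swap size given for this machine later in the thread.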

As this issue is probably related to #2160, ping @dbwiddis

dbwiddis commented 3 months ago

From what I see in task manager number 5.35GB is much better resembles the truth

If the value you want is "committed", then that accurately represents it. However, that doesn't actually measure the amount used, consistently over-estimating and leading to calculations where swap used exceeds swap total. See #2074 for several examples where math as you propose produced those results.

Windows reserves memory (decrementing "virtual_free", which is the sum of physical and swap free) before it even allows a program to proceed past the malloc() call, expanding the paging file if necessary to make sure there's always enough space. Unlike Linux, it's impossible for Windows to allocate memory to a program that it hasn't already reserved; the physical memory + paging file together always exceed the total amount reserved by any program. However, even if that memory is reserved, it isn't "used" (charged against programs) until the memory is actually accessed. Thus the "used" reported by Task Manager is actually just "reserved".

Presently psutil reports the percentage that comes from the Windows Performance counter. Rather than trying to do math from task manager, open up perfmon on your machine and look for the performance counter for "Paging File(_Total)\% Usage".

I humbly submit that is the "truth" that psutil should (and does) match.

The same value in WMI can be calculated from Win32_PageFileUsage, which is also used to unit-test the PDH counter code, where CurrentUsage is the numerator and AllocatedBaseSize is the denominator:

CurrentUsage

Amount of disk space currently used by the page file.

AllocatedBaseSize

Actual amount of disk space allocated for use with this page file.
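As a rough illustration of that ratio, here is a minimal sketch. It assumes the two Win32_PageFileUsage fields have already been read by some other means (both are reported in megabytes); the function name and the sample numbers are hypothetical and are not part of psutil.

```python
def page_file_usage_percent(current_usage_mb: int, allocated_base_size_mb: int) -> float:
    """Return page file usage as a 0-100 percentage.

    current_usage_mb:       Win32_PageFileUsage.CurrentUsage (numerator)
    allocated_base_size_mb: Win32_PageFileUsage.AllocatedBaseSize (denominator)
    """
    if allocated_base_size_mb <= 0:
        # No page file configured: report 0 rather than dividing by zero.
        return 0.0
    return 100.0 * current_usage_mb / allocated_base_size_mb


# Hypothetical sample: 840 MB used out of a 49152 MB (48 GB) page file.
example_percent = page_file_usage_percent(840, 49152)
print(f"{example_percent:.2f}%")
```

Returning 0.0 for a missing page file is a design choice made for this sketch only; a caller might prefer to raise an error instead.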

This interpretation exactly matches how Linux calculates swap file usage, ignoring the aggressive/pre-emptive over-allocation (commit limit) that you see on Task Manager. Given that psutil is a cross-platform program, I believe consistency across operating systems takes priority over matching the Task Manager. (Windows takes several other liberties with the task manager on CPU usage, but that's a totally different subject!)

It seems that there is some other issue at play... swap_percent seems divided by 100 by accident

It's not by accident, it's rather intentional. Percentage is a fraction between 0 and 1, and only gains the 0 to 100 interpretation when the percent symbol is used as a "unit". Try opening an Excel spreadsheet and typing 0.5 in a cell. Then "format" it using percentage and you'll see it's 50%.

Yes, I understand that a variable named "percent" is expected to be from 0 to 100 and used that assumption in https://github.com/giampaolo/psutil/pull/2160/commits/6dc64c46ad970fe278008c77607a3839f4186807 but realized unit tests were failing because psutil already reported a fraction, so I had to revert that change in https://github.com/giampaolo/psutil/pull/2160/commits/0b0ad4964286c2a2e7e925d6a1a0b0841d4ee0b9 to restore compatibility with the existing usage. It's confusing, but it's always been that way.
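The fraction-versus-percent distinction is easy to see in Python's own number formatting, which behaves like the Excel example: the `%` format type multiplies by 100 and appends a percent sign. This is only a sketch of the two conventions, not psutil code, and the variable names are illustrative.

```python
fraction = 0.5  # a ratio in the range 0..1

# The "%" format type multiplies by 100 and appends "%".
as_percent_text = f"{fraction:.0%}"
print(as_percent_text)  # 50%

# A value already on the 0..100 scale must not be formatted this way,
# otherwise it is scaled twice.
already_percent = 50.0
double_scaled_text = f"{already_percent:.0%}"
print(double_scaled_text)  # 5000%
```

Whichever convention a counter uses, the consumer has to apply the matching scale factor exactly once.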

TLDR:

dbwiddis commented 3 months ago

From what I see in task manager number 5.35GB is much better resembles the truth (as you an see in task manager 13.0GB of physical memory in use and 18.4GB of all system memory in use, which gives us 18.4 - 13.0 = 5.4 GB).

The 13.0 GB is not physical memory used, it's the total virtual memory, the combined physical + swap "in use".

The 0.82 GB swap used is a portion of the 13.0GB; you have 12.18GB of physical memory used.

This combined 13.0 GB is a fraction of the 18.4GB physical+swap "committed", which is a fraction of the 64.5GB physical+swap "total".

Andrej730 commented 3 months ago

It's not by accident, it's rather intentional. Percentage is a fraction between 0 and 1, and only gains the 0 to 100 interpretation when the percent symbol is used as a "unit". Try opening an Excel spreadsheet and typing 0.5 in a cell. Then "format" it using percentage and you'll see it's 50%.

So in the Windows 10 case, is it just a coincidence that the small amount of used memory I get (0.5125 GB) resembles the manually calculated used (I guess committed) memory (53.104 GB) divided by 100, and the 0.2% is actually correct and shouldn't be 17.494%?

The 13.0 GB is not physical memory used, it's the total virtual memory, the combined physical + swap "in use".

I'm not sure that's true, since 13.0 GB is the same number shown in this other Task Manager window, and I've never seen it go beyond 15.7 GB (which is the physical limit), though that is what it would do if it also included swap in use.

image

Here's an example: a program allocates 1 GB multiple times, and each 1 GB is added to "in use" until there is no more physical memory and it switches to the swap file. By the way, is this also a case of "reserved" rather than "used" memory?

https://github.com/user-attachments/assets/44646a8d-abd7-4cfb-9e0d-24d5da52c5ce

dbwiddis commented 3 months ago

So in the Windows 10 case ....

Without a full description of actual system values (physical memory size, swap file size) I can only go on what's in the screen cap. But the overall point here is that Windows memory stats on the Task Manager reference the "commit limit" and "committed" memory.

As long as you have free physical memory, it's probably going to be used, and you'll have minimal swap file usage. Only when you exceed physical memory limit does Windows have to decide what to keep in RAM and what to put on disk in the swap file; programs don't know the difference, and Windows will page swap as needed to optimize the actual location of the same virtual address.

I've watched your video, but I still don't know your actual physical memory limit or how big your swap file is, which I'd need to give context for the interpretation. But generally it looks like your bytearray(size) is adding to both committed and in use, plus a bit more overhead added to committed.

From a Task Manager perspective:

I'm not sure if it's true, since 13.0GB is the same number you see in this another window of task manager

I honestly don't trust most Task Manager displays.

Even for program memory usage, you have to go into "Working Set" and the default shown isn't even the one you want, it's another click away. It's like they intentionally want to hide the actual memory used.

Resource Monitor or PerfMon give actual counter values; Task Manager is an attempt to show you a limited amount of information for a very high level idea of resource usage.

Andrej730 commented 3 months ago

Without a full description of actual system values (physical memory size, swap file size) I can only go on what's in the screen cap. But the overall point here is that Windows memory stats on the Task Manager reference the "commit limit" and "committed" memory.

As long as you have free physical memory, it's probably going to be used, and you'll have minimal swap file usage. Only when you exceed physical memory limit does Windows have to decide what to keep in RAM and what to put on disk in the swap file; programs don't know the difference, and Windows will page swap as needed to optimize the actual location of the same virtual address.

Not sure if this helps, but this system has 192 GB RAM + 292 GB swap. In the screenshot, 116 GB of RAM is used as RAM and 76 GB of RAM as shared VRAM (not visible in the screenshot; it only shows on the GPU tab), so all RAM is used and it should have started using the swap file. That it's using just 0.5 GB of the swap file, as psutil shows, while the PC is under very heavy load and all actual RAM is used seems very off.


Okay, even leaving Task Manager aside, within psutil itself isn't it kind of inconsistent that allocated memory counts as "used" while it fits in physical memory, but once we're out of physical memory, newly allocated memory is considered not used but, I guess, "committed" / "reserved", and we don't really see it in swap_memory().used?

import psutil
gb = 2**30
to_gb = lambda b: b/gb

m = []
for i in range(30):
    m.append(bytearray(gb))
    print(to_gb(psutil.virtual_memory().used), to_gb(psutil.swap_memory().used))

# 9.743854522705078 1.6344375610351562
# 10.762199401855469 1.6344375610351562
# 11.760143280029297 1.6344375610351562
# 12.759044647216797 1.6344375610351562
# 13.756885528564453 1.6343154907226562
# 14.759471893310547 1.6343154907226562
# 15.367023468017578 1.6166534423828125
# 15.36703872680664 1.6166534423828125
# 15.3673095703125 1.6155014038085938
# 15.367111206054688 1.6155014038085938
# 15.36712646484375 1.6150856018066406
# 15.366615295410156 1.6150856018066406
# 15.367206573486328 1.6149635314941406
# 15.696582794189453 1.6122550955042243
# 15.631362915039062 1.6257438659667969
# 15.685523986816406 1.6257438659667969
# 15.658435821533203 1.6238098135218024
# 15.672370910644531 1.6848297119140625
# 15.639583587646484 1.687057494185865
# 15.672103881835938 1.8699607839807868
# 15.388587951660156 2.4693870544433594
# 15.384143829345703 2.745319366455078
# 15.414592742919922 2.7424049377441406
# 15.646942138671875 2.7895278930664062
# 15.66757583618164 3.0128555297851562
# 15.708641052246094 3.005725860595703
# 15.686126708984375 3.0018234252929688
# 15.641899108886719 3.02587890625
# 15.6419677734375 3.1997299194335938
# 15.683547973632812 3.2514152517542243

dbwiddis commented 3 months ago

Not sure if that helps but this system has 192GB RAM + 292GB swap, on the screenshot it's used 116 GB RAM as RAM and 76GB RAM as shared VRAM (can't see it on the screenshot, it's visible only on GPU tab)

The 116GB "in use" on the screen shot doesn't differentiate between physical or swap memory. From the OS perspective, there is 484GB of addressable memory. Programs don't care whether the page they are accessing is in RAM or on disk (and in fact it may move back and forth between them without the program knowing or caring).

I'm not sure (I doubt it) whether GPU memory is relevant to this whole discussion.

so all RAM is used and it should have started to use swap file - and that it's just using 0.5GB swap file in that case, as psutil shows, when the pc is currently under a very heavy load and all actual RAM is used seems very off.

The 116 GB "in use" does not exceed the 192 GB of physical RAM. There is still no need to use the swap file.

Allocate memory to programs to the point that "in use" exceeds physical RAM size and you will see an increase in swap file usage.

Reserve memory to the point where "commit limit" reaches total virtual space (167/484 becomes 484/484) and you will see the swap file get bigger to make it 484/512 (or something larger).

Okay, even outside task manager, just within psutil isn't it kind of inconsistent that allocating memory seems to be "used" when it's part of physical memory but when we're out of physical memory new allocated memory is considered not used

No. "Used" has zero relevance to the location of the memory. "Used" memory might be in RAM or it might be on disk. Individual pages might move back and forth between them, all while remaining in use.

but, I guess, "commited" / "reserved" and we don't really see it in the swap_memory().used?

Correct. "Used" does not include the commit limit. The commit limit is a Windows-only thing for ultra-conservative safety to prevent OOM crashes of the OS, and psutil tries to keep the "used" interpretation consistent across operating systems.

Andrej730 commented 3 months ago

No. "Used" has zero relevance to the location of the memory. "Used" memory might be in RAM or it might be on disk. Individual pages might move back and forth between them all while remaining being used.

But doesn't the last snippet I showed make a program use 30 GB of memory, with some of those 30 GB showing up in neither virtual_memory().used nor swap_memory().used? Isn't it an issue that memory is used but never counted?

Or am I still missing something in the definition of "used" memory, and something else needs to be done to make it "used" rather than "reserved"?
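To put numbers on that question, the sketch below adds up the first and last rows of the output posted earlier and compares the growth with the 29 GiB allocated between those two rows. It is arithmetic on the printed values only, with illustrative variable names, and it does not say where the remainder is accounted for. Other processes also move these system-wide counters, so the result is approximate.

```python
# (virtual_memory().used, swap_memory().used) in GiB, copied from the output.
first_row = (9.743854522705078, 1.6344375610351562)   # after 1 allocation
last_row = (15.683547973632812, 3.2514152517542243)   # after 30 allocations

# One 1 GiB bytearray is appended per loop iteration.
allocated_between_gb = 30 - 1

# Growth of RAM "used" plus swap "used" between the two rows.
growth_gb = sum(last_row) - sum(first_row)

# Portion of the allocations that shows up in neither counter.
not_visible_gb = allocated_between_gb - growth_gb

print(f"growth in used (RAM + swap): {growth_gb:.2f} GiB")
print(f"allocated but not visible:   {not_visible_gb:.2f} GiB")
```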

Reserve memory to the point where "commit limit" reaches total virtual space (167/484 becomes 484/484) and you will see swap file get bigger to make it 484/512 (or something larger)).

From what I've seen, the swap file never really grows until the user increases it in the system settings; e.g. if I try to allocate more than the 64 GB I have, I get a MemoryError.

dbwiddis commented 3 months ago

But isn't the last snippet I've showed is making a program to use 30 GB of memory and some of those 30 GB are not present neither in virtual_memory().used and swap_memory().used?

It's confusing to me because your original screencaps showed two different systems: one in English with 64.5 GB total virtual memory, and another with 484 GB. You've alternated between talking about the larger system with 192 GB RAM + 292 GB swap and posting a program that seems to max out physical memory at a much lower number, and you still haven't told me what that physical memory limit is.

I do not know the fine-grained details of what counts as "reserved" vs. "used" at the operating system level, I am only interpreting what I read in the documentation. I attempted to summarize this whole thing succinctly in these comments in psutil source related to the code we're discussing: https://github.com/giampaolo/psutil/blob/c034e6692cf736b5e87d14418a8153bb03f6cf42/psutil/_pswindows.py#L257-L263

TLDR:

  1. The "commit limit" is relevant as the sum of physical RAM and swap files.
  2. "Committed" memory is "reserved", not used, and is not reported by psutil; it is a Windows-only interpretation. When "committed" hits the commit limit, you have to either expand swap, or fail to allocate memory.
  3. "Used" memory is actually mapped in a program to a variable or other way that it can be read/written/etc. beyond the original allocation. Used is always <= committed. Task Manager doesn't tell you where the used memory resides. It can be RAM or Disk or both, or moving back and forth in between when you measured it.
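The relationships in this list can be written down as a small sketch. The class and field names below are hypothetical (psutil exposes no such type), and the sample figures are the approximate ones quoted in this thread for the larger machine: 192 GB RAM, 292 GB swap, 167 GB committed, 116 GB in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowsMemorySnapshot:
    """Illustrative container, not a psutil type; all sizes in GB."""
    physical_total: float
    swap_total: float
    committed: float   # reserved by programs, not reported by psutil
    used: float        # actually in use

    @property
    def commit_limit(self) -> float:
        # Point 1: commit limit = physical RAM + swap files.
        return self.physical_total + self.swap_total

    def is_consistent(self) -> bool:
        # Point 3: used <= committed; point 2: committed <= commit limit.
        return self.used <= self.committed <= self.commit_limit

    def commit_headroom(self) -> float:
        # How much more can be committed before swap must grow
        # or an allocation fails.
        return self.commit_limit - self.committed


snapshot = WindowsMemorySnapshot(physical_total=192, swap_total=292,
                                 committed=167, used=116)
print(snapshot.commit_limit, snapshot.is_consistent(), snapshot.commit_headroom())
```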

From what I've seen swap file is never really increases until user will increase it in the system, e.g. if I try to allocate more than 64 GB I have I get MemoryError.

There's a check box to allow Windows to "Automatically manage paging file size". You clearly don't have that box checked. If you did your paging file would grow and avoid the MemoryError.

Andrej730 commented 3 months ago

It's confusing to me because your original screencaps showed two different systems, one in English with 64.5GB total virtual memory, and another system with 484; you've alternated between talking about the larger system with 192GB RAM + 292GB swap and then posting a program that seems to max out physical at a much lower number, that you still haven't told me what that physical memory limit is.

Sorry, that was the one with 16 GB RAM + 48 GB swap, so 30 GB exceeds the RAM here and it starts to use swap. Would you agree that in this case the 30 GB is actually used but the current metrics miss part of it?

I'm not sure (I doubt it) whether GPU memory is relevant to this whole discussion.

There is shared GPU memory, where the GPU can use up to 50% of RAM as VRAM (though it never seems to use the swap file), and it never appears as "in use" in Task Manager even though it is used 🤔. I guess it's a separate subject and it could make everything even more confusing, so let's avoid it.

And sorry if this conversation is getting too long, it's been very insightful for me so far 😄

dbwiddis commented 3 months ago

Sorry, this was the one with 16GB RAM + 48 GB swap, so 30 GB exceeds the RAM here and starting to use swap memory.

Got it.

Would you agree in this case that this 30GB is actually used but current metrics miss some part of it?

I don't know the details of what counts as "used" vs. "committed". I do know the interpretation of "used" is consistent with the Task Manager "In Use" value which is a total of both physical and swap "in use", and I do know that the "committed" total isn't represented anywhere in psutil.

This link may be helpful in explaining technical details: https://scorpiosoftware.net/2023/04/12/memory-information-in-task-manager

dbwiddis commented 3 months ago

So a correction to my earlier comments. It does appear the Task Manager "in use" is only the physical. However, the comments about "Commit Limit" remain. So it appears Task Manager displays these values: