Closed: MaximeVdB closed this issue 3 years ago.
The size is what ps would return, as per its man page:
size SIZE approximate amount of swap space that would be
required if the process were to dirty all
writable pages and then be swapped out. This
number is very rough!
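These two columns can be queried directly with ps for any process; a small sketch (my own, assuming a procps-style ps on the PATH, as on Linux) that reads them for the current process:

```python
import os
import subprocess

# Query the 'size' and '%mem' columns for the current process.
# Assumes a procps-style ps (Linux); 'size' is in kB, '%mem' in percent.
out = subprocess.run(
    ["ps", "-o", "size=,%mem=", "-p", str(os.getpid())],
    capture_output=True, text=True, check=True,
).stdout.split()
size_kb, pct_mem = int(out[0]), float(out[1])
print(size_kb, pct_mem)
```

The trailing `=` in the format specifiers suppresses the column headers, which makes the output easy to parse.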
Thanks, so then I take it that the values of the "size (kb)" and "%mem" columns will only follow the above relationship in certain situations, and not in general? And that "%mem" is (usually) more relevant than "size (kb)" in assessing the actual memory usage (to e.g. avoid out-of-memory errors)? If so, then I think this should be made clear in the documentation of monitor.
The following script provides an extreme example, where '%mem' is zero whereas the 'size' exceeds the 192 GB of available memory on the machine:
from time import sleep
import numpy as np

num = 100000  # each (num, num) float64 array is about 80 GB
# np.empty only reserves virtual memory; the pages are never written
x = np.empty((num, num), dtype=np.float64)
y = np.empty((num, num), dtype=np.float64)
z = np.empty((num, num), dtype=np.float64)
sleep(300)  # keep the process alive so monitor can sample it
Well, not quite. The example that you mention is precisely something where size may come in handy. The memory is reserved and could be filled very quickly, perhaps faster than monitor is going to pick up on (depending on delta). So the application crashes while %mem is still reasonable, but in fact the memory is exhausted between sample points. Looking at size tells you that this could indeed have happened.
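The scenario described above can be illustrated with a toy simulation (my own sketch, not monitor's actual code): a memory spike shorter than the sampling interval is invisible in the samples, while a large 'size' value would still hint that it was possible.

```python
# Toy trace: process memory (GB) over 10 s, with a short-lived spike to
# 200 GB between t = 5.3 s and t = 5.4 s.
trace = {round(t / 10, 1): 1.0 for t in range(100)}
trace[5.3] = trace[5.4] = 200.0

delta = 1  # sampling interval in seconds (monitor's delta)
samples = [trace[float(t)] for t in range(0, 10, delta)]

peak_seen = max(samples)         # what the sampler reports
peak_real = max(trace.values())  # what actually happened
print(peak_seen, peak_real)      # the spike falls between sample points
```

With a one-second delta, every sample shows 1 GB even though the process briefly needed 200 GB.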
I did not mean to say that the 'size (kb)' value is not useful -- it definitely is! It's just that people might think it refers to the amount of (physical) memory actually in use by the process, which is not true (or only in specific cases). For that information, one should rather look at the '%mem' value.
(For the record: the 'extreme' example above does not crash, since there is not necessarily a problem when the virtual memory usage exceeds the actual amount of available physical memory. But if it were to crash (when starting to actually use physical memory for those arrays), then the "size (kb)" value would indeed be helpful to find out what happened.)
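The reserve-now, fault-in-later behaviour can be demonstrated at a smaller scale with the standard library alone; this sketch (my own, Linux-only since it reads /proc) shows that an anonymous mapping grows VmSize immediately, while VmRSS only grows once the pages are actually dirtied:

```python
import mmap
import re

def vm_kb(field):
    # Read a VmSize/VmRSS entry (in kB) for this process from /proc.
    # Linux-only; field names as documented in proc(5).
    with open("/proc/self/status") as f:
        return int(re.search(rf"^{field}:\s+(\d+) kB", f.read(), re.M).group(1))

size0, rss0 = vm_kb("VmSize"), vm_kb("VmRSS")
buf = mmap.mmap(-1, 256 * 1024 * 1024)   # reserve 256 MB, pages untouched
size1, rss1 = vm_kb("VmSize"), vm_kb("VmRSS")
for off in range(0, len(buf), 4096):     # dirty one byte per page
    buf[off] = 1
rss2 = vm_kb("VmRSS")

print(size1 - size0)  # virtual size jumps by ~256 MB right away
print(rss1 - rss0)    # resident size barely moves at allocation time
print(rss2 - rss1)    # resident size catches up once pages are written
```

This is exactly the gap between 'size (kb)' (reservation) and '%mem' (resident usage) in the np.empty example.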
I've adapted the README a bit (development branch only). Can you have a look whether that would clarify the issues you had?
Thanks.
Looks good, thanks!
Hello Geert Jan,
Though not explicitly stated in the README, one somewhat expects the "size (kb)" and "%mem" columns to be related as: size (kb) ≈ (%mem / 100) × total memory (kb).
This relation is also implied in this document around page 105: https://hpcugent.github.io/vsc_user_docs/pdf/intro-HPC-linux-leuven.pdf
However, this does not always seem to hold. At least, with this little Python script, I get the following output on a machine with 192 GB of memory:
Based on the relationship above, one would expect a "size (kb)" value of about 192 × 0.004 × 1000 × 1000 = 768000. The listed value of 1975180, however, is about 2.5 times higher than this.
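Writing that arithmetic out (my restatement, using the 0.4 % figure implied by the calculation above):

```python
total_mem_gb = 192   # physical memory of the machine
pct_mem = 0.4        # the '%mem' value reported for the process

# Expected 'size (kb)' if the two columns measured the same quantity:
expected_size_kb = total_mem_gb * (pct_mem / 100) * 1000 * 1000
reported_size_kb = 1975180
ratio = reported_size_kb / expected_size_kb
print(expected_size_kb, ratio)
```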
Could it perhaps be related to the difference between 'virtual' and 'resident' memory use? I.e. that the 'size (kb)' column refers to virtual memory, while the '%mem' column is based on the resident memory usage? At least, that seems to be consistent with the output from top:
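The virtual vs. resident distinction can also be checked without top; a small sketch (my own, assuming a procps-style ps) comparing the two figures for the current process:

```python
import os
import subprocess

# VSZ is the virtual size, RSS the resident set size, both in kB.
# Resident memory is the subset of virtual memory actually backed by RAM,
# so RSS can never exceed VSZ.
out = subprocess.run(
    ["ps", "-o", "vsz=,rss=", "-p", str(os.getpid())],
    capture_output=True, text=True, check=True,
).stdout.split()
vsz_kb, rss_kb = int(out[0]), int(out[1])
print(vsz_kb, rss_kb)
```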