Open NathanielMiddleton opened 2 years ago
Noting here that the "free" memory column currently reports "freemem" rather than "available" memory, which misrepresents how much memory is actually available for jobs landing on nodes.
Example:
Hostname  Partition  Node   Num_CPU  CPUload  Memsize  Freemem  Joblist
n799      partition  maint    0 112     0.06   515466    22191
But that node has 497G of RAM available:
n799 ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:         515466       16546       22183         365      476736      497180
Note: this value actually comes from Slurm itself, which does not track "available" memory. If you agree that this should be changed, the bug ID is 15077.
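To make the gap concrete, here is a minimal Python sketch that parses the `free -m` output quoted above and compares the "free" and "available" columns. The function name `parse_free_mem_row` and the `SAMPLE` constant are hypothetical names made up for illustration; this is not how these tools or Slurm read memory, only a way to show the size of the difference.

```python
def parse_free_mem_row(free_output):
    """Map each column of the 'Mem:' row of `free -m` output to its value in MiB."""
    lines = [ln for ln in free_output.strip().splitlines() if ln.strip()]
    # The header row has no label of its own; the data row starts with "Mem:".
    header = next(ln for ln in lines if "total" in ln).split()
    mem_row = next(ln for ln in lines if ln.lstrip().startswith("Mem:")).split()
    return dict(zip(header, (int(value) for value in mem_row[1:])))


# Hypothetical constant: the `free -m` output for n799 as quoted in this issue.
SAMPLE = """\
              total        used        free      shared  buff/cache   available
Mem:         515466       16546       22183         365      476736      497180
"""

mem = parse_free_mem_row(SAMPLE)
# Memory the kernel counts as available but that does not show up as "free".
hidden_mib = mem["available"] - mem["free"]

print(f"free:      {mem['free']} MiB")
print(f"available: {mem['available']} MiB")
print(f"not shown by 'free': {hidden_mib} MiB")
```

On this node almost all of the difference is page cache (the buff/cache column), which is why a "free"-based column makes the node look nearly full.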
side note: super neat tools you have here! Thank you for spending the time on them!