netniV / cacti-netsnmp-memory

This script template is intended to overcome these shortcomings by fetching all of the available memory data from all known sources (including the standard HOST MIB), and then performing basic arithmetic to fill in any gaps in the data.

getting usedReal greater than memTotalReal #9

Closed: nuno-silva closed this issue 4 years ago

nuno-silva commented 4 years ago

usedReal is getting bigger than memTotalReal, even with little swap usage:

[Graph images: current_b, current_0]

Reverting #6 seems to fix it, so there may be a problem with it:

[Graph images: reverted_b, reverted_0]

Here's some example data from my RPi Zero device:

.1.3.6.1.2.1.25.2.2.0  HOST-RESOURCES-MIB::hrMemorySize.0 = INTEGER: 443076 KBytes
.1.3.6.1.4.1.2021.4.5.0  UCD-SNMP-MIB::memTotalReal.0 = INTEGER: 443076 kB
.1.3.6.1.4.1.2021.4.6.0  UCD-SNMP-MIB::memAvailReal.0 = INTEGER: 29188 kB
.1.3.6.1.4.1.2021.4.3.0  UCD-SNMP-MIB::memTotalSwap.0 = INTEGER: 102396 kB
.1.3.6.1.4.1.2021.4.4.0  UCD-SNMP-MIB::memAvailSwap.0 = INTEGER: 98812 kB
.1.3.6.1.4.1.2021.4.14.0  UCD-SNMP-MIB::memBuffer.0 = INTEGER: 42620 kB
.1.3.6.1.4.1.2021.4.15.0  UCD-SNMP-MIB::memCached.0 = INTEGER: 302084 kB
root@zero:~# cat /proc/meminfo | grep -E "Mem|Swap[TF]|^Cache|Buff"
MemTotal:         443076 kB
MemFree:           29140 kB
MemAvailable:     318788 kB
Buffers:           42620 kB
Cached:           302084 kB
SwapTotal:        102396 kB
SwapFree:          98812 kB

Note that the new OID for totalReal used in #6 gives me exactly the same result, so the issue must be in the new computation of usedReal:

https://github.com/netniV/cacti-netsnmp-memory/blob/f00431317331ee3bb03a961b1050baca311f4716/scripts/ss_netsnmp_memory.php#L116-L117

Indeed, 443076 - (29188 - 42620 - 302084) = 758592, which is greater than my 443076 kB of RAM.

Contrary to https://github.com/netniV/cacti-netsnmp-memory/pull/6#issue-385059158, my availReal is not the result of /proc/meminfo MemAvailable + buffer + cache: 29188 != 318788+42620+302084 = 663492. Rather, memAvailReal seems to be meminfo's MemFree, so the old computation seems just fine.
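As a quick sanity check, the arithmetic above can be reproduced with a small Python sketch. The values are copied from the snmpwalk and /proc/meminfo output in this comment. The function names are hypothetical, and `used_real_free_based` is only an illustrative assumption about treating memAvailReal as MemFree; it is not taken from ss_netsnmp_memory.php.

```python
# Values in kB, copied from the snmpwalk and /proc/meminfo output above.
mem_total_real = 443076      # UCD-SNMP-MIB::memTotalReal.0
mem_avail_real = 29188       # UCD-SNMP-MIB::memAvailReal.0
mem_buffer = 42620           # UCD-SNMP-MIB::memBuffer.0
mem_cached = 302084          # UCD-SNMP-MIB::memCached.0
meminfo_available = 318788   # /proc/meminfo MemAvailable


def used_real_pr6(total, avail, buffer, cached):
    """Expression quoted above as the usedReal computation from #6."""
    return total - (avail - buffer - cached)


def used_real_free_based(total, avail, buffer, cached):
    """Hypothetical alternative (an assumption, not the script's code):
    treat memAvailReal as MemFree and count buffers and cache as reclaimable."""
    return total - avail - buffer - cached


pr6 = used_real_pr6(mem_total_real, mem_avail_real, mem_buffer, mem_cached)
alt = used_real_free_based(mem_total_real, mem_avail_real, mem_buffer, mem_cached)
assumed_avail = meminfo_available + mem_buffer + mem_cached

print(f"usedReal per #6 expression: {pr6} kB (exceeds total: {pr6 > mem_total_real})")
print(f"usedReal, free-based sketch: {alt} kB (within total: {0 <= alt <= mem_total_real})")
print(f"MemAvailable + buffer + cache: {assumed_avail} kB vs memAvailReal {mem_avail_real} kB")
```

With these inputs the #6 expression yields a value larger than total RAM, while the free-based sketch stays within it, which is consistent with memAvailReal behaving like MemFree on this device.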

nuno-silva commented 4 years ago

@gjimenezf can you perhaps try increasing the memory usage on one of your servers and see if this happens to you?

nuno-silva commented 4 years ago

Here's an example from another cacti instance:

[Images: hive graph; hive htop at 3:24]

netniV commented 4 years ago

So does this mean #6 needs reverting?

nuno-silva commented 4 years ago

Yes, at least partially. I would wait for @gjimenezf's input, though. Maybe he's running some shiny new kernel or snmpd that changed the way the values are reported, and if that's the case, more changes are needed.

netniV commented 4 years ago

We have had no feedback from @gjimenezf, so I guess it's our call, @nuno-silva?

nuno-silva commented 4 years ago

Yes, please revert #6.

Both the UCD-SNMP-MIB::memTotalReal.0 (.1.3.6.1.4.1.2021.4.5.0) and HOST-RESOURCES-MIB::hrMemorySize.0 (.1.3.6.1.2.1.25.2.2.0) OIDs give the same value on all my hosts, so I think you can keep using memTotalReal for consistency with the other ones and only revert the bad calculations.

netniV commented 4 years ago

I have now reverted c3feb030e865c3da1ec0f76bdf5b3f84d1a2a3ec, so 29a27a0 should show as expected. Unless you also want memCached subtracted rather than added? But that seems to be the bug in his change.

nuno-silva commented 4 years ago

29a27a0 looks good. Thanks!