Open allaire opened 6 years ago
On the previous version I compiled myself (with the fix for #148), when I run sudo inspeqtorctl status I also see 100%, but no "!" or alerts 🤔
I suspect you might be right; it's possible this is a coercion issue. I tried to fix a bunch of lint warnings and might have broken something in doing so. Can you supply a failing test?
@mperham the weird thing is that the previous version I have in production (https://github.com/mperham/inspeqtor/commit/42e9f59246cde2bbbce8eea8d9b82c29f04e6b2d) is also showing swap 100%, but it's not in "alert mode" (!). Two unrelated issues?
FWIW, sudo sysctl -n vm.swapusage in my case returns "sysctl: cannot stat /proc/sys/vm/swapusage: No such file or directory".
agendrix@app-01:/proc/sys/vm$ ls
admin_reserve_kbytes laptop_mode oom_dump_tasks
block_dump legacy_va_layout oom_kill_allocating_task
compact_memory lowmem_reserve_ratio overcommit_kbytes
compact_unevictable_allowed max_map_count overcommit_memory
dirty_background_bytes memory_failure_early_kill overcommit_ratio
dirty_background_ratio memory_failure_recovery page-cluster
dirty_bytes min_free_kbytes panic_on_oom
dirty_expire_centisecs min_slab_ratio percpu_pagelist_fraction
dirty_ratio min_unmapped_ratio stat_interval
dirtytime_expire_seconds mmap_min_addr swappiness
dirty_writeback_centisecs nr_hugepages user_reserve_kbytes
drop_caches nr_hugepages_mempolicy vfs_cache_pressure
extfrag_threshold nr_overcommit_hugepages zone_reclaim_mode
hugepages_treat_as_movable nr_pdflush_threads
hugetlb_shm_group numa_zonelist_order
Ah, I don't have any swap configured (see first screenshot). Maybe inspeqtor is not correctly handling the case where it can't read the swapusage file, and reports 100% instead?
Inspeqtor reads the SwapFree and SwapTotal attributes in /proc/meminfo.
	free := memMetrics["SwapFree"]
	total := memMetrics["SwapTotal"]
	if free == 0 {
		hs.Save("swap", "", 100)
	} else if free == total {
		hs.Save("swap", "", 0)
	} else {
		hs.Save("swap", "", float64(100-int8(100*((free)/(total)))))
	}
Wait, that's backwards. "swap" means "swap in use" and so your rule should trigger. If you don't have swap, you should remove the swap rule.
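To make the /proc/meminfo lookup mentioned above concrete, here is a minimal sketch of reading SwapTotal and SwapFree from text in that format. The parseMeminfo helper and the sample values are hypothetical and are not taken from inspeqtor's source; the sketch only assumes the usual "Name: value kB" line layout.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseMeminfo is a hypothetical helper (not inspeqtor's code). It reads
// text in the /proc/meminfo format and returns each field's numeric value.
func parseMeminfo(text string) (map[string]float64, error) {
	metrics := make(map[string]float64)
	scanner := bufio.NewScanner(strings.NewReader(text))
	for scanner.Scan() {
		// A typical line looks like "SwapTotal:       0 kB".
		fields := strings.Fields(scanner.Text())
		if len(fields) < 2 {
			continue // skip blank or malformed lines
		}
		name := strings.TrimSuffix(fields[0], ":")
		value, err := strconv.ParseFloat(fields[1], 64)
		if err != nil {
			return nil, fmt.Errorf("parsing %s: %w", name, err)
		}
		metrics[name] = value
	}
	return metrics, scanner.Err()
}

func main() {
	// Made-up sample modelled on a host with no swap configured.
	sample := "MemTotal:  4045984 kB\nSwapTotal:       0 kB\nSwapFree:        0 kB\n"
	m, err := parseMeminfo(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(m["SwapTotal"], m["SwapFree"])
}
```

On a machine without swap, both fields come back as 0, which is exactly the input that sends the check above into its first branch.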
@mperham Oh well, that explains the 100% then. I'm unsure why there's a discrepancy with the previous version about the alerting. I think if swap is disabled (showing 0 kb), inspeqtor should handle it as Swap 0% instead of Swap 100%, no? Something like:
	free := memMetrics["SwapFree"]
	total := memMetrics["SwapTotal"]
	if free == 0 && total != 0 {
		hs.Save("swap", "", 100)
	} else if free == total {
		hs.Save("swap", "", 0)
	} else {
		hs.Save("swap", "", float64(100-int8(100*((free)/(total)))))
	}
Bit off topic, do you recommend enabling Swap on app (puma) and worker (sidekiq) servers?
Thanks!
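For illustration, the zero-swap handling proposed above can be sketched as a standalone function that is easy to test. The name swapUsedPercent and the float64 parameter types are assumptions made for this sketch, not inspeqtor's API, and unlike the snippet above it does not truncate the result to an integer.

```go
package main

import "fmt"

// swapUsedPercent is a hypothetical helper, not inspeqtor's implementation.
// It returns the percentage of swap in use, treating a host with no swap
// configured (total == 0) as 0% used rather than 100%.
func swapUsedPercent(free, total float64) float64 {
	if total <= 0 {
		// No swap configured: nothing can be in use.
		return 0
	}
	used := total - free
	return 100 * used / total
}

func main() {
	fmt.Println(swapUsedPercent(0, 0))       // no swap configured
	fmt.Println(swapUsedPercent(0, 1024))    // swap exhausted
	fmt.Println(swapUsedPercent(1024, 1024)) // swap untouched
	fmt.Println(swapUsedPercent(512, 1024))  // half of swap in use
}
```

Checking total first keeps the "no swap" case separate from the "swap exhausted" case, which is the distinction the original free == 0 test loses.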
I see what you are saying. Hmm.
I'd recommend swap on every machine along with an alert if you ever use it. The alternative is the Linux OOM handler killing random processes.
We use stock Ubuntu images on AWS, and by default these machines have no swap. We use this when installing inspeqtor to disable the swap rule, if it's useful:
# We have no swap
sed -i "s/if swap/#if swap/" /etc/inspeqtor/host.inq
I just upgraded our staging environment to the latest Inspeqtor version (2.0) and swap is always 100% under inspeqtor, when in fact it's 0:
I suspect this commit could have broken something? https://github.com/mperham/inspeqtor/commit/41680d7528fd6a63bc885d5159fdb56e478d6cc7#diff-c187630e5b94e88c6486c75cccb5a092L25