Xarno opened 1 year ago
I found a workaround: add the swap size to the memory size before it is sent to the Elastic Beanstalk service.
Edit /opt/elasticbeanstalk/lib/ruby/lib/ruby/gems/2.6.0/gems/healthd-sysstat-1.0.3-universal-linux/lib/healthd-sysstat/plugin.rb (see the "Change Start" marker below):
```ruby
require 'healthd/daemon/plugins/fixed_interval_base'
require 'healthd/daemon/logger'
require 'executor'

module Healthd
  module Plugins
    module Sysstat
      class Plugin < Daemon::Plugins::FixedIntervalBase
        namespace 'system'

        include Executor

        @@loadavg_path = '/proc/loadavg'
        @@stat_path = '/proc/stat'
        @@meminfo_path = '/proc/meminfo'
        @@cpuinfo_path = '/proc/cpuinfo'
        @@diskspace_refresh_after = 300
        @@cpuinfo_regexp = /^processor\s*:/
        @@meminfo_regexp = /([\w\(\)]+):\s+([0-9]+)/
        @@meminfo_keys = {
          'MemTotal'     => 'mem_total',
          'MemAvailable' => 'mem_available',
          'MemFree'      => 'mem_free',
          'Buffers'      => 'buffers',
          'Cached'       => 'cached',
          'SwapCached'   => 'swap_cached',
          'SwapTotal'    => 'swap_total',
          'SwapFree'     => 'swap_free'
        }
        @@pid_name_regexp = /.*\/(.*)\.pid$/

        def setup
          # initialize cpu_usage
          cpu_usage
        end

        def snapshot
          data = {}
          data = loadavg data
          data = cpu_usage data
          data = disk_space data
          data = meminfo data
          data = processor_count data
          data = pids data
          data
        end

        private
        def loadavg(data={})
          h = {}
          h['1'],
          h['5'],
          h['15'] = File.read(@@loadavg_path).split.first(3).collect { |i| i.to_f.round 2 }

          data['loadavg'] = h
          data
        end

        private
        def cpu_usage(data={})
          h = {}
          h['user'],
          h['nice'],
          h['system'],
          h['idle'],
          h['iowait'],
          h['irq'],
          h['softirq'] = File.read(@@stat_path).each_line.first.split.drop(1).collect(&:to_i)

          delta = h.merge @cpu_usage do |key, current, previous|
            current - previous
          end if @cpu_usage
          @cpu_usage = h

          data['cpu_usage'] = delta
          data
        end

        private
        def disk_space(data={})
          @diskspace_at ||= Time.at 0
          @diskspace ||= nil

          if Time.now - @diskspace_at > @@diskspace_refresh_after
            if stats = fs_stats
              @diskspace_at = Time.now
              @diskspace = stats
            end
          end
          raise "diskspace statistics not available" unless @diskspace

          data['disk_space'] = { '/' => @diskspace }
          data
        end

        private
        def fs_stats
          output = sh %[stat --file-system --format "%s %b %a" /]

          h = {}
          h['block_size'],
          h['block_count'],
          h['free_blocks'] = output.split.collect(&:to_i)

          if h.values.count(&:itself) != 3
            Daemon::Logger.warn "invalid filesystem statistics. output: #{output}"
            nil
          else
            h
          end
        rescue Executor::NonZeroExitStatus => e
          Daemon::Logger.warn "could not fetch filesystem statistics. exit status: #{e.exit_code}, message: #{e.message}"
          nil
        end

        private
        def meminfo(data={})
          raw = File.read(@@meminfo_path)

          h = raw.each_line.first(20).inject({}) do |h, line|
            _, key, value = line.match(@@meminfo_regexp).to_a
            value = value.to_i

            h[@@meminfo_keys[key]] = value if @@meminfo_keys.include? key
            h
          end

          # << Change Start >>
          # Trick Elastic Beanstalk to count swap as mem
          h['mem_total'] = h['mem_total'] + h['swap_total']
          h['mem_available'] = h['mem_available'] + h['swap_free']
          h['mem_free'] = h['mem_free'] + h['swap_free']
          # << Change End >>

          data['meminfo'] = h
          data
        end

        private
        def processor_count(data={})
          @processor_count ||= begin
            cpuinfo = File.read(@@cpuinfo_path) rescue nil
            count = cpuinfo.scan(@@cpuinfo_regexp).count
            count if count > 0
          end

          data['processor_count'] = @processor_count if @processor_count
          data
        end

        private
        def pids(data={})
          @pid_name_cache ||= {}

          h = Dir.glob("#{options.beanstalk_base_path}/*.pid").inject({}) do |h, path|
            name = @pid_name_cache[path]
            unless name
              name = path[@@pid_name_regexp, 1]
              @pid_name_cache[path] = name
            end

            h[name] = running? path
            h
          end

          data['service_status'] = h
          data
        end

        private
        def running?(path)
          pid = if File.exists? path
            contents = File.read(path)
            return false if contents.empty?

            contents.to_i
          end

          case
          when pid && ( Process.getpgid pid rescue nil )
            true
          when pid
            false
          else
            nil
          end
        end
      end
    end
  end
end
```
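To check what the patched `meminfo` method would now hand to healthd, a standalone check like the one below can be run with plain `ruby` on the instance. This is my own sketch (not part of the platform); it reuses the plugin's regexp, key map, and the same "Change Start" adjustment:

```ruby
# Standalone sketch: parse /proc/meminfo the same way the plugin does (values in kB)
# and apply the swap adjustment, so you can see what would be reported.
MEMINFO_REGEXP = /([\w\(\)]+):\s+([0-9]+)/
KEYS = {
  'MemTotal'  => 'mem_total',  'MemAvailable' => 'mem_available', 'MemFree' => 'mem_free',
  'SwapTotal' => 'swap_total', 'SwapFree'     => 'swap_free'
}

h = File.read('/proc/meminfo').each_line.first(20).inject({}) do |acc, line|
  _, key, value = line.match(MEMINFO_REGEXP).to_a
  acc[KEYS[key]] = value.to_i if KEYS.include? key
  acc
end

# Same adjustment as in the "Change Start" block above
h['mem_total']     += h['swap_total']
h['mem_available'] += h['swap_free']
h['mem_free']      += h['swap_free']

puts h.inspect
```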
It's the same problem with the new ZRAM feature: https://docs.aws.amazon.com/linux/al2023/release-notes/relnotes-2023.2.20230920.html
I get a "97% Memory in Use" warning even though nearly 50% of the ZRAM swap is still free (a rough calculation follows the output below):
```
top - 15:55:08 up 15:46,  0 users,  load average: 0.16, 0.29, 0.33
Tasks: 146 total,   2 running, 144 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.7 us,  0.2 sy,  0.0 ni, 99.0 id,  0.2 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   1847.3 total,     57.3 free,   1672.8 used,    117.3 buff/cache
MiB Swap:   1847.0 total,    869.2 free,    977.8 used.     33.7 avail Mem

sh-5.2$ lsblk
NAME          MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
zram0         252:0    0 1.8G  0 disk [SWAP]
nvme0n1       259:0    0   8G  0 disk
├─nvme0n1p1   259:1    0   8G  0 part /
└─nvme0n1p128 259:2    0  10M  0 part /boot/efi

sh-5.2$ swapon
NAME       TYPE      SIZE   USED PRIO
/dev/zram0 partition 1.8G 987.2M  100

sh-5.2$ zramctl
NAME       ALGORITHM DISKSIZE   DATA  COMPR  TOTAL STREAMS MOUNTPOINT
/dev/zram0 lzo-rle       1.8G 949.9M 439.6M 458.3M       2 [SWAP]
```
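For reference, a rough calculation from the `top` output above. This assumes the reported "Memory in Use" percentage is derived roughly as (total - available) / total, which is my assumption rather than confirmed behaviour, but it shows how different the number looks once swap is counted:

```ruby
# Figures in MiB, copied from the top output above.
mem_total,  mem_available = 1847.3, 33.7
swap_total, swap_free     = 1847.0, 869.2

ram_only  = (mem_total - mem_available) / mem_total
with_swap = (mem_total + swap_total - mem_available - swap_free) /
            (mem_total + swap_total)

puts "RAM only:  #{(ram_only  * 100).round(1)}%"   # ~98.2% -> matches the ~97% warning
puts "With swap: #{(with_swap * 100).round(1)}%"   # ~75.6% -> far from critical
```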
**Tell us about your request**
Currently, if I set up swap space, it is not taken into account when reporting high memory usage; the status of the instance goes to Degraded.
In reality the instance still has about 50% of its swap space free.
If the system has swap configured, the memory warning should take it into account. The status level should then be Info at most, not Degraded; only when both memory and swap are full should the status level become Degraded. A rough sketch of this logic follows the supporting argument below.
Supporting argument: if system or swap activity exhausts the disk I/O pool, there is already a separate warning in place, so you would still see when the system is actually degraded.
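To make the proposal concrete, here is a minimal sketch of how the health evaluation could account for swap. This is purely illustrative logic of my own, not Elastic Beanstalk's actual health code; the field names follow /proc/meminfo and the 95% threshold is an arbitrary example:

```ruby
# Illustrative only: classify memory pressure using RAM + swap together.
def memory_status(meminfo, warn_at: 0.95)
  total     = meminfo['MemTotal']     + meminfo['SwapTotal']
  available = meminfo['MemAvailable'] + meminfo['SwapFree']
  in_use    = 1.0 - available.to_f / total

  if in_use < warn_at
    :ok        # enough RAM + swap headroom
  elsif meminfo['SwapFree'] > 0
    :info      # RAM is tight, but swap still has room -> Info, not Degraded
  else
    :degraded  # both memory and swap are exhausted
  end
end
```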
**Is this request specific to an Elastic Beanstalk platform?**
No.
**Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?**
I want to use the instance to its fullest, so I give my app a large portion of the RAM (3.3 GB out of 4 GB in this case). That leaves the host OS with roughly 700 MB. To avoid being bitten by the kernel OOM killer, I set up a swap partition so the kernel has room for its memory management.
Are you currently working around this issue? No Workaround known.
/opt/elasticbeanstalk/lib/ruby/lib/ruby/gems/2.6.0/gems/healthd-sysstat-1.0.3-universal-linux/lib/healthd-sysstat/plugin.rb Already reports swap space to the elastic beanstalk service but I could not get warning about full swap, even when tried with https://unix.stackexchange.com/a/254976/525725