timdaman / check_docker

Nagios plugin to check docker containers
GNU General Public License v3.0

DockerEngine 20.10.10: KeyError: 'total_cache' #81

Open jensritter opened 2 years ago

jensritter commented 2 years ago

After updating from Debian buster(10) to bullseye(11) I am getting KeyError: 'total_cache'.

installed Docker: 5:20.10.10~3-0~debian-buster
installed check_docker: 2.2.2

Exception:

   "check_docker", line 541, in check_memory
     adjusted_usage = inspection['memory_stats']['usage'] - inspection['memory_stats']['stats']['total_cache']
   KeyError: 'total_cache'
   UNKNOWN: Exception raised during check': KeyError('total_cache')

inspection.json.gz

My current "fix" is to skip the adjusted_usage, and only use inspection['memory_stats']['usage']
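
A minimal sketch of a softer variant of this workaround: subtract `total_cache` when the stats payload still has it, and fall back to the raw `usage` value when it does not. The function name `memory_usage_bytes` and the sample dictionaries are hypothetical and not part of check_docker.

```python
def memory_usage_bytes(inspection):
    """Return memory usage in bytes, minus 'total_cache' when that key exists."""
    memory_stats = inspection['memory_stats']
    # Assumption: a missing 'total_cache' is treated as zero cache, which
    # reduces to the "only use usage" workaround described above.
    cache = memory_stats.get('stats', {}).get('total_cache', 0)
    return memory_stats['usage'] - cache


# Hypothetical payloads, trimmed down for illustration only.
without_total_cache = {'memory_stats': {'usage': 1000, 'stats': {'inactive_file': 200}}}
with_total_cache = {'memory_stats': {'usage': 1000, 'stats': {'total_cache': 300}}}

print(memory_usage_bytes(without_total_cache))
print(memory_usage_bytes(with_total_cache))
```

When the key is absent this reports more memory than `docker stats` shows, because no cache is subtracted at all.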

hessijames79 commented 2 years ago

Same here :/ Seems to be related to this issue: https://github.com/hashicorp/nomad/issues/10251

hessijames79 commented 2 years ago

Just saw there is a pull request: https://github.com/timdaman/check_docker/pull/82

blue212121 commented 1 year ago

Dirty fix with limited/no testing, for Docker version 20.10.12, build 20.10.12-0ubuntu4:

Change 'total_cache' to 'inactive_file' on line 523. The result seems to be the same.

before:

   510  # Checks
   511  #############################################################################################
   512
   513  @multithread_execution()
   514  @require_running(name='memory')
   515  def check_memory(container, thresholds):
   516      if not thresholds.units in unit_adjustments:
   517          unknown("Memory units must be one of  {}".format(list(unit_adjustments.keys())))
   518          return
   519
   520      inspection = get_stats(container)
   521
   522      # Subtracting cache to match what `docker stats` does.
   523      adjusted_usage = inspection['memory_stats']['usage'] - inspection['memory_stats']['stats']['total_cache']
   524      if thresholds.units == '%':
   525          max = 100
   526          usage = int(100 * adjusted_usage / inspection['memory_stats']['limit'])
   527      else:
   528          max = inspection['memory_stats']['limit'] / unit_adjustments[thresholds.units]
   529          usage = adjusted_usage / unit_adjustments[thresholds.units]
   530
   531      evaluate_numeric_thresholds(container=container, value=usage, thresholds=thresholds, name='memory',
   532                                  short_name='mem', min=0, max=max)
   533
   534

after:

   510  # Checks
   511  #############################################################################################
   512
   513  @multithread_execution()
   514  @require_running(name='memory')
   515  def check_memory(container, thresholds):
   516      if not thresholds.units in unit_adjustments:
   517          unknown("Memory units must be one of  {}".format(list(unit_adjustments.keys())))
   518          return
   519
   520      inspection = get_stats(container)
   521
   522      # Subtracting cache to match what `docker stats` does.
   523      adjusted_usage = inspection['memory_stats']['usage'] - inspection['memory_stats']['stats']['inactive_file']
   524      if thresholds.units == '%':
   525          max = 100
   526          usage = int(100 * adjusted_usage / inspection['memory_stats']['limit'])
   527      else:
   528          max = inspection['memory_stats']['limit'] / unit_adjustments[thresholds.units]
   529          usage = adjusted_usage / unit_adjustments[thresholds.units]
   530
   531      evaluate_numeric_thresholds(container=container, value=usage, thresholds=thresholds, name='memory',
   532                                  short_name='mem', min=0, max=max)
   533
   534
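
The one-line change above swaps one key for another, so it would fail the same way on a host that still reports only `total_cache`. A hedged sketch of a helper that accepts either key is shown here. `CACHE_KEYS` and `adjusted_memory_usage` are hypothetical names, and the key order is an assumption for illustration rather than something confirmed in this thread.

```python
# Candidate cache counters, tried in order. Assumption: 'total_cache' is what
# older payloads carry and 'inactive_file' is what the newer ones carry.
CACHE_KEYS = ('total_cache', 'inactive_file')


def adjusted_memory_usage(memory_stats):
    """Subtract the first available cache counter from 'usage' (bytes)."""
    stats = memory_stats.get('stats', {})
    for key in CACHE_KEYS:
        if key in stats:
            return memory_stats['usage'] - stats[key]
    # No known cache counter present: report raw usage instead of raising.
    return memory_stats['usage']


# Hypothetical payloads for illustration only.
old_style = {'usage': 1000, 'stats': {'total_cache': 300}}
new_style = {'usage': 1000, 'stats': {'inactive_file': 200}}

print(adjusted_memory_usage(old_style))
print(adjusted_memory_usage(new_style))
```

Inside `check_memory`, the assignment on line 523 could then become `adjusted_usage = adjusted_memory_usage(inspection['memory_stats'])`, assuming the helper is defined in the same file.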

jensritter commented 1 year ago

This looks good.

Here are some test values from my system with 5:20.10.6~3-0~debian-bullseye:

"docker stats":

   CONTAINER ID   NAME    CPU %   MEM USAGE / LIMIT     MEM %    NET I/O       BLOCK I/O        PIDS
   563e5e2ba213   sonar   1.35%   3.054GiB / 13.66GiB   22.35%   8GB / 8.1GB   730MB / 7.39GB   241

   'memory_stats': {
       'usage': 3598192640,
       'stats': {
           'active_anon': 0,
           'active_file': 301445120,
           'anon': 2934972416,
           'anon_thp': 2768240640,
           'file': 621096960,
           'file_dirty': 0,
           'file_mapped': 41631744,
           'file_writeback': 0,
           'inactive_anon': 2940768256,
           'inactive_file': 319369216,
           'kernel_stack': 3932160,
           'pgactivate': 122199,
           'pgdeactivate': 1008,
           'pgfault': 3038112,
           'pglazyfree': 0,
           'pglazyfreed': 0,
           'pgmajfault': 594,
           'pgrefill': 1437,
           'pgscan': 39214,
           'pgsteal': 27354,
           'shmem': 0,
           'slab': 23162272,
           'slab_reclaimable': 20154376,
           'slab_unreclaimable': 3007896,
           'sock': 114688,
           'thp_collapse_alloc': 1089,
           'thp_fault_alloc': 2442,
           'unevictable': 0,
           'workingset_activate': 0,
           'workingset_nodereclaim': 0,
           'workingset_refault': 0
       },
       'limit': 14671679488
   }
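
As a quick check, the arithmetic can be redone by hand from the `usage`, `inactive_file` and `limit` values quoted above. This is only a sketch: the variable names are illustrative, and the conversion assumes that `docker stats` prints GiB as bytes divided by 1024^3.

```python
# Values copied from the memory_stats payload above.
usage = 3598192640
inactive_file = 319369216
limit = 14671679488

GIB = 1024 ** 3  # assumption: GiB means 2**30 bytes

adjusted = usage - inactive_file      # bytes in use after subtracting inactive_file
adjusted_gib = adjusted / GIB
limit_gib = limit / GIB
percent = 100 * adjusted / limit

print(f"{adjusted_gib:.3f}GiB / {limit_gib:.2f}GiB  {percent:.2f}%")
# prints: 3.054GiB / 13.66GiB  22.35%
```

This matches the `MEM USAGE / LIMIT` and `MEM %` columns of the `docker stats` line for this container. With `%` units, `check_memory` truncates the percentage with `int(...)`, so it would report 22.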