sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[SmartSwitch] monit container_checker is failing when there are dedicated DPU databases #19295

Open vivekrnv opened 2 months ago

vivekrnv commented 2 months ago

Description

Enabling the DPU database containers on the NPU side of a SmartSwitch causes monit's container_checker to start reporting errors.

Steps to reproduce the issue:

root@r-smartswitch-01:/home/admin# monit status
Monit 5.20.0 uptime: 2h 40m

Program 'container_checker'
  status                       Status failed
  monitoring status            Monitored
  monitoring mode              active
  on reboot                    start
  last exit value              4
  last output                  Failed to get image 'docker-sonic-telemetry'. Error: '404 Client Error for http+docker://localhost/v1.43/images/docker-sonic-telemetry/json: Not Found ("No such image: docker-sonic-telemetry:latest")'
                               Unexpected running containers: databasedpu0, databasedpu1, databasedpu3, databasedpu2
  data collected               Thu, 13 Jun 2024 03:38:32

container_checker must be updated to handle these DPU database containers.
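
For illustration only, here is a rough sketch of the kind of filtering that could stop the "Unexpected running containers" message. It is not the actual container_checker code: the function name `find_unexpected_containers` and the regex are hypothetical, and the `databasedpu<N>` naming is assumed purely from the monit output above.

```python
import re

# Assumption: DPU database containers are named "databasedpu<N>",
# as seen in the monit output (databasedpu0 ... databasedpu3).
DPU_DB_PATTERN = re.compile(r"^databasedpu\d+$")


def find_unexpected_containers(running, expected):
    """Return running containers that are neither expected nor DPU databases.

    Hypothetical helper: `running` and `expected` are iterables of
    container names; the result is a sorted list of names.
    """
    return sorted(
        name
        for name in set(running) - set(expected)
        if not DPU_DB_PATTERN.match(name)
    )


running = {"database", "swss", "databasedpu0", "databasedpu1",
           "databasedpu2", "databasedpu3"}
expected = {"database", "swss"}
print(find_unexpected_containers(running, expected))  # prints []
```

A name-pattern match is only the simplest option; a proper fix would probably derive the expected DPU database names from the platform's DPU configuration so that stray containers with similar names are still flagged.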

vivekrnv commented 2 months ago

@prgeor @Pterosaur Please take a look

prgeor commented 2 months ago

ack