canonical / hardware-observer-operator

A charm to set up Prometheus exporters for IPMI, Redfish and RAID devices from different vendors.
Apache License 2.0

hook failed: "install" #156

Closed dstathis closed 1 month ago

dstathis commented 5 months ago

When I deploy using my setup script, the hardware-observer unit related to zookeeper (hob/0) always ends up in an error state.

dylan@protostar:~/repos/juju-dev-machine$ juju status --relations
Model    Controller  Cloud/Region         Version  SLA          Timestamp
machine  machine     localhost/localhost  3.3.1    unsupported  12:01:54Z

SAAS        Status  Store  URL
grafana     active  k8s    admin/lma.grafana-dashboards
loki        active  k8s    admin/lma.loki-logging
prometheus  active  k8s    admin/lma.prometheus-receive-remote-write

App        Version  Status  Scale  Charm              Channel  Rev  Exposed  Message
agent               active      2  grafana-agent      edge      37  no       
cp         n/a      active      1  cos-proxy          edge      58  no       
hob                 error       2  hardware-observer  edge      29  no       hook failed: "install"
kafka               active      1  kafka              3/edge   149  no       machine system settings are not optimal - see logs for info
zookeeper           active      1  zookeeper          3/edge   117  no       

Unit          Workload  Agent  Machine  Public address  Ports  Message
cp/1*         active    idle   3        10.94.42.104           
kafka/0*      active    idle   1        10.94.42.227           machine system settings are not optimal - see logs for info
  agent/1     active    idle            10.94.42.227           
  hob/1       active    idle            10.94.42.227           Unit is ready
zookeeper/0*  active    idle   0        10.94.42.239           
  agent/0*    active    idle            10.94.42.239           
  hob/0*      error     idle            10.94.42.239           hook failed: "install"

Machine  State    Address       Inst id        Base          AZ  Message
0        started  10.94.42.239  juju-18bc5f-0  ubuntu@22.04      Running
1        started  10.94.42.227  juju-18bc5f-1  ubuntu@22.04      Running
3        started  10.94.42.104  juju-18bc5f-3  ubuntu@22.04      Running

Integration provider               Requirer                   Interface                Type         Message
agent:grafana-dashboards-provider  grafana:grafana-dashboard  grafana_dashboard        regular      
agent:peers                        agent:peers                grafana_agent_replica    peer         
hob:cos-agent                      agent:cos-agent            cos_agent                subordinate  
kafka:cluster                      kafka:cluster              cluster                  peer         
kafka:cos-agent                    agent:cos-agent            cos_agent                subordinate  
kafka:juju-info                    hob:general-info           juju-info                subordinate  
kafka:restart                      kafka:restart              rolling_op               peer         
kafka:upgrade                      kafka:upgrade              upgrade                  peer         
loki:logging                       agent:logging-consumer     loki_push_api            regular      
prometheus:receive-remote-write    agent:send-remote-write    prometheus_remote_write  regular      
zookeeper:cluster                  zookeeper:cluster          cluster                  peer         
zookeeper:cos-agent                agent:cos-agent            cos_agent                subordinate  
zookeeper:juju-info                hob:general-info           juju-info                subordinate  
zookeeper:restart                  zookeeper:restart          rolling_op               peer         
zookeeper:upgrade                  zookeeper:upgrade          upgrade                  peer         
zookeeper:zookeeper                kafka:zookeeper            zookeeper                regular

The error is due to a failure to install ipmitool, as can be seen in the logs here:

unit-hob-0: 12:02:18 ERROR unit.hob/0.juju-log Uncaught exception while in charm code:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-hob-0/charm/./src/charm.py", line 294, in <module>
    ops.main(HardwareObserverCharm)  # type: ignore
  File "/var/lib/juju/agents/unit-hob-0/charm/venv/ops/main.py", line 451, in __call__
    return main(charm_class, use_juju_for_storage=use_juju_for_storage)
  File "/var/lib/juju/agents/unit-hob-0/charm/venv/ops/main.py", line 434, in main
    framework.reemit()
  File "/var/lib/juju/agents/unit-hob-0/charm/venv/ops/framework.py", line 863, in reemit
    self._reemit()
  File "/var/lib/juju/agents/unit-hob-0/charm/venv/ops/framework.py", line 942, in _reemit
    custom_handler(event)
  File "/var/lib/juju/agents/unit-hob-0/charm/./src/charm.py", line 70, in _on_install_or_upgrade
    resource_installed, msg = self.hw_tool_helper.install(self.model.resources)
  File "/var/lib/juju/agents/unit-hob-0/charm/src/hw_tools.py", line 533, in install
    hw_white_list = get_hw_tool_white_list()
  File "/var/lib/juju/agents/unit-hob-0/charm/src/hw_tools.py", line 472, in get_hw_tool_white_list
    bmc_white_list = bmc_hw_verifier()
  File "/var/lib/juju/agents/unit-hob-0/charm/src/hw_tools.py", line 459, in bmc_hw_verifier
    if redfish_available():
  File "/var/lib/juju/agents/unit-hob-0/charm/src/hw_tools.py", line 398, in redfish_available
    bmc_address = get_bmc_address()
  File "/var/lib/juju/agents/unit-hob-0/charm/src/hardware.py", line 58, in get_bmc_address
    apt.add_package("ipmitool", update_cache=False)
  File "/var/lib/juju/agents/unit-hob-0/charm/lib/charms/operator_libs_linux/v0/apt.py", line 761, in add_package
    pkg, success = _add(p, version, arch)
  File "/var/lib/juju/agents/unit-hob-0/charm/lib/charms/operator_libs_linux/v0/apt.py", line 802, in _add
    pkg.ensure(state=PackageState.Present)
  File "/var/lib/juju/agents/unit-hob-0/charm/lib/charms/operator_libs_linux/v0/apt.py", line 289, in ensure
    self._add()
  File "/var/lib/juju/agents/unit-hob-0/charm/lib/charms/operator_libs_linux/v0/apt.py", line 261, in _add
    self._apt(
  File "/var/lib/juju/agents/unit-hob-0/charm/lib/charms/operator_libs_linux/v0/apt.py", line 255, in _apt
    raise PackageError(
charms.operator_libs_linux.v0.apt.PackageError: Could not install package(s) [['ipmitool=1.8.18-11ubuntu2.1']]: None
unit-hob-0: 12:02:18 ERROR juju.worker.uniter.operation hook "install" (via hook dispatching script: dispatch) failed: exit status 1
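
In the traceback, the PackageError raised by apt.add_package propagates uncaught out of get_bmc_address, so a failed apt call becomes a failed install hook. Below is a minimal sketch of catching it and returning an (ok, message) pair instead, in the same shape as the `resource_installed, msg` result seen in charm.py. The helper name `install_ipmitool`, the injected `add_package` callable and the stand-in `PackageError` class are hypothetical and exist only to keep the example self-contained. This is not the charm's actual code or its eventual fix.

```python
class PackageError(Exception):
    """Stand-in for charms.operator_libs_linux.v0.apt.PackageError (assumption)."""


def install_ipmitool(add_package):
    """Try to install ipmitool and report failure instead of raising.

    `add_package` is any callable with the same call shape as the
    apt.add_package call in the traceback; it is injected so the sketch
    can run without the charm libraries.
    """
    try:
        add_package("ipmitool", update_cache=False)
    except PackageError as err:
        # Surface the failure to the caller rather than crashing the hook.
        return False, f"Could not install ipmitool: {err}"
    return True, ""


if __name__ == "__main__":
    def broken_apt(name, update_cache=False):
        # Simulates the failing apt call from the log above.
        raise PackageError(f"Could not install package(s) [{name}]")

    print(install_ipmitool(broken_apt))
```

A caller could then set a blocked status with the returned message rather than leaving the unit in an error state. Whether that is the right behaviour for this charm is a design decision for the maintainers.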

When I juju ssh into the unit and try to install ipmitool manually, I see an error as well:

ubuntu@juju-18bc5f-0:~$ sudo apt update
Hit:1 http://archive.ubuntu.com/ubuntu jammy InRelease
Hit:2 http://archive.ubuntu.com/ubuntu jammy-updates InRelease
Hit:3 http://archive.ubuntu.com/ubuntu jammy-backports InRelease
Hit:4 http://security.ubuntu.com/ubuntu jammy-security InRelease
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
All packages are up to date.
ubuntu@juju-18bc5f-0:~$ sudo apt install ipmitool
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
The following additional packages will be installed:
  libopenipmi0 libsensors-config libsensors5 libsnmp-base libsnmp40 openipmi
Suggested packages:
  lm-sensors snmp-mibs-downloader
The following NEW packages will be installed:
  ipmitool libopenipmi0 libsensors-config libsensors5 libsnmp-base libsnmp40 openipmi
0 upgraded, 7 newly installed, 0 to remove and 0 not upgraded.
1 not fully installed or removed.
Need to get 0 B/2402 kB of archives.
After this operation, 8485 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Setting up install-info (6.8-4build1) ...
/usr/sbin/update-info-dir: 2: /etc/environment: -Dzookeeper.requireClientAuthScheme=sasl: not found
dpkg: error processing package install-info (--configure):
 installed install-info package post-installation script subprocess returned error exit status 127
Errors were encountered while processing:
 install-info
needrestart is being skipped since dpkg has failed
E: Sub-process /usr/bin/dpkg returned an error code (1)
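
The "not found" message above suggests the failure is in /etc/environment rather than in ipmitool: while /usr/sbin/update-info-dir reads that file, the shell tries to run -Dzookeeper.requireClientAuthScheme=sasl as a command at line 2. One plausible cause is an unquoted value containing a space, but this is an assumption, since the file's contents are not shown in this report. The sketch below is a rough heuristic for flagging such lines. The function name `find_unsourceable_lines` and the sample contents are hypothetical.

```python
import re
import shlex

# A shell-style assignment: NAME=... with a valid variable name.
ASSIGNMENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*=")


def find_unsourceable_lines(text):
    """Return (line_number, line) pairs that are not a single NAME=value word.

    Heuristic only: a line that splits into several shell words, or whose
    first word is not an assignment, would make a shell that sources the
    file try to run part of the line as a command.
    """
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        try:
            tokens = shlex.split(line, comments=True)
        except ValueError:
            # Unbalanced quotes cannot be parsed at all.
            problems.append((number, line))
            continue
        if not tokens:
            continue  # blank line or comment
        if len(tokens) > 1 or not ASSIGNMENT.match(tokens[0]):
            problems.append((number, line))
    return problems


# Hypothetical file contents, made up to resemble the error above.
SAMPLE = (
    'PATH="/usr/bin:/bin"\n'
    "EXAMPLE_OPTS=-Dfirst=1 -Dzookeeper.requireClientAuthScheme=sasl\n"
)

if __name__ == "__main__":
    for number, line in find_unsourceable_lines(SAMPLE):
        print(f"{number}: {line}")
```

On the affected machine, the same function could be given the text of /etc/environment, for example via `pathlib.Path("/etc/environment").read_text()`, to see which line needs quoting.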

Pjack commented 1 month ago

Does that still happen with the latest version?

Pjack commented 1 month ago

Feel free to reopen it if you still encounter the same issue in the latest version. Thanks!