openshift / installer

Install an OpenShift 4.x cluster
https://try.openshift.com
Apache License 2.0

Baremetal IPI install failed #4542

Closed zhouhao3 closed 3 years ago

zhouhao3 commented 3 years ago

Version

$ openshift-baremetal-install version
openshift-baremetal-install 4.7.0-0.nightly-2021-01-12-203716
built from commit b3dae7f4736bcd1dbf5a1e0ddafa826ee1738d81
release image quay.io/openshift-release-dev/ocp-release-nightly@sha256:c97466158d19a6e6b5563da4365d42ebe5579421b1163f3a2d6778ceb5388aed   

Platform:

baremetal IPI

What happened?

I am trying to deploy IPI on iRMC servers. There is currently no way to specify the inspect_interface of a bare-metal machine, so each node's inspect_interface is set to the default (the Ironic inspector). For iRMC servers, however, inspect_interface needs to be irmc. So after the nodes were created in Ironic, I manually changed inspect_interface from inspector to irmc and started an inspection using irmc. Then this error occurred:

ERROR ironic.conductor.manager AttributeError: type object 'Session' has no attribute 'bmc_hansdlers'

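This looks like just a typo: bmc_hansdlers was presumably meant to be bmc_handlers. According to the traceback below, the error is raised from pyghmi's _mark_broken, which runs after an IPMI request times out.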
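To illustrate why such a typo can go unnoticed until a failure path runs, here is a minimal sketch. It is not pyghmi's code: the Session class below is a toy stand-in, the method names _timedout and _mark_broken are borrowed from the traceback, and the spelling bmc_handlers is an assumption about what the attribute was meant to be called.

```python
class Session:
    """Toy stand-in for an IPMI session class (not pyghmi's real code)."""

    # Assumed correct spelling of the class-level registry.
    bmc_handlers = {}

    def __init__(self, sockaddr, port):
        self.sockaddr = sockaddr
        self.port = port
        Session.bmc_handlers.setdefault(sockaddr, set()).add(port)

    def _mark_broken(self):
        # Misspelled on purpose: Python resolves the attribute name only
        # when this line executes, so the typo stays hidden until a
        # request actually times out.
        if self.port in Session.bmc_hansdlers[self.sockaddr]:
            Session.bmc_hansdlers[self.sockaddr].discard(self.port)

    def _timedout(self):
        self._mark_broken()


def simulate_timeout():
    """Return the message of the exception raised on the timeout path."""
    session = Session(("192.0.2.10", 623), 40000)  # documentation-range address
    try:
        session._timedout()
    except AttributeError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    print(simulate_timeout())
```

The AttributeError replaces whatever timeout error the caller would otherwise have seen, so the underlying IPMI timeout may still need attention once the typo is fixed.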

The relevant log is as follows:

2021-01-13 04:34:37.835 1 DEBUG ironic.conductor.task_manager [-] Successfully released exclusive lock for heartbeat on node 1e69db1a-f9f4-47fe-8617-8868d26c6773 (lock was held 0.07 sec) release_resources /usr/lib/python3.6/site-packages/ironic/conductor/task_manager.py:378
2021-01-13 04:35:08.545 1 DEBUG ironic.common.states [req-b2dc63a4-a4e5-42ef-a90f-a48d1087ebe0 bootstrap-user - - - -] Exiting old state 'inspecting' in response to event 'fail' on_exit /usr/lib/python3.6/site-packages/ironic/common/states.py:295
2021-01-13 04:35:08.546 1 DEBUG ironic.common.states [req-b2dc63a4-a4e5-42ef-a90f-a48d1087ebe0 bootstrap-user - - - -] Entering new state 'inspect failed' in response to event 'fail' on_enter /usr/lib/python3.6/site-packages/ironic/common/states.py:301
2021-01-13 04:35:08.630 1 ERROR ironic.conductor.task_manager [req-b2dc63a4-a4e5-42ef-a90f-a48d1087ebe0 bootstrap-user - - - -] Node 387f926b-9834-4e08-b0dd-485b16ff7573 moved to provision state "inspect failed" from state "inspecting"; target provision state is "manageable"
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager [req-b2dc63a4-a4e5-42ef-a90f-a48d1087ebe0 bootstrap-user - - - -] Failed to inspect node 387f926b-9834-4e08-b0dd-485b16ff7573: Unexpected exception of type AttributeError: type object 'Session' has no attribute 'bmc_hansdlers': AttributeError: type object 'Session' has no attribute 'bmc_hansdlers'
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager Traceback (most recent call last):
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/ironic/conductor/manager.py", line 3741, in _do_inspect_hardware
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     new_state = task.driver.inspect.inspect_hardware(task)
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/ironic_lib/metrics.py", line 59, in wrapped
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     result = f(*args, **kwargs)
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/irmc/inspect.py", line 267, in inspect_hardware
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     **kwargs)
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/ironic/drivers/modules/irmc/inspect.py", line 167, in _inspect_hardware
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     **kwargs)
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/scciclient/irmc/scci.py", line 586, in get_capabilities_properties
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     v['trusted_boot'] = ipmi.get_tpm_status(d_info)
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/scciclient/irmc/ipmi.py", line 95, in get_tpm_status
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     password=d_info['irmc_password'])
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/pyghmi/ipmi/command.py", line 146, in __init__
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     kg=kg)
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/pyghmi/ipmi/private/session.py", line 515, in __init__
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     Session.wait_for_rsp()
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/pyghmi/ipmi/private/session.py", line 1150, in wait_for_rsp
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     session._keepalive()
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/pyghmi/ipmi/private/session.py", line 1205, in _keepalive
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     self.raw_command(netfn=6, command=1, waitall=True)
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/pyghmi/ipmi/private/session.py", line 766, in raw_command
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     self.awaitresponse(retry, waitall)
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/pyghmi/ipmi/private/session.py", line 732, in awaitresponse
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     self._timedout()
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/pyghmi/ipmi/private/session.py", line 1629, in _timedout
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     self._mark_broken()
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/pyghmi/ipmi/private/session.py", line 550, in _mark_broken
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     myport in Session.bmc_hansdlers[sockaddr]):
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager AttributeError: type object 'Session' has no attribute 'bmc_hansdlers'
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager
2021-01-13 04:35:08.755 1 DEBUG ironic.conductor.task_manager [req-b2dc63a4-a4e5-42ef-a90f-a48d1087ebe0 bootstrap-user - - - -] Successfully released exclusive lock for hardware inspection on node 387f926b-9834-4e08-b0dd-485b16ff7573 (lock was held 39.47 sec) release_resources /usr/lib/python3.6/site-packages/ironic/conductor/task_manager.py:378
2021-01-13 04:35:14.213 1 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._sync_local_state' _process_scheduled /usr/lib/python3.6/site-packages/futurist/periodics.py:639
2021-01-13 04:35:14.437 1 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.drivers.modules.pxe_base.PXEBaseMixin._check_boot_timeouts' _process_scheduled /usr/lib/python3.6/site-packages/futurist/periodics.py:639
2021-01-13 04:35:14.447 1 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.drivers.modules.pxe_base.PXEBaseMixin._check_boot_timeouts' _process_scheduled /usr/lib/python3.6/site-packages/futurist/periodics.py:639
2021-01-13 04:35:14.565 1 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._check_cleanwait_timeouts' _process_scheduled /usr/lib/python3.6/site-packages/futurist/periodics.py:639
2021-01-13 04:35:14.592 1 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._check_deploy_timeouts' _process_scheduled /usr/lib/python3.6/site-packages/futurist/periodics.py:639
2021-01-13 04:35:14.599 1 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._check_inspect_wait_timeouts' _process_scheduled /usr/lib/python3.6/site-packages/futurist/periodics.py:639
2021-01-13 04:35:14.613 1 DEBUG futurist.periodics [-] Submitting periodic callback 'ironic.conductor.manager.ConductorManager._check_orphan_allocations' _process_scheduled /usr/lib/python3.6/site-packages/futurist/periodics.py:639
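The traceback in the log above can be easier to follow as a plain call chain. The following is a minimal sketch, assuming the conductor log is available as text; the helper name traceback_frames is hypothetical, and the sample lines are copied from the log above.

```python
import re

# Matches the 'File "...", line N, in func' part of a Python traceback line,
# regardless of the log prefix in front of it.
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')


def traceback_frames(log_text):
    """Return (file name, line number, function) for each traceback frame line."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            filename = match.group("path").rsplit("/", 1)[-1]
            frames.append((filename, int(match.group("line")), match.group("func")))
    return frames


SAMPLE = '''\
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/pyghmi/ipmi/private/session.py", line 732, in awaitresponse
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     self._timedout()
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/pyghmi/ipmi/private/session.py", line 1629, in _timedout
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager     self._mark_broken()
2021-01-13 04:35:08.632 1 ERROR ironic.conductor.manager   File "/usr/lib/python3.6/site-packages/pyghmi/ipmi/private/session.py", line 550, in _mark_broken
'''

if __name__ == "__main__":
    for filename, lineno, func in traceback_frames(SAMPLE):
        print(f"{filename}:{lineno} {func}")
```

Read this way, the last frames are awaitresponse, _timedout and _mark_broken, which suggests the AttributeError is raised while pyghmi is handling an IPMI timeout.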

What you expected to happen?

Inspection succeeds.

How to reproduce it (as minimally and precisely as possible)?

$ mkdir clusterconfigs

$ cp ipi/install-config.yaml clusterconfigs                  

$ openshift-baremetal-install --dir ~/clusterconfigs create manifests  

$ cp ~/ipi/99_router-replicas.yaml ~/clusterconfigs/openshift

$ openshift-baremetal-install --dir ~/clusterconfigs --log-level debug create cluster

After Ironic starts creating nodes in the bootstrap VM, manually set each node's inspect_interface to irmc and start an inspection.
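The issue does not say which tool was used for the manual step. One possible way is sketched below, assuming the openstack baremetal CLI (the python-ironicclient plugin) is installed and already configured to reach the Ironic API in the bootstrap VM. The helper names are hypothetical, and the node UUID is the one from the log above. Ironic normally starts inspection from the manageable state, so the node may need to be moved there first.

```python
import shlex
import subprocess


def inspection_commands(node, interface="irmc"):
    """Build the two CLI invocations: switch the inspect interface, then inspect."""
    return [
        ["openstack", "baremetal", "node", "set", node,
         "--inspect-interface", interface],
        ["openstack", "baremetal", "node", "inspect", node],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # Assumes the CLI is on PATH and credentials are already set up.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run(inspection_commands("387f926b-9834-4e08-b0dd-485b16ff7573"))
```

With dry_run left at its default, the sketch only prints the commands, which makes it safe to review them before touching a live node.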

Anything else we need to know?

Setup environment: the bare-metal hosts are iRMC servers on a single baremetal network, with 3 master nodes and 1 worker node.

The main information of install-config.yaml is as follows:

apiVersion: v1
baseDomain: zz.local
metadata:
  name: openshift
networking:
  machineCIDR: 192.168.66.0/24
  networkType: OVNKubernetes
compute:
- name: worker
  replicas: 1
controlPlane:
  name: master
  replicas: 3
  platform:
    baremetal: {}
platform:
  baremetal:
    apiVIP: 192.168.66.201
    ingressVIP: 192.168.66.202
    provisioningNetwork: "Disabled"
    provisioningHostIP: 192.168.66.252
    bootstrapProvisioningIP: 192.168.66.251
    hosts:
      - name: openshift-master-0
        role: master
        bmc:
          address: irmc://192.168.*.*
          username: username
          password: password

ironic-conductor image version:

quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f8b668f6a69ed481b35523966cf5ed8df8b151d22d62c6d65892dd13f36badf4

ironic-api image version:

quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:f8b668f6a69ed481b35523966cf5ed8df8b151d22d62c6d65892dd13f36badf4


staebler commented 3 years ago

/label platform/baremetal

rht-jniu commented 3 years ago

This might be a known issue in pyghmi: https://opendev.org/x/pyghmi/commit/01ad6710c40be954b0da11fe24b1db31c05797e3
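To check whether a given environment is affected, one option is to look for the misspelled name in the installed pyghmi source. This is only a sketch: the module path pyghmi.ipmi.private.session is taken from the traceback, the helper names are hypothetical, and the check assumes pyghmi is importable where the script runs (for example inside the ironic-conductor container).

```python
import importlib
import inspect

MISSPELLED = "bmc_hansdlers"


def contains_typo(source_text, needle=MISSPELLED):
    """Return True if the misspelled attribute name appears in the source text."""
    return needle in source_text


def installed_pyghmi_has_typo():
    """Return True/False for the installed pyghmi, or None if it cannot be inspected."""
    try:
        module = importlib.import_module("pyghmi.ipmi.private.session")
        return contains_typo(inspect.getsource(module))
    except (ImportError, OSError, TypeError):
        # pyghmi is not installed, or its source is not available.
        return None


if __name__ == "__main__":
    print(installed_pyghmi_has_typo())
```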

zhouhao3 commented 3 years ago

Closing, since https://github.com/openshift/ironic-image/pull/137 fixed it.