metal3-io / baremetal-operator

Bare metal host provisioning integration for Kubernetes
Apache License 2.0

Operational history for BMH is not displayed #177

Closed. yprokule closed this issue 5 years ago

yprokule commented 5 years ago

Problem description

Events about BMH state transitions are not displayed when describing the resource:

oc describe baremetalhost discovered-node-0 -n openshift-machine-api
Name:         discovered-node-0
Namespace:    openshift-machine-api
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"metalkube.org/v1alpha1","kind":"BareMetalHost","metadata":{"annotations":{},"name":"discovered-node-0","namespace":"openshi...
API Version:  metalkube.org/v1alpha1
Kind:         BareMetalHost
Metadata:
  Creation Timestamp:  2019-05-03T09:19:31Z
  Finalizers:
    baremetalhost.metalkube.org
  Generation:        2
  Resource Version:  350682
  Self Link:         /apis/metalkube.org/v1alpha1/namespaces/openshift-machine-api/baremetalhosts/discovered-node-0
  UID:               8f2d0fbb-6d84-11e9-b472-525400a4453e
Spec:
  Bmc:
    Address:           
    Credentials Name:  discovered-node-0-bmc-secret
  Boot MAC Address:    52:54:00:b7:e8:e8
  Hardware Profile:    
  Online:              true
Status:
  Error Message:  Empty BMC address Missing BMC connection detail 'Address'
  Good Credentials:
  Hardware Profile:    
  Last Updated:        2019-05-03T09:19:31Z
  Operational Status:  discovered
  Powered On:          false
  Provisioning:
    ID:  
    Image:
      Checksum:  
      URL:       
    State:       
Events:          <none>

While in baremetal-operator logs:

{"level":"info","ts":1556875171.6777682,"logger":"baremetalhost","msg":"Reconciling BareMetalHost","Request.Namespace":"openshift-machine-api","Request.Name":"discovered-node-0"}
{"level":"info","ts":1556875171.6778715,"logger":"baremetalhost","msg":"adding finalizer","Request.Namespace":"openshift-machine-api","Request.Name":"discovered-node-0","existingFinalizers":[],"newValue":"baremetalhost.metalkube.org"}
{"level":"info","ts":1556875171.6889548,"logger":"baremetalhost","msg":"Reconciling BareMetalHost","Request.Namespace":"openshift-machine-api","Request.Name":"discovered-node-0"}
{"level":"info","ts":1556875171.6891785,"logger":"baremetalhost","msg":"updating owner of secret","Request.Namespace":"openshift-machine-api","Request.Name":"discovered-node-0"}
{"level":"info","ts":1556875171.7072365,"logger":"baremetalhost","msg":"publishing event","reason":"Discovered","message":"Discovered host with unusable BMC details: Empty BMC address Missing BMC connection detail 'Address'"}
{"level":"info","ts":1556875171.726009,"logger":"baremetalhost","msg":"Reconciling BareMetalHost","Request.Namespace":"openshift-machine-api","Request.Name":"discovered-node-0"}

Steps to reproduce

  1. Create a BMH CR with missing BMC details, e.g.:

    ---
    apiVersion: v1
    kind: Secret
    metadata:
      name: discovered-node-0-bmc-secret
    type: Opaque
    data:
      username: YWRtaW4=
      password: cGFzc3dvcmQ=

    # BMC address intentionally left empty to trigger transition
    # to 'Discovered' state
    ---
    apiVersion: metalkube.org/v1alpha1
    kind: BareMetalHost
    metadata:
      name: discovered-node-0
    spec:
      online: true
      bmc:
        address:
        credentialsName: discovered-node-0-bmc-secret
      bootMACAddress: 52:54:00:b7:e8:e8


  2. Apply those resources:

    oc apply -f discovered_node_cr.yaml -n openshift-machine-api

  3. Check the resource's definition:

    oc describe baremetalhost discovered-node-0 -n openshift-machine-api

yprokule commented 5 years ago

/cc @dhellmann @mhrivnak

yprokule commented 5 years ago

Interestingly, for another CR it does display events:

oc describe baremetalhosts external-node-0 -n openshift-machine-api
Name:         external-node-0
Namespace:    openshift-machine-api
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"metalkube.org/v1alpha1","kind":"BareMetalHost","metadata":{"annotations":{},"name":"external-node-0","namespace":"openshift...
API Version:  metalkube.org/v1alpha1
Kind:         BareMetalHost
Metadata:
  Creation Timestamp:  2019-05-03T09:39:54Z
  Finalizers:
    baremetalhost.metalkube.org
  Generation:        2
  Resource Version:  357611
  Self Link:         /apis/metalkube.org/v1alpha1/namespaces/openshift-machine-api/baremetalhosts/external-node-0
  UID:               68300177-6d87-11e9-910b-52540096c081
Spec:
  Bmc:
    Address:           ipmi://192.168.123.1:6230
    Credentials Name:  external-node-0-bmc-secret
  Boot MAC Address:    52:54:00:b6:e7:e7
  Hardware Profile:    
  Image:
    Checksum:  
    URL:       
  Machine Ref:
    Name:       external-node-0
    Namespace:  openshift-machine-api
  Online:       true
Status:
  Error Message:  
  Good Credentials:
    Credentials:
      Name:               external-node-0-bmc-secret
      Namespace:          openshift-machine-api
    Credentials Version:  357587
  Hardware:
    Cpus:
      Speed G Hz:  3
      Type:        x86
    Nics:
      Ip:          192.168.100.1
      Mac:         some:mac:address
      Model:       virt-io
      Name:        nic-1
      Network:     Pod Networking
      Speed Gbps:  1
      Ip:          192.168.100.2
      Mac:         some:other:mac:address
      Model:       e1000
      Name:        nic-2
      Network:     Pod Networking
      Speed Gbps:  1
    Ram Gi B:      128
    Storage:
      Model:           Dell CFJ61
      Name:            disk-1 (boot)
      Size Gi B:       95232
      Type:            SSD
      Model:           Dell CFJ61
      Name:            disk-2
      Size Gi B:       95232
      Type:            SSD
  Hardware Profile:    unknown
  Last Updated:        2019-05-03T09:39:56Z
  Operational Status:  OK
  Powered On:          true
  Provisioning:
    ID:  0d4073a1-28a3-498c-aa8e-1b51281f6ad2
    Image:
      Checksum:  
      URL:       
    State:       ready
Events:
  Type    Reason              Age    From                            Message
  ----    ------              ----   ----                            -------
  Normal  BMCAccessValidated  8m39s  metalkube-baremetal-controller  Verified access to BMC
  Normal  InspectionComplete  8m39s  metalkube-baremetal-controller  Hardware inspection completed
  Normal  ProfileSet          8m39s  metalkube-baremetal-controller  Hardware profile set: unknown

mhrivnak commented 5 years ago

The first example you showed has an error message "Empty BMC address Missing BMC connection detail 'Address'", and it does appear that the BMC.Address field is empty. The operator can't do anything until it has an address, so you're not seeing any events.

dhellmann commented 5 years ago

There should be an event for the discovered host. See https://github.com/metal3-io/baremetal-operator/blob/master/pkg/controller/baremetalhost/baremetalhost_controller.go#L228

I do see the log message from publishing the event. I've seen some errors about duplicate ids for those events, so I suspect I'm not constructing the object correctly and they aren't getting unique names/ids.

yprokule commented 5 years ago

> The first example you showed has an error message "Empty BMC address Missing BMC connection detail 'Address'", and it does appear that the BMC.Address field is empty. The operator can't do anything until it has an address, so you're not seeing any events.

Based on this log entry I expected it would have an entry about discovering host with wrong credentials:

{"level":"info","ts":1556875171.7072365,"logger":"baremetalhost","msg":"publishing event","reason":"Discovered","message":"Discovered host with unusable BMC details: Empty BMC address Missing BMC connection detail 'Address'"}

dhellmann commented 5 years ago

> > The first example you showed has an error message "Empty BMC address Missing BMC connection detail 'Address'", and it does appear that the BMC.Address field is empty. The operator can't do anything until it has an address, so you're not seeing any events.
>
> Based on this log entry I expected it would have an entry about discovering host with wrong credentials:
>
> {"level":"info","ts":1556875171.7072365,"logger":"baremetalhost","msg":"publishing event","reason":"Discovered","message":"Discovered host with unusable BMC details: Empty BMC address Missing BMC connection detail 'Address'"}

Yep, exactly right.