OpenCHAMI / bss

[BUG] BSS can't find MAC when making a bootscripts request #29

Open travisbcotton opened 2 months ago

I'm running the noauth versions of the ochami services.

I'm trying to figure out why a MAC address is showing as unknown in BSS.

Requesting the bootscript for the MAC:

#curl http://ochami-vm:27788/boot/v1/bootscript?mac=a4:bf:01:51:e3:5f
#!ipxe
sleep 10
chain https://api-gw-service-nmn.local/apis/bss/boot/v1/bootscript?mac=a4:bf:01:51:e3:5f&arch=${buildarch}&ts=1714073914

Instead of a boot script, BSS returns the retry iPXE script shown above.

The node has a ComponentEndpoint:

#curl -s http://ochami-vm:27799/hsm/v2/Inventory/ComponentEndpoints/x1008c0s0b35 | jq
{
  "ID": "x1008c0s0b35",
  "Type": "Node",
  "RedfishType": "ComputerSystem",
  "RedfishSubtype": "Physical",
  "UUID": "9745ca2b-2205-403d-aeaa-b7ab195e7c3b",
  "OdataID": "",
  "RedfishEndpointID": "x1008c0s0b35",
  "Enabled": true,
  "RedfishEndpointFQDN": "ba396",
  "RedfishURL": "ba396",
  "ComponentEndpointType": "ComponentEndpointComputerSystem",
  "RedfishSystemInfo": {
    "EthernetNICInfo": [
      {
        "RedfishId": "ba396",
        "@odata.id": "/redfish/v1/Systems/Self/EthernetInterfaces/ba396",
        "InterfaceEnabled": true,
        "MACAddress": "a4:bf:01:51:e3:5f"
      }
    ]
  }
}
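
To confirm that the MAC really is present in this record, the lookup can be sketched as below. This is only an illustration against the JSON shown above (trimmed to the relevant fields); `endpoint_macs` is a hypothetical helper, not a function from BSS or SMD, and it makes no claim about how BSS itself resolves MACs.

```python
import json

# Trimmed copy of the ComponentEndpoint output above.
endpoint_json = """
{
  "ID": "x1008c0s0b35",
  "Type": "Node",
  "RedfishSystemInfo": {
    "EthernetNICInfo": [
      {
        "RedfishId": "ba396",
        "InterfaceEnabled": true,
        "MACAddress": "a4:bf:01:51:e3:5f"
      }
    ]
  }
}
"""


def endpoint_macs(endpoint):
    """Return the lowercased MAC addresses listed under EthernetNICInfo."""
    nics = endpoint.get("RedfishSystemInfo", {}).get("EthernetNICInfo", [])
    return [nic["MACAddress"].lower() for nic in nics if nic.get("MACAddress")]


endpoint = json.loads(endpoint_json)
print("a4:bf:01:51:e3:5f" in endpoint_macs(endpoint))
```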

It has a RedfishEndpoint:

#curl -s http://ochami-vm:27799/hsm/v2/Inventory/RedfishEndpoints/x1008c0s0b35 | jq
{
  "ID": "x1008c0s0b35",
  "Type": "NodeBMC",
  "Name": "ba396",
  "Hostname": "ba396",
  "Domain": "",
  "FQDN": "ba396",
  "Enabled": true,
  "User": "root",
  "Password": "",
  "MACRequired": true,
  "IPAddress": "192.168.10.24",
  "RediscoverOnUpdate": false,
  "DiscoveryInfo": {
    "LastDiscoveryStatus": "NotYetQueried"
  }
}

It has Component entries for the Node and NodeBMC:

    {
      "ID": "x1008c0s0b35n0",
      "Type": "Node",
      "State": "Ready",
      "Flag": "OK",
      "Enabled": true,
      "Role": "Compute",
      "NID": 396,
      "Arch": "X86"
    },
    {
      "ID": "x1008c0s0b35",
      "Type": "Node",
      "Enabled": true
    }
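
The records so far use two different IDs, x1008c0s0b35 and x1008c0s0b35n0. As a rough aid for lining them up, the sketch below groups the records quoted in this report by ID. The `records` list is hand-copied from the output above and `group_by_id` is a hypothetical helper; whether BSS expects these IDs to match is not something this report establishes.

```python
from collections import defaultdict

# (record kind, ID, Type) copied by hand from the output in this report.
records = [
    ("ComponentEndpoint", "x1008c0s0b35", "Node"),
    ("RedfishEndpoint", "x1008c0s0b35", "NodeBMC"),
    ("Component", "x1008c0s0b35n0", "Node"),
    ("Component", "x1008c0s0b35", "Node"),
]


def group_by_id(rows):
    """Map each ID to the list of (kind, type) records that carry it."""
    grouped = defaultdict(list)
    for kind, xname, type_ in rows:
        grouped[xname].append((kind, type_))
    return dict(grouped)


for xname, entries in group_by_id(records).items():
    print(xname, entries)
```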

It also has an EthernetInterface entry:

  {
    "ID": "a4bf0151e35f",
    "Description": "Interface for ba396",
    "MACAddress": "a4:bf:01:51:e3:5f",
    "LastUpdate": "2024-04-25T16:58:15.285244Z",
    "ComponentID": "x1008c0s0b35n0",
    "Type": "Node",
    "IPAddresses": [
      {
        "IPAddress": "192.168.6.24",
        "Network": "NMN"
      }
    ]
  },
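
In the entry above, the ID looks like the MAC address with the colons removed. A minimal sketch of that mapping, assuming the convention holds in general (it is inferred from this single entry, not from SMD documentation); `mac_to_interface_id` is a hypothetical name:

```python
def mac_to_interface_id(mac: str) -> str:
    """Lowercase the MAC and drop ':' and '-' separators."""
    return mac.lower().replace(":", "").replace("-", "")


# Compare the derived ID with the "ID" field of the entry above.
print(mac_to_interface_id("a4:bf:01:51:e3:5f") == "a4bf0151e35f")
```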

Here are the BSS logs for the request:

2024/04/25 18:58:20 [bss-noauth/xAVYWOqEU5-005762] "GET http://192.168.7.252:27788/boot/v1/bootscript?mac=a4:bf:01:51:e3:5f HTTP/1.1" from 192.168.7.253:35628 - 200 138B in 1.869537ms
2024/04/25 18:58:20 BSS request delayed for Unknown MAC a4:bf:01:51:e3:5f (x1008c0s0b35n0) while updating state
2024/04/25 18:58:20 Attempting connection to http://smd-noauth:27779/hsm/v2/Inventory/RedfishEndpoints (attempt 1/10)
2024/04/25 18:58:20 Connected to http://smd-noauth:27779/hsm/v2/Inventory/RedfishEndpoints on attempt 1
2024/04/25 18:58:20 Retrieving state info from http://smd-noauth:27779/hsm/v2
2024/04/25 18:58:21 ERROR sending POST to hmnfd http://cray-hmnfd/hmi/v1/subscribe: Post "http://cray-hmnfd/hmi/v1/subscribe": dial tcp: lookup cray-hmnfd on 127.0.0.11:53: no such host

Here are the SMD logs for the request:

172.18.0.4 - - [25/Apr/2024:18:59:42 +0000] "GET /hsm/v2/Inventory/RedfishEndpoints HTTP/1.1" 200 179910 "" "Go-http-client/1.1"
2024/04/25 18:59:42 [smd-noauth/3dtRpnEDQY-007753] "GET http://smd-noauth:27779/hsm/v2/Inventory/RedfishEndpoints HTTP/1.1" from 172.18.0.4:44876 - 200 179910B in 6.651018ms
172.18.0.4 - - [25/Apr/2024:18:59:42 +0000] "GET /hsm/v2/State/Components?type=Node HTTP/1.1" 200 113129 "" "bss-noauth"
2024/04/25 18:59:42 [smd-noauth/3dtRpnEDQY-007754] "GET http://smd-noauth:27779/hsm/v2/State/Components?type=Node HTTP/1.1" from 172.18.0.4:44892 - 200 113129B in 6.070141ms
172.18.0.4 - - [25/Apr/2024:18:59:42 +0000] "GET /hsm/v2/Inventory/ComponentEndpoints?type=Node HTTP/1.1" 200 328405 "" "bss-noauth"
2024/04/25 18:59:42 [smd-noauth/3dtRpnEDQY-007755] "GET http://smd-noauth:27779/hsm/v2/Inventory/ComponentEndpoints?type=Node HTTP/1.1" from 172.18.0.4:44904 - 200 328405B in 9.091957ms
172.18.0.4 - - [25/Apr/2024:18:59:42 +0000] "GET /hsm/v2/Inventory/EthernetInterfaces?type=Node HTTP/1.1" 200 158407 "" "bss-noauth"
2024/04/25 18:59:42 [smd-noauth/3dtRpnEDQY-007756] "GET http://smd-noauth:27779/hsm/v2/Inventory/EthernetInterfaces?type=Node HTTP/1.1" from 172.18.0.4:44916 - 200 158407B in 4.949154ms

Let me know if there are any other logs or data I should add to the above.