This is a fork of the original SMD code from Cray-HPE/hms-smd, suitable only for experimentation and demo purposes at this point.
While the OpenCHAMI smd daemon is fundamentally the same as the original from HPE, it differs in a few ways:
The /components endpoint supports POST.
It still provides inventory management services for HPC systems based on BMC discovery and enumeration.
The rest of this README is unchanged from the HPE version.
The Shasta Hardware State Manager monitors and interrogates hardware components in a Shasta system, tracking hardware state and inventory information, and making it available via REST queries and message bus events when changes occur.
This service provides the following functions:
The main components of smd's RESTful API are as follows:
/hsm/v2/Inventory/RedfishEndpoints
POST A new RedfishEndpoint to be inventoried by state manager
GET The collection of all RedfishEndpoints, with optional filters.
/hsm/v2/Inventory/RedfishEndpoints/{xname-id}
PUT Updates to a RedfishEndpoint
GET The RedfishEndpoint's details or check its discovery status
DELETE A RedfishEndpoint that is no longer in the system
/hsm/v2/Inventory/ComponentEndpoints?filter-option1=xxx...
GET Array of Redfish details for a filtered subset of components
/hsm/v2/Inventory/ComponentEndpoints/{xname-id}
GET Redfish details for a specific component
/hsm/v2/State/Components?filter-option1=xxx...
GET A filtered subset of all components, or a specific component by its id
/hsm/v2/State/Components
POST A list of individual component ids to query, and filtering options.
/hsm/v2/State/Components/{xname-id}
GET The HW state, flag, role, enabled status, NID, etc. of the component
PATCH The HW state, flag, role, enabled status, NID, etc. of a component
/hsm/v2/State/Components/Query
POST A list of parents and filtering options
/hsm/v2/State/Components/Query/{parent-id}?filter-option1=xxx...
GET Parent and children of selected component parent id, optionally filtering
/hsm/v2/Inventory/Hardware/Query/all
GET An xthwinv-like json representation of the system's hardware and FRUs.
/hsm/v2/Inventory/HardwareByFRU/{fru-id}
GET Details on a particular FRU by its ID.
/hsm/v2/Defaults/NodeMaps
GET All NodeMaps entries, with default NID and Role per xname
POST One or more new NodeMaps entries to be added or overwritten
/hsm/v2/Defaults/NodeMaps/{xname}
GET The default NID and Role for {xname}
PUT Update the default NID and Role for xname {xname}
/hsm/v2/groups
GET Details on all groups
POST A new component group with a list of members
/hsm/v2/groups/{group-label}
PATCH Metadata for an existing group {group-label}
GET Details on the group {group-label}, i.e. its members and metadata
/hsm/v2/groups/{group-label}/members
GET Just the member list for a group
POST The id of a new component to add to the group's members list
/hsm/v2/groups/{group-label}/members/{xname-id}
DELETE Component {xname-id} from the members of the group {group-label}
/hsm/v2/partitions
GET Details on all partitions
POST A new partition with a list of members
/hsm/v2/partitions/{part-name}
PATCH Metadata for an existing partition {part-name}
GET Details on the partition {part-name}, i.e. its members and metadata
/hsm/v2/partitions/{part-name}/members
GET Just the members list for a partition
POST The id of a new component to add to the partition's members list
/hsm/v2/partitions/{part-name}/members/{xname-id}
DELETE Component {xname-id} from the members of partition {part-name}
/hsm/v2/memberships?filter-option1=xxx...
GET A filtered list of each system component's group/partition memberships
/hsm/v2/memberships/{xname-id}
GET The group and partition memberships (if any) of component {xname-id}
NOTE: The above is NOT an exhaustive list of the API calls and is intended solely as an overview.
The complete HSM (smd) API documentation is included in the Cray API docs. This is the nightly-generated version. Content is generated in an automated fashion from the current swagger.yaml file.
http://web.us.cray.com/~ekoen/cray-portal/public
Latest detailed API usage examples:
https://github.com/OpenCHAMI/smd/blob/master/docs/examples.adoc (current)
Latest swagger.yaml (if you would prefer to use the OpenAPI viewer of your choice):
https://github.com/OpenCHAMI/smd/blob/master/api/swagger_v2.yaml (current)
In addition to the service itself, this repository builds and publishes cray-smd-test images containing tests that verify HSM on live Shasta systems. The tests are invoked via helm test as part of the Continuous Test (CT) framework during CSM installs and upgrades. The version of the cray-smd-test image (vX.Y.Z) should match the version of the cray-smd image being tested, both of which are specified in the helm chart for the service.
This is primarily intended to compare XC-Shasta functionality.
V1 Feature | V1+ Feature | XC Equivalent |
---|---|---|
/hsm/v2/State/Components (structure) | - | rs_node_t |
/hsm/v2/State/Components (GET) | - | xtcli status, but with all fields, including NIDs |
/hsm/v2/State/Components/Query/ | - | xtcli status |
/hsm/v2/State/Components/ | - | xtcli enable/disable |
/hsm/v2/State/Components/ | - | xtcli mark |
/hsm/v2/State/Components/ | - | xtcli set_empty -f |
/hsm/v2/State/Components/ | - | xtcli set_flag/clr_flag |
/hsm/v2/Inventory/Hardware/Query/all (GET) | - | xthwinv s0 |
/hsm/v2/Inventory/Discover (POST XNames list) | - | xtdiscover --warmswap |
- | Additional Hardware/Query options | xthwinv |
SCN Events | - | ecnode(un)available, ec_node_failed |
- | See Future Features and Updates Below | - |
(1) Role determines Compute vs NCN (Non-Compute Node) type, not Compute vs. Service as on XC.
(2) Flags cleared automatically on successful state transition, so normally not needed.
(3) There is no direct equivalent to the full xtdiscover command on Shasta. Discovery is continuous in response to system events and works in concert with endpoint discovery performed by MEDS and REDS, as well as system info provided by (the upcoming) IDEALS.
Note that these are the states HSM has direct access to. They are essentially just the hardware states; the states above On (Ready, et al.) are tracked by the heartbeat monitor in the case of nodes and, in the case of controllers, by HSM directly confirming that a component can be accessed for Redfish operations. Other hardware types will generally go no higher than On.
A separate field, SoftwareStatus, is intended for any additional state that might exist for a heartbeating node. We have no table of these states, nor a transition diagram, because they are a function of the managed plane, and we do not limit what can appear there so that no dependencies are created.
StateUnknown HMSState = "Unknown" // The State is unknown. Appears missing but has not been confirmed as empty.
StateEmpty HMSState = "Empty" // The location is not populated with a component
StatePopulated HMSState = "Populated" // Present (not empty), but no further tracking can be or is being done.
StateOff HMSState = "Off" // Present but powered off
StateOn HMSState = "On" // Powered on. If no heartbeat mechanism is available, its software state may be unknown.
StateStandby HMSState = "Standby" // No longer Ready and presumed dead. It typically means HB has been lost (w/alert).
StateHalt HMSState = "Halt" // No longer Ready and halted. OS has been gracefully shutdown or panicked (w/ alert).
StateReady HMSState = "Ready" // Both On and Ready to provide its expected services, i.e. used for jobs.
To avoid undesirable behavior (bad ordering, invalid states), only certain state transitions are allowed based upon events or REST operations.
Note that the inventory discovery process has the ability to perform any state change, e.g. when a new component is added or is powered on after appearing to disappear from the system.
Desired new state - Required current state
"Unknown": {}, // Force/HSM-internal only
"Empty": {}, // Force/HSM-internal only
"Populated": {}, // Force/HSM-internal only
"Off": {StateOff, StateOn, StateStandby, StateHalt, StateReady},
"On": {StateOn, StateOff, StateStandby, StateHalt, StateReady},
"Standby": {StateStandby, StateReady},
"Halt": {StateHalt, StateReady},
"Ready": {StateReady, StateOn},
Setting default NIDs
This document describes and provides examples for the Hardware State Manager (smd) NodeMaps feature, which allows the installer (or a user via the HSM REST API) to pre-populate default NID assignments for node locations in the system. These defaults are applied when a discovered node's xname matches a NodeMaps entry of the same name, setting the correct NID and Role values from the start and making it unnecessary to patch them manually.
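As a rough illustration, a request body for POST /hsm/v2/Defaults/NodeMaps could be assembled as in the Go sketch below. The JSON field names (`NodeMaps`, `ID`, `NID`, `Role`) and the helper `BuildNodeMapsBody` are assumptions made for this example; check swagger_v2.yaml for the authoritative schema before relying on them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NodeMap holds the default NID and Role for one node xname.
// Field names are assumed for this sketch, not confirmed by this README.
type NodeMap struct {
	ID   string `json:"ID"`
	NID  int    `json:"NID"`
	Role string `json:"Role,omitempty"`
}

// nodeMapArray is the assumed top-level wrapper for a POST body.
type nodeMapArray struct {
	NodeMaps []NodeMap `json:"NodeMaps"`
}

// BuildNodeMapsBody serializes the entries into a JSON request body.
func BuildNodeMapsBody(maps []NodeMap) ([]byte, error) {
	return json.Marshal(nodeMapArray{NodeMaps: maps})
}

func main() {
	body, err := BuildNodeMapsBody([]NodeMap{
		{ID: "x0c0s0b0n0", NID: 1, Role: "Compute"},
		{ID: "x0c0s1b0n0", NID: 2, Role: "Compute"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

The entries would need to be posted before discovery for the defaults to take effect on newly found nodes.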
Groups (or Labels)
Groups are named sets of system components, most commonly nodes. Each component may belong to any number of groups. Groups can be created freely, and smd does not assign them any predetermined meaning.
Partitions
Partitions are essentially a kind of group, but they have an established meaning and are treated as distinct entities from groups. Each component may belong to at most one partition, and partitions are used as an access control mechanism.
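The membership rules described here (any number of groups, at most one partition) can be modeled in a few lines of Go. This is an in-memory illustration only; `Registry`, `Membership`, and `ErrAlreadyInPartition` are hypothetical names and do not reflect how smd stores memberships.

```go
package main

import (
	"errors"
	"fmt"
)

// Membership records the groups and the single partition of one component.
type Membership struct {
	Groups    map[string]bool
	Partition string // empty when the component is in no partition
}

// ErrAlreadyInPartition is returned when a component would join a second partition.
var ErrAlreadyInPartition = errors.New("component already belongs to a partition")

// Registry maps a component xname to its memberships.
type Registry map[string]*Membership

// entry returns the membership record for xname, creating it if needed.
func (r Registry) entry(xname string) *Membership {
	m, ok := r[xname]
	if !ok {
		m = &Membership{Groups: map[string]bool{}}
		r[xname] = m
	}
	return m
}

// AddToGroup always succeeds: a component may be in any number of groups.
func (r Registry) AddToGroup(xname, label string) {
	r.entry(xname).Groups[label] = true
}

// AddToPartition fails if the component is already in a different partition.
func (r Registry) AddToPartition(xname, part string) error {
	m := r.entry(xname)
	if m.Partition != "" && m.Partition != part {
		return ErrAlreadyInPartition
	}
	m.Partition = part
	return nil
}

func main() {
	r := Registry{}
	r.AddToGroup("x0c0s0b0n0", "blue")
	r.AddToGroup("x0c0s0b0n0", "green")
	fmt.Println(r.AddToPartition("x0c0s0b0n0", "p1")) // <nil>
	fmt.Println(r.AddToPartition("x0c0s0b0n0", "p2")) // component already belongs to a partition
}
```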
hmcollector polls smd periodically and establishes event subscriptions when new RedfishEndpoints are found. These events are used for power state changes and they are POSTed to a kafka bus (currently the telemetry bus) that smd then monitors.
When an event comes in, smd establishes the sending BMC and then looks up (via the ComponentEndpoints) the path of the subcomponent URI referenced in the payload in order to establish, for example, which of the two nodes under a node controller is the one powering on or off.
The event payloads used can vary from one Redfish implementation to another. In some cases, these are "Alert" type events that are more or less a repackaging of the underlying IPMI alert message. In other cases, the controller may use the standard ResourceEvent registry, where the intent is to report on Redfish Status field (and other) changes in a more generic way.
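The lookup step described above, from sending BMC plus subcomponent URI to a specific node, might be pictured as follows. This is a simplified sketch: the `ComponentEndpoint` struct, its field names, and the Redfish paths used are illustrative assumptions, not the actual smd data model.

```go
package main

import "fmt"

// ComponentEndpoint is a trimmed, hypothetical stand-in for the record kept
// for each discovered subcomponent.
type ComponentEndpoint struct {
	ID       string // component xname, e.g. a node
	ParentID string // xname of the BMC it was discovered under
	URI      string // Redfish path of the subcomponent on that BMC
}

// ResolveEventOrigin returns the xname of the component whose Redfish path
// matches the URI referenced by an event sent from the given BMC.
func ResolveEventOrigin(eps []ComponentEndpoint, bmc, uri string) (string, bool) {
	for _, ep := range eps {
		if ep.ParentID == bmc && ep.URI == uri {
			return ep.ID, true
		}
	}
	return "", false
}

func main() {
	// Two nodes under one node controller, as in the example in the text.
	eps := []ComponentEndpoint{
		{ID: "x0c0s28b0n0", ParentID: "x0c0s28b0", URI: "/redfish/v1/Systems/Node0"},
		{ID: "x0c0s28b0n1", ParentID: "x0c0s28b0", URI: "/redfish/v1/Systems/Node1"},
	}
	id, ok := ResolveEventOrigin(eps, "x0c0s28b0", "/redfish/v1/Systems/Node1")
	fmt.Println(id, ok) // x0c0s28b0n1 true
}
```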
State Change Notification Infrastructure
Set via -e during docker run or in k8s configuration:
RF_MSG_HOST - Sets the kafkahost:port:topic
SMD_PROXY - socks5 proxy to use when interrogating Redfish endpoint IPs.
SMD_DBTYPE - Database type. The only option, and the default if blank, is "postgres"
SMD_DBNAME - Name of database to connect to (default: hmsds)
SMD_DBUSER - DB Account to use (default: hmsdsuser)
SMD_DBHOST - DB Hostname (e.g. cray-smd-postgres in kubernetes)
SMD_DBPORT - DB Port (default: 5432)
SMD_DBPASS - Password for SMD_DBUSER
SMD_DBOPTS - Additional DB parameters to append to connection DSN
LOGLEVEL - Set log level (0-4)
sudo docker run --rm --name cray-smd-postgres -e POSTGRES_PASSWORD=hmsdsuser -e POSTGRES_USER=hmsdsuser -e POSTGRES_DB=hmsds -d -p 5432:5432 postgres:10.8
sudo docker pull dtr.dev.cray.com:443/cray/cray-smd-init:latest
sudo docker run --name smd-init --link cray-smd-postgres:cray-smd-postgres -e SMD_DBHOST=cray-smd-postgres -e SMD_DBOPTS="sslmode=disable" -e SMD_DBPASS=hmsdsuser -d dtr.dev.cray.com:443/cray/cray-smd-init:latest
sudo docker pull dtr.dev.cray.com:443/cray/cray-smd:latest
sudo docker run --name smd --net host -p 27779:27779 -e SMD_DBHOST=127.0.0.1 -e SMD_DBPASS=hmsdsuser -e SMD_DBOPTS="sslmode=disable" -e SMD_PROXY="socks5://127.0.0.1:9999" -d dtr.dev.cray.com:443/cray/cray-smd:latest
bshields@shasta-sms:~> curl -k https://localhost:27779/hsm/v2/groups
[]
Find the machine you wish to discover and ssh to it with dynamic port forwarding enabled on the local port you gave for SMD_PROXY:
ssh -D 9999 root@example-sms.us.cray.com
Leave this window open until you are finished with the discovery.
Double check /etc/hosts for the BMC IP addresses that are assigned to the nodes you wish to discover, in case they are non-standard ones.
If the proxy has been set up (or you are running locally on an SMS), you can then create endpoints for every BMC you wish to discover using their native BMC IP addresses.
NOTE: If you need particular NIDs and Roles, you will need to set up xname entries in /hsm/v2/Defaults/NodeMaps BEFORE discovery OR patch the NID and/or Role fields after discovery:
Example creation and discovery of preview system computes
These are the usual computes found on a standard preview system, but you can easily adapt this example for whatever is in /etc/hosts. Just make sure you use the BMC xname and a raw IP (if using a socks5 proxy):
curl -k -d '{"ID": "x0c0s28b0", "RediscoverOnUpdate":true, "Hostname":"10.4.0.5", "User": "root","Password": "somePassword"}' -H "Content-Type: application/json" -X POST https://localhost:27779/hsm/v2/Inventory/RedfishEndpoints
curl -k -d '{"ID": "x0c0s26b0", "RediscoverOnUpdate":true, "Hostname":"10.4.0.6", "User": "root","Password": "somePassword"}' -H "Content-Type: application/json" -X POST https://localhost:27779/hsm/v2/Inventory/RedfishEndpoints
curl -k -d '{"ID": "x0c0s24b0", "RediscoverOnUpdate":true, "Hostname":"10.4.0.7", "User": "root","Password": "somePassword"}' -H "Content-Type: application/json" -X POST https://localhost:27779/hsm/v2/Inventory/RedfishEndpoints
curl -k -d '{"ID": "x0c0s21b0", "RediscoverOnUpdate":true, "Hostname":"10.4.0.8", "User": "root","Password": "somePassword"}' -H "Content-Type: application/json" -X POST https://localhost:27779/hsm/v2/Inventory/RedfishEndpoints
NOTE: the above path is assuming you are running docker in a bare container (see above). Otherwise use 'https://
Also note that inventory discovery is a read-only operation and should not do anything to the endpoints besides walk them via GETs. The "RediscoverOnUpdate":true field is important because it will automatically kick off inventory discovery.
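For scripting the same registration step, the curl commands above translate fairly directly into Go. The sketch below only builds the requests and prints them; nothing is sent. The struct and `NewEndpointRequest` are names invented for this example, while the JSON fields and URL path are taken from the curl commands. Actually sending them would also need an `http.Client` configured to skip certificate verification, as `curl -k` does.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// RedfishEndpoint carries the fields used in the curl examples above.
type RedfishEndpoint struct {
	ID                 string `json:"ID"`
	RediscoverOnUpdate bool   `json:"RediscoverOnUpdate"`
	Hostname           string `json:"Hostname"`
	User               string `json:"User"`
	Password           string `json:"Password"`
}

// NewEndpointRequest builds (but does not send) the POST that registers a BMC.
func NewEndpointRequest(baseURL string, ep RedfishEndpoint) (*http.Request, error) {
	body, err := json.Marshal(ep)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost,
		baseURL+"/hsm/v2/Inventory/RedfishEndpoints", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// BMC xnames and raw IPs from the preview-system example.
	bmcs := []RedfishEndpoint{
		{ID: "x0c0s28b0", Hostname: "10.4.0.5"},
		{ID: "x0c0s26b0", Hostname: "10.4.0.6"},
		{ID: "x0c0s24b0", Hostname: "10.4.0.7"},
		{ID: "x0c0s21b0", Hostname: "10.4.0.8"},
	}
	for _, ep := range bmcs {
		ep.RediscoverOnUpdate = true // kicks off inventory discovery
		ep.User, ep.Password = "root", "somePassword"
		req, err := NewEndpointRequest("https://localhost:27779", ep)
		if err != nil {
			panic(err)
		}
		fmt.Println(req.Method, req.URL.String(), ep.ID)
	}
}
```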
Running a plain docker container is not really practical in a full helm-based deployment because of the lack of integration and features that are provided via helm.
The easiest way to add nodes via discovery is to find nodes that have externally visible IP addresses for their BMCs.
I've not gotten the socks5 method above to work in Kubernetes, but it might be something simple. Logging into each cray-smd pod using kubectl exec and running "apk add openssh" will let you install ssh and use it to connect to external hosts; however, the -D option gives an error when logging in. In any case, you would have to reroll the values.yaml helm chart for cray-smd (and increment the version number in Chart.yaml) to add the SMD_PROXY env variable (see above).
Accessing Postgres Operator
You can access the postgres database cluster as follows:
sms-1:~ # kubectl get pods -n services | grep smd
sms-1:~ # kubectl exec -it -n services cray-smd-{id-from-previous} -- /bin/sh
/ # cat /secrets/postgres/hmsdsuser/password # Copy to clipboard
/ # psql hmsds hmsdsuser -h cray-smd-postgres-0 -W
(Paste password)
At this point, you have two options.
https://www.postgresql.org/docs/11/backup-dump.html
WARNING: This only works for some kinds of testing as it creates incomplete configurations (but should work with group, partition, and State/Components calls, though remember that normally there will be non-Node components you will need to filter out if that's desired).
Moreover, even in this case, adding entries manually can create a corrupt database even if data is correct but improperly normalized. Best to not stray from the example below except to change the slot number in the xname and the NID.
(cray-smd-container) # psql hmsds hmsdsuser -h cray-smd-postgres-0 -W
<enter password from /secrets/postgres/hmsdsuser/password, see above>
hmsds=> insert INTO components (id,type,state,flag,enabled,admin,role,nid,subtype,nettype,arch) VALUES('x0c0s0b0n0','Node','Empty','OK',true,'','Compute',123,'','Sling','X86');
INSERT 0 1 # <- SUCCESS, running on primary
ERROR: cannot execute INSERT in a read-only transaction # <- Not on primary postgres pod
hmsds=> \q
hmsds=> SELECT * FROM table_name;
hmsds=> SELECT * FROM components;
hmsds=> SELECT * FROM components
hmsds-> WHERE type = 'Node';
hmsds=> DELETE FROM components WHERE id = 'x3000c0s19b1n0';
hmsds=> insert INTO components (id,type,state,flag,enabled,admin,role,nid,subtype,nettype,arch) VALUES('x0c0s0b0n0','Node','Empty','OK',true,'','Compute',123,'','Sling','X86');
hmsds=> \dt
hmsds=> \d table_name
hmsds=> \s
hmsds=> \h
I don't have step-by-step instructions but the basic idea is to:
For more info: https://www.postgresql.org/docs/11/backup-dump.html
Note: if it is easier, you can run pg_dump/psql on the sms, but you must use the IP, not the hostname, of the cray-smd-postgres-[012] service. Running kubectl get services -n services will get you this IP.