chaos / powerman

cluster power control
GNU General Public License v2.0

powerman: support new setresult directive #161

Closed chu11 closed 7 months ago

chu11 commented 8 months ago

Problem: In many cases, a power operation (e.g. power on, as opposed to power status) has no way to tell powerman that it failed. The user always gets a "Command completed successfully" message and exit status 0 after issuing a power operation.

Solution: Support a new "setresult" statement that can inform powerman that a power operation did not succeed. A regex describes the output expected from a successful power operation. If the output for any target does not match, powerman can report the failure to the user with a "Command completed with errors" message and exit status 1.

Some example uses:

script on_all {
    send "on\n"
    foreachnode {
        expect "([^\n:]+): ([^\n]+\n)"
        setresult $1 $2 success="^ok\n"
    }
    expect "redfishpower> "
}
script on {
    send "on %s\n"
    expect "([^\n:]+): ([^\n]+\n)"
    setresult $1 $2 success="^ok\n"
    expect "redfishpower> "
}
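
To make the intended semantics concrete, here is a rough Python sketch of how the `expect` and `success` regexes from the scripts above could classify per-node output. It is only an illustration, not powerman's implementation (powerman is written in C): the function names `classify` and `summarize`, the sample output text, and the choice to treat "no results at all" as an error are all assumptions.

```python
import re

# Same patterns as in the example scripts above.
EXPECT = re.compile(r"([^\n:]+): ([^\n]+\n)")   # $1 = target name, $2 = result text
SUCCESS = re.compile(r"^ok\n")                  # success="^ok\n"


def classify(output):
    """Map each target found in the output to True (success) or False (failure)."""
    results = {}
    for match in EXPECT.finditer(output):
        node, result = match.group(1), match.group(2)
        results[node] = bool(SUCCESS.search(result))
    return results


def summarize(results):
    """Return the (message, exit status) pair described in the PR text."""
    # Assumption: an empty result set is reported as an error.
    if results and all(results.values()):
        return "Command completed successfully", 0
    return "Command completed with errors", 1


# Hypothetical device output; the real text depends on the device.
sample = "node1: ok\nnode2: error: timeout\nnode3: ok\n"
results = classify(sample)
message, status = summarize(results)
print(results)
print(message, status)
```

In this sketch a single failing target is enough to turn the overall result into "Command completed with errors" with exit status 1, matching the behavior proposed above.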
chu11 commented 7 months ago

I'm going to close this, as I recently merged it into my mega Cray EX chassis PR chain. I don't want to mess up the order of all the broken-up PRs.