xcat2 / xcat-core

Code repo for xCAT core packages
Eclipse Public License 1.0

R commands issue with Inspur's openbmc with 2.16.5 on Red Hat 8.7 with latest patches #7387

Open Bob-Krull opened 1 year ago

Bob-Krull commented 1 year ago

We have several Inspur machines that run OpenBMC for hardware management. On 2.16.4 I had no problems discovering and installing them. After the upgrade to 2.16.5 I can no longer communicate with them using any R command (rpower, rinv, rinstall). They all return errors like:

 > rpower <node> state
Error: [admin-1]: openbmc plugin bug, pid 1740582, process description: 'xcatd SSL: rpower to <node> for root@localhost: openbmc instance' with error 'malformed JSON string, neither array, object, number, string or atom, at character offset 0 (before "Not Found") at /opt/xcat/lib/perl/xCAT_plugin/openbmc.pm line 2558.
' while trying to fulfill request for the following nodes: <node>

> rinstall <node>
Provision node(s): b77r44u17-node
Error: [admin-1]: rinstall plugin bug, pid 2531974, process description: 'xcatd SSL: rinstall to b77r44u17-node for root@localhost: rinstall instance' with error 'Died at /opt/xcat/sbin/xcatd line 2104.
' while trying to fulfill request for the following nodes: b77r44u17-node

Running the command with XCATBYPASS=1 gives a different result:

>  XCATBYPASS=1 rinstall <node>
Provision node(s): <node>
malformed JSON string, neither array, object, number, string or atom, at character offset 0 (before "Not Found") at /opt/xcat/lib/perl/xCAT_plugin/openbmc.pm line 2558.

Here's some output with xcatdebugmode=1

> rinstall <node> -u
Provision node(s): <node>
Error: [admin-1]: rinstall plugin bug, pid 2614522, process description: 'xcatd SSL: rinstall to <node> for root@localhost: rinstall instance' with error 'Died at /opt/xcat/sbin/xcatd line 2104.
' while trying to fulfill request for the following nodes: <node>

> rpower <node> state
Tue May 23 09:22:45 2023 OpenBMC: [openbmc_debug_perl]
Tue May 23 09:22:45 2023 <node>: [openbmc_debug_perl] curl -k -c cjar -H "Content-Type: application/json" -d '{ "data": ["root", "xxxxxx"] }' https://<node>/login
Tue May 23 09:22:45 2023 <node>: [openbmc_debug_perl] login_response 200 OK
Tue May 23 09:22:45 2023 <node>: [openbmc_debug_perl] curl -k -b cjar -X GET -H "Content-Type: application/json" https://<node>/xyz/openbmc_project/state/enumerate
Tue May 23 09:22:45 2023 <node>: [openbmc_debug_perl] rpower_status_response 404 Not Found
Error: [admin-1]: openbmc plugin bug, pid 2614711, process description: 'xcatd SSL: rpower to b77r44u17-node for root@localhost: openbmc instance' with error 'malformed JSON string, neither array, object, number, string or atom, at character offset 0 (before "Not Found") at /opt/xcat/lib/perl/xCAT_plugin/openbmc.pm line 2558.
' while trying to fulfill request for the following nodes: <node>

I do have an environment running all Power machines with OpenBMC, and they do not exhibit any issues. Firmware on the Inspur machines has not changed. Any ideas or help would be appreciated.
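The debug trace above shows the sequence that leads to the error: the login returns 200 OK, the state query returns 404 with the plain-text body "Not Found", and the plugin then appears to pass that body to a JSON decoder. As a rough illustration only, the sketch below shows one way to check a reply before parsing it. `classify_response` is a hypothetical helper written for this example, not part of xCAT, and the "starts with `{` or `[`" test is a heuristic based on the object-shaped replies seen in this thread, not a full JSON validator.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not xCAT code): decide whether a BMC reply is worth
# handing to a JSON parser, given its HTTP status code and response body.
classify_response() {
    local status="$1" body="$2"

    # Anything outside 2xx is an HTTP-level failure, whatever the body says.
    case "$status" in
        2??) ;;
        *) echo "http-error"; return 1 ;;
    esac

    # Strip leading whitespace so the first significant character can be checked.
    local trimmed="${body#"${body%%[![:space:]]*}"}"

    # Heuristic (assumption): accept only bodies that start like a JSON
    # object or array, which is what the REST replies in this thread look like.
    case "$trimmed" in
        '{'*|'['*) echo "json"; return 0 ;;
        *) echo "not-json"; return 1 ;;
    esac
}
```

For the trace above, `classify_response 404 'Not Found'` prints `http-error`, so the 404 would be reported as such instead of surfacing as a "malformed JSON string" failure.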

gurevichmark commented 1 year ago

@Bob-Krull This seems to be the same problem as described in #6173. Most likely the REST server on the BMC is not replying. Try logging on to the BMC and checking its status.

Bob-Krull commented 1 year ago

Unfortunately, these systems are not implemented quite the same way as the Witherspoons. That service does not appear to exist. Interestingly, though, I can run the curl login command and it returns OK. I can access the BMCs and look around, but I don't see anything obvious. I also tried rebooting them several times.

gurevichmark commented 1 year ago

Can you manually run:

curl -k -c cjar -H "Content-Type: application/json" -d '{ "data": ["root", "xxxxxx"] }' https://<node>/login
curl -k -b cjar -X GET -H "Content-Type: application/json" https://<node>/xyz/openbmc_project/state/enumerate

and see if you get some data back?
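When running the two requests by hand, it can help to see the HTTP status code next to the body, since curl prints only the body by default. The sketch below wraps the same two requests and reports the status of each using curl's `-w '%{http_code}'` option. The function name `bmc_probe`, its arguments, and the temporary files are illustrative choices for this example and not xCAT tooling. It also assumes the user name and password contain no characters that need JSON escaping.

```shell
#!/usr/bin/env bash
# Sketch: repeat the two requests above and report the HTTP status of each.
# bmc_probe and its arguments are illustrative names, not part of xCAT.
# Assumption: user and password need no JSON escaping.
bmc_probe() {
    local bmc="$1" user="$2" pass="$3"
    local jar body code
    jar="$(mktemp)" || return 1
    body="$(mktemp)" || { rm -f "$jar"; return 1; }

    # Step 1: log in and store the session cookie, as in the first command.
    code="$(curl -s -k -c "$jar" -o "$body" -w '%{http_code}' \
        -H 'Content-Type: application/json' \
        -d "{ \"data\": [\"$user\", \"$pass\"] }" \
        "https://$bmc/login")"
    echo "login: HTTP $code"

    # Step 2: query the state tree with that cookie, as in the second command.
    code="$(curl -s -k -b "$jar" -o "$body" -w '%{http_code}' -X GET \
        -H 'Content-Type: application/json' \
        "https://$bmc/xyz/openbmc_project/state/enumerate")"
    echo "state/enumerate: HTTP $code"

    # Print the raw body so a plain-text reply such as "Not Found" is visible.
    cat "$body"; echo

    rm -f "$jar" "$body"
    # Succeed only when the state query itself returned 200.
    [ "$code" = "200" ]
}
```

It would be called as `bmc_probe <node> root xxxxxx`. A non-zero exit status means the state query did not return 200.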

Bob-Krull commented 1 year ago

Nope, apart from the login being OK:

> curl -k -c cjar -H "Content-Type: application/json" -d '{ "data": ["root", "0penBmc"] }' https://$bmc/login
{
  "data": "User 'root' logged in",
  "message": "200 OK",
  "status": "ok"
}
> curl -k -b cjar -X GET -H "Content-Type: application/json" https://$bmc/xyz/openbmc_project/state/enumerate
Not Found

It really does respond with "Not Found". I just did a factory reset of this box as well. I suspect something changed somewhere; it may not even be an xCAT issue.

gurevichmark commented 1 year ago

Yeah, I do not see any "openbmc"-related file changes from 2.16.4 to 2.16.5.