Closed nxgovind closed 2 weeks ago
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-600/1/input
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-600/2/input
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-600/3/input
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-600/4/input
@oalbrigt Thank you for your review comments. I have addressed all of them. Please let me know if I have missed anything else.
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-600/5/input
Can one of the admins check and authorise this run please: https://ci.kronosnet.org/job/fence-agents/job/fence-agents-pipeline/job/PR-600/6/input
retest this please
Thank you for your review. I ran a few tests with the latest changes on a 3-node CentOS Stream 9 cluster. All basic power operations via fence_nutanix_ahv work fine. I also tested the STONITH feature by failing a node to confirm that Pacemaker successfully resets the failed node. The test output is documented below.
[root@vm2-auto ~]# fence_nutanix_ahv -a 10.101.63.173 -l admin -p Nutanix.123 -o list-status --ssl-insecure
TestVM1,8e1353ff-59a8-4683-af08-293036f08d4f,OFF
TestVM2,bda19034-c121-430f-a70c-a872f9dbabf7,OFF
Node 1,ae94f8c2-96f1-4c85-bb4a-b1cbd48aeee8,ON
Node 2,bdd08b08-d11d-41b8-b59f-8f8ba77d9ae6,ON
Node 3,c2c4f047-9a56-460f-9bbb-d7f6d81a2e0c,ON
[root@vm2-auto ~]# fence_nutanix_ahv -a 10.101.63.173 -l admin -p Nutanix.123 -o list-status --filter="name eq 'TestVM1'" --ssl-insecure
TestVM1,8e1353ff-59a8-4683-af08-293036f08d4f,OFF
[root@vm2-auto ~]# fence_nutanix_ahv -a 10.101.63.173 -l admin -p Nutanix.123 -o on --plug='TestVM1' --ssl-insecure
Success: Powered ON
[root@vm2-auto ~]# fence_nutanix_ahv -a 10.101.63.173 -l admin -p Nutanix.123 -o reboot --plug='TestVM1' --ssl-insecure
Success: Rebooted
[root@vm2-auto ~]# fence_nutanix_ahv -a 10.101.63.173 -l admin -p Nutanix.123 -o list-status --filter="startswith(name, 'TestVM')" --ssl-insecure
TestVM1,8e1353ff-59a8-4683-af08-293036f08d4f,ON
TestVM2,bda19034-c121-430f-a70c-a872f9dbabf7,ON
[root@vm2-auto ~]# fence_nutanix_ahv -a 10.101.63.173 -l admin -p Nutanix.123 -o off --plug='TestVM1' --ssl-insecure
Success: Powered OFF
[root@vm2-auto ~]# fence_nutanix_ahv -a 10.101.63.173 -l admin -p Nutanix.123 -o list-status --filter="startswith(name, 'TestVM')" --ssl-insecure
TestVM1,8e1353ff-59a8-4683-af08-293036f08d4f,OFF
TestVM2,bda19034-c121-430f-a70c-a872f9dbabf7,OFF
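For anyone who wants to script checks against the list-status output above, here is a minimal sketch of a parser. It assumes the output is exactly one `name,uuid,state` record per line, as in the runs shown; `VmStatus` and `parse_list_status` are hypothetical helper names and are not part of the fence agent.

```python
from typing import List, NamedTuple


class VmStatus(NamedTuple):
    """One record of list-status output (hypothetical helper type)."""
    name: str
    uuid: str
    power_state: str


def parse_list_status(output: str) -> List[VmStatus]:
    """Parse 'name,uuid,state' lines; the format is assumed from the runs above."""
    records = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right so a comma inside a VM name cannot shift
        # the UUID and power-state fields.
        name, uuid, state = line.rsplit(",", 2)
        records.append(VmStatus(name, uuid, state))
    return records


sample = """\
TestVM1,8e1353ff-59a8-4683-af08-293036f08d4f,OFF
Node 1,ae94f8c2-96f1-4c85-bb4a-b1cbd48aeee8,ON
"""
for vm in parse_list_status(sample):
    print(vm.name, vm.power_state)
```

Splitting from the right keeps names such as `Node 1` intact and tolerates a comma in the VM name, which a plain `split(",")` would not.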
tail -f /var/log/pacemaker/pacemaker.log
Nov 07 11:07:40.933 node1 pacemaker-fenced [1134] (log_async_result) notice: Operation 'reboot' [1493] targeting node2 using nutanix_fence returned 0 | call 13 from pacemaker-controld.1382
Nov 07 11:07:40.964 node1 pacemaker-fenced [1134] (finalize_op) notice: Operation 'reboot' targeting node2 by node1 for pacemaker-controld.1382@node3: OK (complete) | id=2e28a260
Nov 07 11:07:40.965 node1 pacemaker-controld [1138] (handle_fence_notification) notice: Peer node2 was terminated (reboot) by node1 on behalf of pacemaker-controld.1382@node3: OK | event=2e28a260-bdd1-4154-b98b-1fd14227dc63
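To confirm a fencing result without reading the log by eye, a small sketch like the following could extract the action, target, device and return code from the `log_async_result` line. The regular expression is derived only from the log line quoted above, so other Pacemaker versions may format the message differently; `parse_fence_result` is a hypothetical helper name.

```python
import re
from typing import Optional

# Pattern assumed from the single log_async_result line quoted above.
FENCE_RESULT = re.compile(
    r"Operation '(?P<action>[\w-]+)' \[\d+\] targeting (?P<target>\S+) "
    r"using (?P<device>\S+) returned (?P<rc>-?\d+)"
)


def parse_fence_result(line: str) -> Optional[dict]:
    """Return action/target/device/rc for a matching line, else None."""
    match = FENCE_RESULT.search(line)
    if match is None:
        return None
    result = match.groupdict()
    result["rc"] = int(result["rc"])  # a return code of 0 means the agent succeeded
    return result


line = (
    "Nov 07 11:07:40.933 node1 pacemaker-fenced [1134] (log_async_result) "
    "notice: Operation 'reboot' [1493] targeting node2 using nutanix_fence "
    "returned 0 | call 13 from pacemaker-controld.1382"
)
print(parse_fence_result(line))
```

Lines that do not carry the agent's return code, such as the `finalize_op` entry, do not match the pattern and yield `None`.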
[root@node1 ~]# pcs status
Cluster name: ha_cluster
Cluster Summary:
Node List:
Full List of Resources:
@oalbrigt I have run some basic tests, including a cluster node failure test. Please merge the pull request if you are comfortable with the test results.
Thanks.
This patch adds fence agent support for Nutanix AHV clusters. More specifically, the initial support is aimed at AHV clusters that support the Nutanix v4 APIs. The v3 APIs are not supported.
Signed off by amir.eibagi@nutanix.com