ClusterLabs / fence-agents

Unable to fence nodes with fence_pve using pacemaker #584

Closed: aichimansible closed this issue 3 months ago

aichimansible commented 3 months ago

Version: 4.14.0.13-2d9c

I can fence nodes with the following command, but I cannot fence them using pacemaker:

fence_pve -a 192.168.1.3 -A -l test2@pve -p secret -n 204 --ssl-insecure -o reboot

I enabled the verbose mode for the fencing device and this is the relevant section from the logs:

Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ > GET /api2/json/nodes/ich-srv-01/qemu/lab5/status/current HTTP/1.1
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ Host: 192.168.1.3:8006
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ User-Agent: PycURL/7.43.0.2 libcurl/7.61.1 OpenSSL/1.1.1g zlib/1.2.11 brotli/1.0.6 libidn2/2.2.0 libpsl/0.20.2 (+libidn2/2.2.0) libssh/0.9.4/openssl/zlib nghttp2/1.33.0
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ Accept: */*
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ Cookie: PVEAuthCookie=PVE:test2@pve:66689F0A::DZ2sVOm4QD/9CIzGD3TtVQIh6MUYa+sJvASCOg4hLSts9hQhDtqlwHriwesNVbKOFAl5yA1GW5yClEu9xacwLP5uJhwjc6dq0EvxstWKiaF9A0yZ+m4M/DYDUYqSOSp2+UthoKq42Aa+8msGe/OiBMEncob7t5pJYgJPS2F2gvbWqF9XBeg4FraqZ0Ag2Y+zr2Tgxq97n8Os++PrwEDkQuDNUWd4ylGTuuk+fcvlnMr30+Ww7lknoH7gclHDVHgM3KYnPAB2dc7METIioITAN5LOY7c/DnPGHnyYSDY9J5hf/kE/0ejODlTHXcqNR/pJcXTbmN1AGqgS8ce+NNctjw==; version=
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ CSRFPreventionToken: 66689F0A:1tVSSATItHTud3twPmJC6vAxZFRvBU18qFdGJwhm7Bs
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ < HTTP/1.1 400 Parameter verification failed.
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ < Cache-Control: max-age=0
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ < Connection: close
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ < Date: Tue, 11 Jun 2024 19:01:30 GMT
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ < Pragma: no-cache
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ < Server: pve-api-daemon/3.0
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ < Content-Length: 76
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ < Content-Type: application/json;charset=UTF-8
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ < Expires: Tue, 11 Jun 2024 19:01:30 GMT
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ <
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ * Closing connection 0 ]
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ 2024-06-11 22:01:30,734 DEBUG: RESULT [400]: {"errors":{"vmid":"type check ('integer') failed - got 'lab5'"},"data":null} ]
Jun 11 22:01:30 lab4.localdomain pacemaker-fenced [51707] (log_action) warning: fence_pve[57039] stderr: [ 2024-06-11 22:01:30,734 ERROR: Failed: Unable to obtain correct plug status or plug is not available ]

Here is the stonith config:

[root@lab4 tests]# pcs stonith config
Resource: fence_lab5 (class=stonith type=fence_pve)
  Attributes: fence_lab5-instance_attributes
    ip=192.168.1.3
    password=secret
    pcmk_host_check=static-list
    pcmk_host_list=lab5
    plug=204
    pve_node_auto=true
    ssl_insecure=true
    username=test2@pve
    verbose=yes
    vmtype=qemu
  Operations:
    monitor: fence_lab5-monitor-interval-60s
      interval=60s

[root@lab4 tests]# pcs stonith fence lab5
Error: unable to fence 'lab5'
stonith_admin: Couldn't fence lab5: No data available

I can confirm that the vmid is 100% correct. I am not sure if this issue should be addressed here:

RESULT [400]: {"errors":{"vmid":"type check ('integer') failed - got 'lab5'"},"data":null} ]
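
To make the 400 response easier to read, here is a minimal sketch of how the status path in the log above is put together. The helper name `status_path` is hypothetical and is not part of fence_pve. It only mirrors the path layout visible in the `GET` line (`/api2/json/nodes/<node>/<vmtype>/<vmid>/status/current`). When Pacemaker fences a node without a `pcmk_host_map`, it hands the node name to the agent as the plug, so `lab5` ends up where the API expects the integer VM ID.

```shell
#!/bin/sh
# Hypothetical helper: rebuilds the status path seen in the log.
# Arguments: PVE node name, VM type, plug (VM ID).
status_path() {
    printf '/api2/json/nodes/%s/%s/%s/status/current\n' "$1" "$2" "$3"
}

# What the log shows: the cluster node name was used as the plug.
status_path ich-srv-01 qemu lab5

# What the manual fence_pve call with -n 204 would request instead.
status_path ich-srv-01 qemu 204
```

The first path matches the failing request in the log; the second is the form the Proxmox API accepts, because the VM ID segment is an integer.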

oalbrigt commented 3 months ago

Run the following to solve the issue:

pcs stonith update fence_lab5 plug= pcmk_host_check= pcmk_host_list= pcmk_host_map=lab5:204

You can use ; in pcmk_host_map to map more nodes to plugs.
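
As an illustration of the `;` separator, the sketch below joins several node:plug pairs into one `pcmk_host_map` value and prints the resulting `pcs` command instead of running it. The helper `build_host_map` is hypothetical, and the pair `lab4:203` is an assumed placeholder, since only `lab5:204` appears in this issue. Whether one stonith device should cover several nodes depends on your setup.

```shell
#!/bin/sh
# Hypothetical helper: joins node:plug pairs with ';' for pcmk_host_map.
build_host_map() {
    map=""
    for pair in "$@"; do
        if [ -z "$map" ]; then
            map="$pair"
        else
            map="$map;$pair"
        fi
    done
    printf '%s\n' "$map"
}

# lab5:204 comes from this issue; lab4:203 is an assumed example value.
host_map=$(build_host_map lab4:203 lab5:204)

# Dry run: print the command rather than executing it.
echo "pcs stonith update fence_lab5 plug= pcmk_host_check= pcmk_host_list= 'pcmk_host_map=$host_map'"
```

The printed command quotes the map value because an unquoted `;` would be treated by the shell as a command separator.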

aichimansible commented 3 months ago

Thank you very much, sir! This fixed my issue. I followed a technical article to set this up and the pcmk_host_map option was not mentioned.