I created a VirtualDomain RA configuration, but failed to put the libvirt XML for the guest VM in place.
When the resource was started, the start operation failed with rc=5 (not installed) as expected, and the cluster then tried to move the guest to another node.
However, in doing so, Pacemaker first called 'stop' against the RA on node1, which also failed with the same rc=5 error. This apparent failure to stop resulted in a STONITH of the node.
Can VirtualDomain be fixed to avoid this?
Abbreviated Logs:
node1 (vm host):
Sep 24 15:07:18 node1 VirtualDomain(vm4)[22276]: ERROR: Configuration file /etc/libvirtcfg/vm4.xml does not exist or is not readable.
Sep 24 15:07:18 node1 crmd[5634]: info: process_lrm_event: LRM operation vm4_start_0 (call=54, rc=5, cib-update=68, confirmed=true) not installed
Sep 24 15:07:18 node1 lrmd: [5631]: info: rsc:vm4:55: stop
Sep 24 15:07:18 node1 VirtualDomain(vm4)[22310]: ERROR: Configuration file /etc/libvirtcfg/vm4.xml does not exist or is not readable.
Sep 24 15:07:18 node1 crmd[5634]: info: process_lrm_event: LRM operation vm4_stop_0 (call=55, rc=5, cib-update=69, confirmed=true) not installed
STONITH
node2 (DC calling stonith):
Sep 24 15:07:18 node2 pengine[5057]: notice: unpack_rsc_op: Preventing vm4 from re-starting on node1: operation start failed 'not installed' (rc=5)
Sep 24 15:07:18 node2 pengine[5057]: warning: unpack_rsc_op: Processing failed op vm4_last_failure_0 on node1: not installed (5)
Sep 24 15:07:18 node2 pengine[5057]: warning: common_apply_stickiness: Forcing vm4 away from node1 after 1000000 failures (max=5)
Sep 24 15:07:18 node2 pengine[5057]: notice: LogActions: Stop vm4#011(node1)
Sep 24 15:07:18 node2 stonith-ng[5054]: info: initiate_remote_stonith_op: Initiating remote operation reboot for node1: 6d82f19a-75d6-41d5-815d-36176b57d940
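For what it's worth, here is a rough sketch of the behaviour I would expect from the stop action (this is not the actual VirtualDomain code, and DOMAIN_NAME is a hypothetical placeholder -- the real agent derives the name from the XML, which is exactly what is missing in this scenario). The idea is that a missing config file during 'stop' should be treated as "nothing left to stop" rather than a fatal error, so the operation can return success and the node is not fenced:

# Hedged sketch only, not the real VirtualDomain stop().
VirtualDomain_stop_sketch() {
    if [ ! -r "$OCF_RESKEY_config" ]; then
        # Config is gone: ask libvirt directly whether the domain still runs.
        if virsh domstate "$DOMAIN_NAME" 2>/dev/null | grep -q "^running"; then
            # Still running: force it off so the stop can be confirmed.
            virsh destroy "$DOMAIN_NAME"
        fi
        # Nothing (left) to stop -- report success instead of OCF_ERR_INSTALLED,
        # so Pacemaker does not escalate to fencing.
        return $OCF_SUCCESS
    fi
    # ... normal graceful-shutdown path when the config file is readable ...
}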