ClusterLabs / resource-agents

Combined repository of OCF agents from the RHCS and Linux-HA projects
GNU General Public License v2.0

VirtualDomain RA fails to stop after failure to start #137

Closed (mrichar1 closed 12 years ago)

mrichar1 commented 12 years ago

I created a VirtualDomain RA configuration, but forgot to put the libvirt XML for the guest VM in place.

When the RA was started, it failed to start with rc=5 (not installed) as expected, and tried to migrate the guest away to another node.

However, in doing so, it called 'stop' against the RA, which also failed with the same rc=5 error. This apparent failure to stop resulted in a stonith of the node.

Can VirtualDomain be fixed to avoid this?

Abbreviated Logs:

node1 (vm host):

Sep 24 15:07:18 node1 VirtualDomain(vm4)[22276]: ERROR: Configuration file /etc/libvirtcfg/vm4.xml does not exist or is not readable.
Sep 24 15:07:18 node1 crmd[5634]: info: process_lrm_event: LRM operation vm4_start_0 (call=54, rc=5, cib-update=68, confirmed=true) not installed
Sep 24 15:07:18 node1 lrmd: [5631]: info: rsc:vm4:55: stop
Sep 24 15:07:18 node1 VirtualDomain(vm4)[22310]: ERROR: Configuration file /etc/libvirtcfg/vm4.xml does not exist or is not readable.
Sep 24 15:07:18 node1 crmd[5634]: info: process_lrm_event: LRM operation vm4_stop_0 (call=55, rc=5, cib-update=69, confirmed=true) not installed
STONITH

node2 (DC calling stonith):

Sep 24 15:07:18 node2 pengine[5057]: notice: unpack_rsc_op: Preventing vm4 from re-starting on node1: operation start failed 'not installed' (rc=5)
Sep 24 15:07:18 node2 pengine[5057]: warning: unpack_rsc_op: Processing failed op vm4_last_failure_0 on node1: not installed (5)
Sep 24 15:07:18 node2 pengine[5057]: warning: common_apply_stickiness: Forcing vm4 away from node1 after 1000000 failures (max=5)
Sep 24 15:07:18 node2 pengine[5057]: notice: LogActions: Stop vm4#011(node1)
Sep 24 15:07:18 node2 stonith-ng[5054]: info: initiate_remote_stonith_op: Initiating remote operation reboot for node1: 6d82f19a-75d6-41d5-815d-36176b57d940
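
To illustrate the sequence in these logs (start fails with rc=5, stop then fails with rc=5, and the node is fenced), here is a minimal sketch of how an agent could treat a missing config file differently for 'stop'. It is not the actual VirtualDomain code, nor necessarily what any upstream fix does: the function names handle_action and config_is_readable are hypothetical, and the idea that a domain with no readable config can be considered already stopped is an assumption of the sketch. Only the numeric return codes come from the OCF resource agent API.

```shell
#!/bin/sh
# Sketch only: NOT the real VirtualDomain agent and not the upstream fix.

# Standard OCF return codes.
OCF_SUCCESS=0
OCF_ERR_INSTALLED=5

# Hypothetical helper: succeeds if the libvirt XML file is readable.
config_is_readable() {
    [ -r "$1" ]
}

# Hypothetical dispatcher: $1 = action name, $2 = path to the domain XML.
handle_action() {
    action=$1
    config=$2

    if ! config_is_readable "$config"; then
        case "$action" in
            stop)
                # Assumption: with no readable config, this agent could not
                # have started the domain, so report it as stopped. A failed
                # stop is what escalated to fencing in the logs above.
                return $OCF_SUCCESS
                ;;
            *)
                echo "ERROR: Configuration file $config does not exist or is not readable." >&2
                return $OCF_ERR_INSTALLED
                ;;
        esac
    fi

    # A real agent would call virsh here; omitted in this sketch.
    return $OCF_SUCCESS
}
```

With this shape, handle_action start on a missing file still returns 5 (not installed), so the cluster keeps the resource off that node, while handle_action stop returns 0 and the stop is not recorded as failed.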

fghaas commented 12 years ago

Hi Matthew, this should have been fixed upstream in commit 2081c456 -- are you sure you're using the latest version?

dmuhamedagic commented 12 years ago

Definitely fixed. My bad for suggesting to report this :) Closing.