Open tverdoff opened 6 years ago
Hi,
I would appreciate it if you could try v4.1.0 or later, which contains #977, a related fix.
IIUC, sending a hardware MAC in gratuitous ARPs is not expected in this case and must have been introduced accidentally in the past. #977 fixes it so that gratuitous ARPs are not sent when the address is a cluster IP, so you should no longer see them at step 6) in your scenario, and routers should continue to refer to the virtual MAC. I believe it works.
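Before retesting, it may help to confirm that the installed resource-agents version is at least 4.1.0. The following is a minimal sketch, not part of resource-agents: `version_ge` is a hypothetical helper name, it assumes GNU `sort` with the `-V` (version sort) option, and the `rpm` query in the comment assumes an RPM-based system.

```shell
#!/bin/sh
# version_ge A B: succeed when version A is greater than or equal to version B.
# Hypothetical helper; relies on GNU sort's -V (version sort) option.
version_ge() {
    # The smaller of the two versions sorts first; A >= B exactly when that is B.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example use on an RPM-based system (not executed here):
#   installed=$(rpm -q --qf '%{VERSION}' resource-agents)
#   version_ge "$installed" 4.1.0 && echo "the fix from #977 should be included"
```

Version sort is used rather than a plain string comparison so that, for example, 4.10.0 is treated as newer than 4.1.0.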
Hi, OK, thanks for the update, I'll let you know the result
Looks like updating the resource-agents solved this problem.
Here is what I had to do on my CentOS 7 system, because no up-to-date RPM is available yet.
cd /tmp
git clone https://github.com/ClusterLabs/resource-agents.git
cd resource-agents/
./autogen.sh
yum -y install docbook-style-xsl glib2-devel
./configure
make
rm -fv /usr/lib/ocf/resource.d/heartbeat/.ocf-*
make install
I had to run rm -fv /usr/lib/ocf/resource.d/heartbeat/.ocf-* because otherwise make install fails for some reason (probably worth looking into).
After upgrading resource-agents by hand, my system no longer sends out gratuitous ARPs:
Nov 18 19:43:33 haproxy-02 IPaddr2(cluster_ip:0)[1796657]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-10.0.0.80 eth1 10.0.0.80 35fc518e5745 not_used not_used
Nov 18 19:43:33 haproxy-02 IPaddr2(cluster_ip:0)[1796657]: INFO: send_arp.linux: Gratuitous ARPs are not sent in the Cluster IP configuration
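The same check can be scripted against a saved log. This is a minimal sketch that assumes the log lines have the shape shown above; `arps_suppressed` is a hypothetical helper name, and the log path in the usage comment is an assumption that depends on the local syslog configuration.

```shell
#!/bin/sh
# arps_suppressed LOGFILE: succeed if LOGFILE contains the send_arp notice
# saying that gratuitous ARPs are skipped for the Cluster IP configuration.
# Hypothetical helper; -F matches the message as a fixed string, not a regex.
arps_suppressed() {
    grep -qF 'send_arp.linux: Gratuitous ARPs are not sent in the Cluster IP configuration' "$1"
}

# Example use (the path is an assumption, not executed here):
#   arps_suppressed /var/log/messages && echo "gratuitous ARPs are suppressed"
```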
Any updates?
Hello community,
I have run into an issue where the ARP cache on a router remembers a node's physical MAC instead of the virtual MAC (corosync + pacemaker 1.1.15/1.1.18 cluster with a cloned VIP).
Here it is:
We have a cluster:
STR:
1) Start the cluster and ping the VIP
2) on 10.0.2.12
The MAC is correct and refers to the virtual MAC
3) on node1 (10.0.2.10)
sudo crm node standby
sending node to standby mode
4) on 10.0.2.12
nothing happens; 10.0.2.12 still refers to the virtual MAC for the 10.0.2.120 VIP
5) again on node1 (10.0.2.10)
sudo crm node online
6) on 10.0.2.12 now we see gratuitous ARPs
However, none of the cluster nodes answers the request (should they?), and 10.0.2.12 gets the MAC of the recently brought-up node1 instead of the virtual MAC.
In any case, the gratuitous ARP request contains the hardware MAC address, not the virtual one.
Since the network router has an ARP cache, it caches the wrong MAC (for 4 hours in our case), and half of the requests to the VIP then fail.
As a workaround we can decrease the ARP cache timeout on the router, but isn't this a bug?
Could you clarify whether this behavior is expected? Can we fix it on the cluster side?
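To tell which kind of MAC a neighbor has cached, one option is to look at the first octet. The iptables CLUSTERIP target requires the cluster MAC to be a link-layer multicast address, so its first octet is odd, while a NIC's hardware MAC is normally a unicast address. The following is a minimal sketch under that assumption; `is_multicast_mac` is a hypothetical helper name, and the MAC values in the usage comments are made-up examples rather than values from this cluster.

```shell
#!/bin/sh
# is_multicast_mac MAC: succeed if MAC is a link-layer multicast address,
# i.e. the least significant bit of its first octet is set.
# Hypothetical helper; accepts "01:00:5e:..." as well as "01005e..." notation.
is_multicast_mac() {
    # The first two hex digits form the first octet in both notations.
    first_octet=$(printf '%.2s' "$1")
    [ $(( 0x$first_octet & 1 )) -eq 1 ]
}

# Example use (made-up addresses, not executed here):
#   is_multicast_mac 01:00:5e:12:34:56 && echo "looks like a virtual (multicast) MAC"
#   is_multicast_mac 52:54:00:12:34:56 || echo "looks like a hardware (unicast) MAC"
# On the observing host, the cached entry can be listed with:
#   ip neigh show 10.0.2.120
```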
I have read the mailing list thread https://lists.clusterlabs.org/pipermail/pacemaker/2009-October/026763.html, which seems similar to me, but I don't get such an error when executing the send_arp command.
on node1
on 10.0.2.12
Please let me know if more info is required