jpirko / libteam

team netdevice library
GNU Lesser General Public License v2.1

Traffic lost when interface on Cisco's switch is on suspended state using LACP #37

Open avillalobos opened 6 years ago

avillalobos commented 6 years ago

We found that teamd does not work well while a Cisco switch is booting when both the server and the switch are configured to use LACP. The version of teamd used is:

[root@ts-intcluster-node-02 ~]# teamd -v
teamd 1.27

It turns out that, when the switch comes up, its port starts in the "suspended" state, which means the link is up but no traffic passes. When this happens, teamd does not stay on the failover port; it tries to use the port that just came up, which does not work because it is suspended. Even worse, from the OS point of view both links are shown as down, even though the switches report them as up and the other link is working normally. The workaround is to shut down the server port connected to the switch's suspended port. Once that is done, the teamd runner moves all the traffic to the other port and the link recovers.

On the other hand, the link does recover, but only once the switch port moves from the suspended state to up. It seems that while the Cisco switch port is in the suspended state, the teamd daemon fails and the traffic is not redirected to the other port. There seems to be a known bug in teamd that has somehow been solved, but we are not using Red Hat but CentOS.

There is also a Red Hat report covering the mentioned bug and the workaround: [https://access.redhat.com/solutions/2362921]. The workaround we used on the server side to bypass this was to switch to the bonding driver, in mode 4 (802.3ad). This driver passes all the failover tests, including port disconnection, leaf shutdown and ACI shutdown.

brontide commented 5 years ago

Adding to this: we can't get LACP to work properly with Cisco Nexus 9k switches. The bonding driver works fine. The best we've been able to determine is that the LACP runner does not stay in sync, and the Cisco will ignore packets sent with the wrong flags. This has come to a head for us because bonding under EL7.7 doesn't come up with a default route, but teaming doesn't work until we cycle the port after each boot.

brontide commented 5 years ago

Update: We finally convinced support to upgrade the Nexus 9k to a recent code release and LACP now works as expected. On the old release, the LACP packets were treated as malformed and not sent to the SUP for processing. There are a number of related Cisco bugs touching on the same topic. The lack of a default route with bonding is now a documented NetworkManager bug.

luchocorral commented 4 years ago

A couple of years late, but for what it's worth: there are indeed still several bugs on the Nexus 9k hardware line related to LACP, though it's getting better with each release, at least for NX-OS.

Back in 2018 the issue we ran into with @avillalobos was actually two separate issues. First, the teamd driver was forwarding traffic over the affected link right after it came up (i.e. Ethernet had successfully brought the link up) but before negotiating any LACP parameters with the switch, which was in the "suspended" state and dropping everything at that point. Second, there was an issue on the Cisco ACI side: the LACP vPC is set up with a feature called "vPC auto-recovery" on the switches, which automatically applies a default timer of 240 seconds during which the port stays "suspended" after the affected switch cannot see its peer due to the reload (i.e. the other ACI switch you are using to bundle the links coming from the server). Cisco does this to avoid split-brain scenarios. In NX-OS you can disable this feature, but in ACI (at least as of release 3.2(2I)) it was hardcoded. That meant that even with the workaround we implemented, using bonding mode 4 instead of teaming, we could only load-balance traffic from the server to the switches after those 4 minutes of waiting.
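
To make the difference between "link up" and "LACP negotiated" concrete, here is a rough toy model in Python. It is not teamd, bonding, or Cisco code; the port names, the `Port` class and both selection functions are made up for illustration, and the suspended window is simplified to a fixed 240 seconds counted from a hypothetical reload time.

```python
from dataclasses import dataclass
from typing import List, Optional

# Default vPC auto-recovery timer mentioned in this thread (seconds).
VPC_AUTO_RECOVERY_DELAY_S = 240.0


@dataclass
class Port:
    """Hypothetical server-side view of one member port."""
    name: str
    carrier_up: bool                            # Ethernet link state
    switch_reloaded_at: Optional[float] = None  # None: switch never reloaded

    def lacp_negotiated(self, now: float) -> bool:
        """True once the switch side is assumed to have left 'suspended'."""
        if not self.carrier_up:
            return False
        if self.switch_reloaded_at is None:
            return True
        return now - self.switch_reloaded_at >= VPC_AUTO_RECOVERY_DELAY_S


def ports_by_carrier(ports: List[Port], now: float) -> List[str]:
    """Carrier-only policy: any port with link is used for traffic.

    This mirrors the problematic behaviour described above, where traffic
    is sent over a port that is up but still suspended on the switch.
    """
    return [p.name for p in ports if p.carrier_up]


def ports_by_lacp(ports: List[Port], now: float) -> List[str]:
    """LACP-aware policy: only ports that finished negotiation are used."""
    return [p.name for p in ports if p.lacp_negotiated(now)]


# eth0 goes to the healthy switch, eth1 to the switch reloaded at t=0.
ports = [
    Port("eth0", carrier_up=True),
    Port("eth1", carrier_up=True, switch_reloaded_at=0.0),
]

for now in (10.0, 240.0):
    print(now, ports_by_carrier(ports, now), ports_by_lacp(ports, now))
```

In this model the carrier-only policy already selects both ports at t=10, including the suspended one, while the LACP-aware policy stays on the healthy port until the 240 seconds have elapsed. That is also why the 4 minute wait remains even when port selection behaves correctly.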