thom311 / libnl

Netlink Library Suite
GNU Lesser General Public License v2.1

2nd RTM_NEWLINK notification with operstate down is always 1 second delayed #374

Closed deepaktabraham closed 5 months ago

deepaktabraham commented 5 months ago

I have a system configured with 2 physical eth interfaces connected to a switch.

When I reboot the switch, I see that the userspace RTM_NEWLINK notifications for the interfaces are always 1 second apart although both links actually go down almost simultaneously!

From /var/log/messages (the logs only have one-second granularity, but both links can be seen going down almost simultaneously):

    Apr 3 12:20:49 m2-dl360g10-244 kernel: mlx5_core 0000:5d:00.0 eno5np0: Link down
    Apr 3 12:20:49 m2-dl360g10-244 kernel: mlx5_core 0000:5d:00.1 eno6np1: Link down


Monitoring userspace link events with the ip command (the link-down events are always 1 second apart; the order varies from run to run):

    [root]$ ip -t monitor link
    Timestamp: Wed Apr 3 12:20:49 2024 365932 usec
    6: eno5np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default
        link/ether 88:e9:a4:54:89:4c brd ff:ff:ff:ff:ff:ff
        altname enp93s0f0np0

    Timestamp: Wed Apr 3 12:20:50 2024 382616 usec
    7: eno6np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default
        link/ether 88:e9:a4:54:89:4d brd ff:ff:ff:ff:ff:ff
        altname enp93s0f1np1


I wrote a simple test program that uses a netlink socket to receive these link event notifications in userspace. It shows exactly the same behavior: the second interface's userspace notification is always delayed by 1 second.
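The test program itself is not shown in the thread. As a rough illustration only, a listener of this kind could be sketched as below, assuming Linux and Python's standard `socket` module instead of libnl3. The function names `parse_link_events` and `monitor` are made up for this sketch. The numeric constants are copied from the kernel's uapi headers.

```python
import socket
import struct
import time

# Constants from <linux/rtnetlink.h>, <linux/if_link.h> and <linux/if.h>.
RTMGRP_LINK = 0x1
RTM_NEWLINK = 16
IFLA_IFNAME = 3
IFLA_OPERSTATE = 16
IF_OPER_NAMES = {0: "UNKNOWN", 1: "NOTPRESENT", 2: "DOWN", 3: "LOWERLAYERDOWN",
                 4: "TESTING", 5: "DORMANT", 6: "UP"}

NLMSG_HDR = struct.Struct("=IHHII")   # nlmsghdr: len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")  # ifinfomsg: family, pad, type, index, flags, change
RTATTR = struct.Struct("=HH")         # rtattr: len, type


def align4(n):
    """Round n up to the 4-byte alignment used by netlink."""
    return (n + 3) & ~3


def parse_link_events(data):
    """Return [(ifindex, ifname, operstate)] for each RTM_NEWLINK message in data."""
    events = []
    offset = 0
    while offset + NLMSG_HDR.size <= len(data):
        msg_len, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(data, offset)
        if msg_len < NLMSG_HDR.size or offset + msg_len > len(data):
            break  # truncated or malformed message
        end = offset + msg_len
        body = offset + NLMSG_HDR.size
        if msg_type == RTM_NEWLINK and body + IFINFOMSG.size <= end:
            _family, _type, ifindex, _ifflags, _change = IFINFOMSG.unpack_from(data, body)
            name, state = None, None
            attr = body + IFINFOMSG.size
            while attr + RTATTR.size <= end:
                rta_len, rta_type = RTATTR.unpack_from(data, attr)
                if rta_len < RTATTR.size or attr + rta_len > end:
                    break
                payload = data[attr + RTATTR.size:attr + rta_len]
                rta_type &= 0x3FFF  # strip the nested / byte-order flag bits
                if rta_type == IFLA_IFNAME:
                    name = payload.rstrip(b"\0").decode()
                elif rta_type == IFLA_OPERSTATE and payload:
                    state = IF_OPER_NAMES.get(payload[0], str(payload[0]))
                attr += align4(rta_len)
            events.append((ifindex, name, state))
        offset += align4(msg_len)
    return events


def monitor():
    """Print a userspace receive timestamp for every link notification (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    sock.bind((0, RTMGRP_LINK))  # pid 0: let the kernel pick; subscribe to link events
    while True:
        data = sock.recv(65535)
        now = time.time()
        for ifindex, name, state in parse_link_events(data):
            print(f"{now:.6f} {ifindex}: {name} operstate {state}")
```

Calling `monitor()` on a Linux host should print one line per RTM_NEWLINK message. The timestamps are taken when userspace receives the message, so they measure the same thing as `ip -t monitor link`, not the moment the driver detected the carrier change.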

This behavior is consistent across Debian 11, 12 and RHEL9 distros (with their corresponding stock libnl). This behavior is also not specific to a particular system. It should be easily reproducible on all hardware.

The subsequent RTM_NEWLINK notifications when the switch comes back up, however, are only a few microseconds apart, which is expected.

Does anyone know why there is always a 1-second delay between the RTM_NEWLINK notifications with operstate down in userspace, even though both physical interfaces go down almost simultaneously? Also, is there a way to avoid this delay?

thom311 commented 5 months ago

Maybe during reboot, the switch turns off the ports at different times? I don't know.

In any case, this seems related to the switch, the NIC, or the driver/module, and not to the client tool (libnl3), especially since ip does not use the libnl3 library.

deepaktabraham commented 5 months ago

Looking at the iproute2 source code, it seems to use libnetlink. My test program uses libnl3. Both report the same problem.

I think the switch isn't the culprit here because simultaneous port shutdown via the switch console also has this issue.

It is unlikely to be the driver either. The problem appears with both Intel and Mellanox cards!

Moreover, I just noticed that /sys/class/net/eth*/operstate also reports the state change 1 second later for the 2nd interface. So I think the problem points to the kernel.
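How the operstate files were checked is not described in the thread. One possible way to watch them, sketched here under the assumption of a simple polling loop, is shown below. The function names `read_operstates`, `diff_states` and `watch`, and the 10 ms polling interval, are illustrative choices only.

```python
import os
import time


def read_operstates(base="/sys/class/net"):
    """Return {interface: operstate} for every interface directory under base."""
    states = {}
    for ifname in sorted(os.listdir(base)):
        path = os.path.join(base, ifname, "operstate")
        try:
            with open(path) as f:
                states[ifname] = f.read().strip()
        except OSError:
            continue  # not an interface directory, or the interface vanished
    return states


def diff_states(old, new):
    """Return [(ifname, old_state, new_state)] for interfaces whose state changed."""
    return [(name, old.get(name), state)
            for name, state in sorted(new.items())
            if old.get(name) != state]


def watch(interval=0.01, base="/sys/class/net"):
    """Poll operstate files and print a timestamp whenever one changes."""
    prev = read_operstates(base)
    while True:
        time.sleep(interval)
        cur = read_operstates(base)
        now = time.time()
        for name, before, after in diff_states(prev, cur):
            print(f"{now:.6f} {name}: {before} -> {after}")
        prev = cur
```

Because this polls, the printed timestamps are only accurate to the polling interval, which is still fine enough to tell a 1-second gap from a near-simultaneous change.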

Thank you for your inputs. I'll close this issue.

deepaktabraham commented 5 months ago

P.S. It seems to be in the kernel. I guess it's a "feature" not a "bug" :)

net/core/link_watch.c -

         /*
          * Limit the number of linkwatch events to one
          * per second so that a runaway driver does not
          * cause a storm of messages on the netlink
          * socket.  This limit does not apply to up events
          * while the device qdisc is down.
          */
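To illustrate how a one-per-second limit like the one in that comment could produce the observed gap, here is a toy model. It is not the kernel's code. It assumes that queued events are delivered in batches, that a batch may run no sooner than `min_interval` seconds after the previous one, and that events arriving before a scheduled batch join it. The function name `notification_times` and the example timings are hypothetical. The exemption for up events mentioned in the comment is not modelled.

```python
def notification_times(event_times, min_interval=1.0):
    """Toy model: map the times events are queued to the times they are delivered.

    Each delivery run handles everything queued so far; the next run may not
    start until min_interval seconds after the previous one.
    """
    next_allowed = float("-inf")  # earliest time the next run may start
    run_at = None                 # start time of the most recently scheduled run
    delivered = []
    for t in sorted(event_times):
        if run_at is None or t > run_at:
            # The last run has already happened: schedule a new one.
            run_at = max(t, next_allowed)
            next_allowed = run_at + min_interval
        delivered.append(run_at)
    return delivered


# Two link-down events 17 ms apart (made-up numbers).
print(notification_times([0.0, 0.017]))
```

In this model the first event is delivered immediately and the second has to wait for the next permitted run, so the two notifications end up one interval apart even though the events were nearly simultaneous. The exemption for up events would also be consistent with the link-up notifications arriving only microseconds apart.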