Open sudhiaithal opened 5 months ago
root@f4b1252e2cc5:/# grep "asyncOnLinkMsg" /var/log/syslog | wc -l
1050
root@f4b1252e2cc5:/#
However, on the old image, just 1 for each interface:
root@de88276cddc7:/# grep "asyncOnLinkMsg" /var/log/syslog | grep Ethernet | wc -l
103
root@de88276cddc7:/#
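To compare the two images in more detail than a total line count, the same syslog entries can be grouped per interface. The following is a minimal sketch, assuming the log format shown elsewhere in this thread ("asyncOnLinkMsg: received RTM_NEWLINK ifname: ..."); the function name count_link_msgs is hypothetical and not part of syncd.

```python
import re
from collections import Counter

# Matches the syncd log format seen in this thread, e.g.
# "asyncOnLinkMsg: received RTM_NEWLINK ifname: eth0, ifflags: ..."
# The "received done ..." follow-up lines are deliberately not matched.
LINK_MSG = re.compile(r"asyncOnLinkMsg: received (RTM_\w+) ifname: ([^,\s]+)")

def count_link_msgs(lines):
    """Count asyncOnLinkMsg entries per (ifname, message type)."""
    counts = Counter()
    for line in lines:
        m = LINK_MSG.search(line)
        if m:
            msg_type, ifname = m.groups()
            counts[(ifname, msg_type)] += 1
    return counts

# Sample lines copied from the syncd log in this issue.
sample = [
    "NOTICE #syncd: :- asyncOnLinkMsg: received RTM_NEWLINK ifname: lo, ifflags: 0x10049, ifindex: 1",
    "NOTICE #syncd: :- asyncOnLinkMsg: received done RTM_NEWLINK ifname: lo, ifflags: 0x10049, ifindex: 1",
    "NOTICE #syncd: :- asyncOnLinkMsg: received RTM_NEWLINK ifname: eth0, ifflags: 0x11043, ifindex: 1745",
]
for (ifname, msg_type), n in sorted(count_link_msgs(sample).items()):
    print(ifname, msg_type, n)
```

To run it against a real log, pass an open file object for /var/log/syslog instead of the sample list.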
I was able to get around this problem by creating veth interfaces eth0-31, so that every Ethernet* interface can map to a tap interface. After that, the problem seems to go away.
Not sure if this is exactly a syncd issue; it depends on who is responsible for generating these netlink messages. syncd is listening to all those messages, but port up/down is not up to syncd. Is this on real hardware or a virtual switch?
This is on a virtual switch. I think the flood of messages is causing some lock-up on syncd's netlink socket. If we bring up VS without all veth interfaces up, I see this issue. It seems to work fine when all veth interfaces are created before VS bring-up.
Netlink handling is synchronized in syncd: each message is processed in a synchronized block under a mutex, but it should still receive all messages. Are you generating the flood on purpose? Is any other process receiving all the generated messages?
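One way to check independently whether link events are being dropped during a flood is to listen on a separate rtnetlink socket. The sketch below is a standalone diagnostic and not syncd's code; the names parse_link_msgs and watch_links and the 1 MiB buffer size are assumptions for illustration. On Linux, a netlink socket whose receive buffer overflows reports ENOBUFS on the next receive, which means some events were lost. Whether that is what happens in this issue is not confirmed.

```python
import errno
import socket
import struct

RTMGRP_LINK = 0x1                      # link-event multicast group (linux/rtnetlink.h)
RTM_NEWLINK, RTM_DELLINK = 16, 17
NLMSG_HDR = struct.Struct("=IHHII")    # len, type, flags, seq, pid
IFINFOMSG = struct.Struct("=BxHiII")   # family, pad, type, index, flags, change

def parse_link_msgs(data):
    """Yield (msg_type, ifindex, ifflags) for each link message in a datagram."""
    offset = 0
    while offset + NLMSG_HDR.size <= len(data):
        length, msg_type, _flags, _seq, _pid = NLMSG_HDR.unpack_from(data, offset)
        if length < NLMSG_HDR.size or offset + length > len(data):
            break  # truncated or malformed message
        is_link = msg_type in (RTM_NEWLINK, RTM_DELLINK)
        if is_link and length >= NLMSG_HDR.size + IFINFOMSG.size:
            _family, _type, ifindex, ifflags, _change = IFINFOMSG.unpack_from(
                data, offset + NLMSG_HDR.size)
            yield msg_type, ifindex, ifflags
        offset += (length + 3) & ~3  # netlink messages are 4-byte aligned

def watch_links(rcvbuf=1 << 20):
    """Print link events forever; report ENOBUFS when events were dropped."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, socket.NETLINK_ROUTE)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
    sock.bind((0, RTMGRP_LINK))  # pid 0: let the kernel assign the port id
    while True:
        try:
            data = sock.recv(65536)
        except OSError as e:
            if e.errno == errno.ENOBUFS:
                print("receive buffer overflowed: some link events were lost")
                continue
            raise
        for msg_type, ifindex, ifflags in parse_link_msgs(data):
            print(f"type={msg_type} ifindex={ifindex} ifflags={ifflags:#x}")
```

Calling watch_links() inside the container while running ifconfig eth0 up/down shows whether the kernel delivers the events to a fresh listener at all, which helps separate a kernel-side drop from a problem inside syncd.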
I am seeing an issue where syncd is not getting netlink messages when a link is added/deleted/up/down.
When syncd starts, it is getting all the messages as expected:
Feb 15 18:11:04.828518 d809e83f1ad0 NOTICE #syncd: :- asyncOnLinkMsg: received RTM_NEWLINK ifname: lo, ifflags: 0x10049, ifindex: 1
Feb 15 18:11:04.828550 d809e83f1ad0 NOTICE #syncd: :- asyncOnLinkMsg: received done RTM_NEWLINK ifname: lo, ifflags: 0x10049, ifindex: 1
Feb 15 18:11:04.828651 d809e83f1ad0 NOTICE #syncd: :- asyncOnLinkMsg: received RTM_NEWLINK ifname: eth0, ifflags: 0x11043, ifindex: 1745
Feb 15 18:11:04.828664 d809e83f1ad0 NOTICE #syncd: :- asyncOnLinkMsg: received done RTM_NEWLINK ifname: eth0, ifflags: 0x11043, ifindex: 1745
However, after a while, when I run ifconfig eth0 up/down, syncd does not get any message, but other processes such as portsyncd do:
Feb 15 20:03:01.516713 874a0e235413 NOTICE #portsyncd: :- onMsg: nlmsg type:16 key:eth0 admin:0 oper:0 addr:02:42:ac:11:00:02 ifindex:4106 master:0 type:veth
Feb 15 20:03:03.236337 874a0e235413 NOTICE #portsyncd: :- onMsg: nlmsg type:16 key:eth0 admin:1 oper:0 addr:02:42:ac:11:00:02 ifindex:4106 master:0 type:veth
Feb 15 20:03:03.236486 874a0e235413 NOTICE #portsyncd: :- onMsg: nlmsg type:16 key:eth0 admin:1 oper:1 addr:02:42:ac:11:00:02 ifindex:4106 master:0 type:veth
Feb 15 20:03:03.247468 874a0e235413 NOTICE #fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: 172.17.0.0/16 0.0.0.0 eth0
Feb 15 20:03:05.082645 874a0e235413 NOTICE #fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fe80::/64 :: eth0
....
This prevents the correct oper status from being updated. On a VS image from the old 202106 branch, it works correctly, as shown below:
Feb 13 18:43:44.509381 de88276cddc7 NOTICE #syncd: :- asyncOnLinkMsg: received RTM_NEWLINK ifname: eth5, ifflags: 0x11103, ifindex: 93
Feb 13 18:43:44.509409 de88276cddc7 NOTICE #syncd: :- asyncOnLinkMsg: received RTM_NEWLINK ifname: eth5, ifflags: 0x11143, ifindex: 93
Feb 13 18:43:44.509458 de88276cddc7 NOTICE #portsyncd: :- onMsg: nlmsg type:16 key:eth5 admin:1 oper:0 addr:7a:01:26:fd:50:d5 ifindex:93 master:0 type:veth
Feb 13 18:43:44.509485 de88276cddc7 NOTICE #syncd: :- syncOnLinkMsg: newlink: ifindex: 93, ifflags: 0x11103, ifname: eth5
Feb 13 18:43:44.509535 de88276cddc7 NOTICE #syncd: :- send_port_oper_status_notification: send event SAI_SWITCH_ATTR_PORT_STATE_CHANGE_NOTIFY for port oid:0x100000005: SAI_PORT_OPER_STATUS_UP
Feb 13 18:43:44.509627 de88276cddc7 NOTICE #syncd: :- syncOnLinkMsg: newlink: ifindex: 93, ifflags: 0x11143, ifname: eth5
Feb 13 18:43:44.509719 de88276cddc7 NOTICE #portsyncd: :- onMsg: nlmsg type:16 key:eth5 admin:1 oper:1 addr:7a:01:26:fd:50:d5 ifindex:93 master:0 type:veth
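The two ifflags values in the working log above, 0x11103 and 0x11143, differ by a single bit. A small decoder makes that visible. This is a sketch for reading the logs, using the flag values from linux/if.h; the helper name decode_ifflags is hypothetical and does not come from syncd.

```python
# Interface flag bits from linux/if.h (names without the IFF_ prefix).
IFF_FLAGS = {
    0x1: "UP", 0x2: "BROADCAST", 0x4: "DEBUG", 0x8: "LOOPBACK",
    0x10: "POINTOPOINT", 0x20: "NOTRAILERS", 0x40: "RUNNING", 0x80: "NOARP",
    0x100: "PROMISC", 0x200: "ALLMULTI", 0x400: "MASTER", 0x800: "SLAVE",
    0x1000: "MULTICAST", 0x2000: "PORTSEL", 0x4000: "AUTOMEDIA", 0x8000: "DYNAMIC",
    0x10000: "LOWER_UP", 0x20000: "DORMANT", 0x40000: "ECHO",
}

def decode_ifflags(value):
    """Return the names of all flag bits set in an ifflags value."""
    return [name for bit, name in IFF_FLAGS.items() if value & bit]

# The two values logged for eth5 in this thread.
for flags in (0x11103, 0x11143):
    print(f"{flags:#x}: {' '.join(decode_ifflags(flags))}")
```

The only difference between the two values is 0x40 (IFF_RUNNING), which is set in the second message.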