Open wen587 opened 2 years ago
A few questions:
- Regarding `config portchannel del <>`, are you saying the same errors occur there, or only `ERR teamd#tlm_teamd: :- get_dump`?
Only ERR teamd#tlm_teamd: :- get_dump
- Also, what does the orchagent `Failed to remove ref count 1 LAG PortChannel0005` error mean? Is it checking redis or checking something else? If it is a redis issue, maybe we need to double-check how we interact with redis; if it is async, I think we should make it sync or introduce some wait.
After reading https://github.com/Azure/sonic-swss/blob/master/orchagent/portsorch.cpp#L5105-L5115, I think it is not related to redis. It looks like an async issue. I am not sure why `m_port_ref_count` is not 0 during the PortChannel interface removal.
- What is `LAG`? Why is the error referring to it?
LAG is link aggregation group, which is PortChannel in our code base. LAG removal refers to PortChannel removal.
It does not impact the final result. The current workaround is to keep the Log Analyzer error in the ignored list.
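The ref-count behaviour discussed above can be illustrated with a toy model. This is only a sketch, not the actual orchagent code: the class `ToyPortsOrch`, its methods, and the `run` helper are hypothetical names, and the model assumes that a failed removal is simply retried later, which is consistent with the repeated log message and with the final result being unaffected.

```python
class ToyPortsOrch:
    """Toy model of a ref-count-guarded LAG removal (NOT the real orchagent)."""

    def __init__(self):
        self.ref_count = {}  # LAG name -> number of objects still referencing it
        self.errors = []     # stand-in for SYSLOG ERR lines

    def add_lag(self, name):
        self.ref_count[name] = 0

    def add_interface(self, name):
        # A PORTCHANNEL_INTERFACE entry holds a reference on the LAG.
        self.ref_count[name] += 1

    def remove_interface(self, name):
        self.ref_count[name] -= 1

    def remove_lag(self, name):
        count = self.ref_count[name]
        if count != 0:
            # Removal is refused while something still references the LAG.
            self.errors.append(
                f"removeLag: Failed to remove ref count {count} LAG {name}"
            )
            return False
        del self.ref_count[name]
        return True


def run(interface_removal_delayed):
    """Remove an interface and its LAG, optionally with the interface
    removal arriving late (the suspected execution delay)."""
    orch = ToyPortsOrch()
    orch.add_lag("PortChannel0005")
    orch.add_interface("PortChannel0005")
    if not interface_removal_delayed:
        orch.remove_interface("PortChannel0005")
    removed = orch.remove_lag("PortChannel0005")
    if not removed:
        # The delayed interface removal finally lands, then the retry succeeds.
        orch.remove_interface("PortChannel0005")
        removed = orch.remove_lag("PortChannel0005")
    return removed, orch.errors


print(run(interface_removal_delayed=False))
print(run(interface_removal_delayed=True))
```

In both cases the LAG ends up removed; the delayed case differs only in that one error line is recorded before the retry succeeds.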
Description
There seems to be some execution delay in swss when executing a GCU jsonChange. The delay causes SYSLOG ERR messages about removeLag. Code possibly related to the delay (executed before the PortChannel removal):
ERR swss#intfmgrd: :- setIntfVrf: Command '/sbin/ip link set "PortChannel0005" nomaster' failed with rc 1
https://github.com/Azure/sonic-swss/blob/master/cfgmgr/intfmgr.cpp#L136-L154
ERR swss#orchagent: :- removeLag: Failed to remove ref count 1 LAG PortChannel0005
https://github.com/Azure/sonic-swss/blob/master/orchagent/portsorch.cpp#L5105-L5115
See below for more details.
Steps to reproduce the issue
admin@vlab-01:~/po/test$ sudo config apply-patch tc1.json
...
Patch Applier: Applying 4 changes in order:
Patch Applier: [{"op": "add", "path": "/PORTCHANNEL/PortChannel0005", "value": {"admin_status": "up"}}]
Patch Applier: [{"op": "add", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005", "value": {}}]
Patch Applier: [{"op": "add", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005|10.0.0.64~131", "value": {}}]
Patch Applier: [{"op": "add", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005|FC00::81~1126", "value": {}}]
Patch Applier: Verifying patch updates are reflected on ConfigDB.
Patch Applier: Patch application completed.
Patch applied successfully.
admin@vlab-01:~/po/test$ cat tc1_rm.json
[
  { "op": "remove", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005|FC00::81~1126" },
  { "op": "remove", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005|10.0.0.64~131" },
  { "op": "remove", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005" },
  { "op": "remove", "path": "/PORTCHANNEL/PortChannel0005" }
]
admin@vlab-01:~/po/test$ sudo config apply-patch tc1_rm.json
...
Patch Applier: Applying 4 changes in order:
Patch Applier: [{"op": "remove", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005"}]
Patch Applier: [{"op": "remove", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005|10.0.0.64~131"}]
Patch Applier: [{"op": "remove", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005|FC00::81~1126"}]
Patch Applier: [{"op": "remove", "path": "/PORTCHANNEL/PortChannel0005"}]
Patch Applier: Verifying patch updates are reflected on ConfigDB.
Patch Applier: Patch application completed.
Patch applied successfully.
May 10 03:17:06.199268 vlab-01 ERR swss#orchagent: :- removeLag: Failed to remove ref count 1 LAG PortChannel0005
May 10 03:17:06.199325 vlab-01 ERR teamd#tlm_teamd: :- get_dump: Can't get dump for LAG 'PortChannel0005'. Skipping
May 10 03:17:07.241855 vlab-01 ERR swss#intfmgrd: :- setIntfVrf: Command '/sbin/ip link set "PortChannel0005" nomaster' failed with rc 1
May 10 03:17:07.241855 vlab-01 ERR swss#orchagent: message repeated 4 times: [ :- removeLag: Failed to remove ref count 1 LAG PortChannel0005]
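To keep these messages in an ignore list, each one needs a pattern that also covers the `message repeated N times` form. The snippet below is a minimal sketch of such matching in plain Python; the patterns and the `is_ignored` helper are assumptions written for illustration and are not taken from the actual Log Analyzer configuration.

```python
import re

# Hypothetical ignore patterns for the three messages seen in the syslog.
# They deliberately omit the "ERR swss#orchagent:" prefix so that the
# "message repeated 4 times: [ ... ]" variant also matches.
IGNORE_PATTERNS = [
    re.compile(r"removeLag: Failed to remove ref count \d+ LAG PortChannel\d+"),
    re.compile(r"get_dump: Can't get dump for LAG 'PortChannel\d+'"),
    re.compile(
        r'''setIntfVrf: Command '/sbin/ip link set "PortChannel\d+" nomaster' failed with rc \d+'''
    ),
]


def is_ignored(line):
    """Return True if the syslog line matches any known-benign pattern."""
    return any(pattern.search(line) for pattern in IGNORE_PATTERNS)
```

Matching on the message body rather than the full line keeps the patterns independent of timestamp and hostname.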
admin@vlab-01:~/po/test$ cat tc1_part1.json
[
  { "op": "remove", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005|FC00::81~1126" },
  { "op": "remove", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005|10.0.0.64~131" },
  { "op": "remove", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005" }
]
admin@vlab-01:~/po/test$ cat tc1_part2.json
[
  { "op": "remove", "path": "/PORTCHANNEL/PortChannel0005" }
]
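The two files above contain the same operations as `tc1_rm.json`, with the `/PORTCHANNEL/` removal separated from the `/PORTCHANNEL_INTERFACE/` removals. A split like this could be produced programmatically, roughly as sketched below. The `split_patch` function and its `final_prefix` parameter are hypothetical helpers for illustration, not part of GCU.

```python
import json


def split_patch(ops, final_prefix="/PORTCHANNEL/"):
    """Split JSON Patch operations into (first, last), where `last`
    holds the operations whose path starts with `final_prefix`."""
    first = [op for op in ops if not op["path"].startswith(final_prefix)]
    last = [op for op in ops if op["path"].startswith(final_prefix)]
    return first, last


TC1_RM = json.loads("""
[
  { "op": "remove", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005|FC00::81~1126" },
  { "op": "remove", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005|10.0.0.64~131" },
  { "op": "remove", "path": "/PORTCHANNEL_INTERFACE/PortChannel0005" },
  { "op": "remove", "path": "/PORTCHANNEL/PortChannel0005" }
]
""")

part1, part2 = split_patch(TC1_RM)
print(json.dumps(part1, indent=2))  # interface and IP removals
print(json.dumps(part2, indent=2))  # PortChannel removal only
```

Note that `/PORTCHANNEL_INTERFACE/...` does not start with `/PORTCHANNEL/`, so a plain prefix check is enough to tell the two tables apart.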
admin@vlab-01:~/po/test$ show ver
SONiC Software Version: SONiC.master-10763.96436-aa5cdcc51
Distribution: Debian 11.3
Kernel: 5.10.0-8-2-amd64
Build commit: aa5cdcc51
Build date: Fri May 6 06:25:04 UTC 2022
Built by: AzDevOps@sonic-build-workers-001HFQ

Platform: x86_64-kvm_x86_64-r0
HwSKU: Force10-S6000
ASIC: vs
ASIC Count: 1
Serial Number: N/A
Model Number: N/A
Hardware Revision: N/A
Uptime: 03:22:25 up 23:34, 2 users, load average: 0.07, 0.16, 0.17
Date: Tue 10 May 2022 03:22:25