Open ccy opened 1 year ago
I am trying mclag with 2 mellanox SN2410 switch. I build the sonic firmware using master branch of sonic-buildimage with
INCLUDE_ICCPD = y
.Switch 1 mclag configuration:
# Enable iccpd feature config feature state iccpd enabled # Peer link: Port Channel config portchannel add PortChannel2001 config portchannel member add PortChannel2001 Ethernet192 config portchannel member add PortChannel2001 Ethernet196 config switchport mode trunk PortChannel2001 # Peer Link: VLAN config vlan add 10 config vlan member add 10 PortChannel2001 # MCLAG domain config mclag add 1 192.168.10.1 192.168.10.2 PortChannel2001 config mclag unique-ip add Vlan10 config interface ip add Vlan10 192.168.10.1/24 config vlan add 100 config mclag unique-ip add Vlan100 config vlan member add 100 PortChannel2001 config interface ip add Vlan100 192.168.100.1/24 # Create mlag port channel interface ID=184 config portchannel add PortChannel$ID config portchannel member add PortChannel$ID Ethernet$ID config switchport mode trunk PortChannel$ID config mclag member add 1 PortChannel$ID config vlan member add 100 PortChannel$ID
Switch 2 mclag configuration:
# Enable iccpd feature config feature state iccpd enabled # Peer link: Port Channel config portchannel add PortChannel2001 config portchannel member add PortChannel2001 Ethernet192 config portchannel member add PortChannel2001 Ethernet196 config switchport mode trunk PortChannel2001 # Peer Link: VLAN config vlan add 10 config vlan member add 10 PortChannel2001 # MCLAG domain config mclag add 1 192.168.10.2 192.168.10.1 PortChannel2001 config mclag unique-ip add Vlan10 config interface ip add Vlan10 192.168.10.2/24 config vlan add 100 config mclag unique-ip add Vlan100 config vlan member add 100 PortChannel2001 config interface ip add Vlan100 192.168.100.2/24 # Create mlag port channel interface ID=184 config portchannel add PortChannel$ID config portchannel member add PortChannel$ID Ethernet$ID config switchport mode trunk PortChannel$ID config mclag member add 1 PortChannel$ID config vlan member add 100 PortChannel$ID
Runtime information for switch 1:
$ show int portchannel Flags: A - active, I - inactive, Up - up, Dw - Down, N/A - not available, S - selected, D - deselected, * - not synced No. Team Dev Protocol Ports ----- --------------- ----------- ----------------------------- 184 PortChannel184 LACP(A)(Up) Ethernet184(S) 2001 PortChannel2001 LACP(A)(Up) Ethernet192(S) Ethernet196(S) $ mclagdctl dump state The MCLAG's keepalive is: OK MCLAG info sync is: completed Domain id: 1 Local Ip: 192.168.10.1 Peer Ip: 192.168.10.2 Peer Link Interface: PortChannel2001 Keepalive time: 1 sesssion Timeout : 15 Peer Link Mac: 04:3f:72:2f:38:00 Role: Active MCLAG Interface: PortChannel184 Loglevel: NOTICE $ mclagdctl dump mac -i 1 TYPE: S-STATIC, D-DYNAMIC; AGE: L-Local age, P-Peer age No. TYPE MAC VID DEV ORIGIN-DEV AGE 1 S 50:6b:4b:9b:82:c0 10 PortChannel2001 PortChannel2001 L 2 S 50:6b:4b:9b:82:c0 100 PortChannel2001 PortChannel2001 L 3 D 64:9d:99:17:cb:2c 100 PortChannel184 PortChannel184 P $ show vlan brief +-----------+------------------+-----------------+----------------+-------------+-----------------------+ | VLAN ID | IP Address | Ports | Port Tagging | Proxy ARP | DHCP Helper Address | +===========+==================+=================+================+=============+=======================+ | 10 | 192.168.10.1/24 | PortChannel2001 | tagged | disabled | | +-----------+------------------+-----------------+----------------+-------------+-----------------------+ | 100 | 192.168.100.1/24 | PortChannel184 | tagged | disabled | | | | | PortChannel2001 | tagged | | | +-----------+------------------+-----------------+----------------+-------------+-----------------------+ $ show mac No. Vlan MacAddress Port Type ----- ------ ----------------- --------------- ------- 1 10 50:6B:4B:9B:82:C0 PortChannel2001 Static 2 100 64:9D:99:17:CB:2C PortChannel184 Dynamic 3 100 50:6B:4B:9B:82:C0 PortChannel2001 Static
Runtime information for switch 2
$ show int portchannel Flags: A - active, I - inactive, Up - up, Dw - Down, N/A - not available, S - selected, D - deselected, * - not synced No. Team Dev Protocol Ports ----- --------------- ----------- ----------------------------- 176 PortChannel176 LACP(A)(Dw) Ethernet176(S*) 184 PortChannel184 LACP(A)(Up) Ethernet184(S) 2001 PortChannel2001 LACP(A)(Up) Ethernet196(S) Ethernet192(S) $ mclagdctl dump state The MCLAG's keepalive is: OK MCLAG info sync is: completed Domain id: 1 Local Ip: 192.168.10.2 Peer Ip: 192.168.10.1 Peer Link Interface: PortChannel2001 Keepalive time: 1 sesssion Timeout : 15 Peer Link Mac: 50:6b:4b:9b:82:c0 Role: Standby MCLAG Interface: PortChannel184 Loglevel: NOTICE $ mclagdctl dump mac -i 1 TYPE: S-STATIC, D-DYNAMIC; AGE: L-Local age, P-Peer age No. TYPE MAC VID DEV ORIGIN-DEV AGE 1 S 04:3f:72:2f:38:00 10 PortChannel2001 PortChannel2001 L 2 S 04:3f:72:2f:38:00 100 PortChannel2001 PortChannel2001 L 3 D 64:9d:99:17:cb:2c 100 PortChannel184 PortChannel184 L $ show vlan brief +-----------+------------------+-----------------+----------------+-------------+-----------------------+ | VLAN ID | IP Address | Ports | Port Tagging | Proxy ARP | DHCP Helper Address | +===========+==================+=================+================+=============+=======================+ | 10 | 192.168.10.2/24 | PortChannel2001 | tagged | disabled | | +-----------+------------------+-----------------+----------------+-------------+-----------------------+ | 100 | 192.168.100.2/24 | PortChannel184 | tagged | disabled | | | | | PortChannel2001 | tagged | | | +-----------+------------------+-----------------+----------------+-------------+-----------------------+ $ show mac No. Vlan MacAddress Port Type ----- ------ ----------------- --------------- ------ 1 10 04:3F:72:2F:38:00 PortChannel2001 Static 2 100 04:3F:72:2F:38:00 PortChannel2001 Static 3 100 64:9D:99:17:CB:2C PortChannel184 Static Total number of entries 3
Above states show switch-1 role is Active and switch-2 role is Standby. I assume my mc-lag configuration is correct at this moment.
Next I try to shutdown interface
Ethernet184
of switch-2:config int shutdown Ethernet184
After a while, few docker container crash and restarted. The mclag peering connections crash too. Those crash container/service will back online after some time.
This is the log captured during the crashing:
Sep 21 12:44:49.392405 sonic WARNING swss#orchagent: :- parsePortConfig: Unknown field(mode): skipping ... Sep 21 12:44:49.392405 sonic INFO lldp#lldpmgrd[36]: Unable to retrieve description for port 'Ethernet184'. Not adding port description Sep 21 12:44:49.398965 sonic NOTICE swss#portmgrd: :- doTask: Configure Ethernet184 MTU to 9100 Sep 21 12:44:49.406885 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_newlink.NOTICE] Update local port Ethernet184 state down Sep 21 12:44:49.406885 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_deladdr.NOTICE] l3_proto 0, ifname Ethernet184 index 1689 is_vrf 0 address fe80::63f:72ff:fe2f:3800 Sep 21 12:44:49.406885 sonic NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet184 admin:0 oper:0 addr:04:3f:72:2f:38:00 ifindex:1689 master:1601 type:sx_netdev Sep 21 12:44:49.406885 sonic NOTICE swss#portmgrd: :- doTask: Configure Ethernet184 admin status to down Sep 21 12:44:49.406885 sonic NOTICE swss#portsyncd: :- onMsg: Publish Ethernet184(ok:down) to state db Sep 21 12:44:49.406885 sonic NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet184 admin:0 oper:0 addr:04:3f:72:2f:38:00 ifindex:1689 master:1601 type:sx_netdev Sep 21 12:44:49.415814 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_newlink.NOTICE] Update local port PortChannel184 state down Sep 21 12:44:49.416580 sonic INFO kernel: [94358.277233] Bridge: port 3(PortChannel184) entered disabled state Sep 21 12:44:49.424011 sonic WARNING swss#orchagent: message repeated 2 times: [ :- parsePortConfig: Unknown field(mode): skipping ...] Sep 21 12:44:49.424011 sonic NOTICE swss#orchagent: :- setPortAdminStatus: Set admin status DOWN host_tx_ready to false for port Ethernet184 Sep 21 12:44:49.424011 sonic NOTICE iccpd#mclagsyncd: :- setFdbEntry: add fdb entry into ASIC_DB:key =Vlan100:64:9d:99:17:cb:2c, type =dynamic Sep 21 12:44:49.431302 sonic NOTICE swss#portsyncd: :- onMsg: Publish Ethernet184(ok:down) to state db Sep 21 12:44:49.431302 sonic NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:Ethernet184 admin:0 oper:0 addr:04:3f:72:2f:38:00 ifindex:1689 master:1601 type:sx_netdev Sep 21 12:44:49.431302 sonic NOTICE iccpd#mclagsyncd: :- mclagsyncdSetTrafficDisable: Set traffic disable for PortChannel184 Sep 21 12:44:49.431302 sonic NOTICE syncd#SDK: [SAI_UTILS.NOTICE] ./src/mlnx_sai_utils.c[1911]- set_dispatch_attrib_handler: Set ADMIN_STATE, key:PORT [OID:0x1000F00000001] [log_port:0x1000F], val:false Sep 21 12:44:49.437820 sonic NOTICE swss#portsyncd: :- onMsg: Publish Ethernet184(ok:down) to state db Sep 21 12:44:49.437820 sonic NOTICE syncd#SDK: [SAI_SWITCH.NOTICE] ./src/mlnx_sai_switch.c[6689]- event_thread_func: Port 1000f changed state to down Sep 21 12:44:49.437820 sonic NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:PortChannel184 admin:1 oper:0 addr:04:3f:72:2f:38:00 ifindex:1601 master:1578 Sep 21 12:44:49.437820 sonic NOTICE swss#portsyncd: :- onMsg: Cannot find PortChannel184 in port table Sep 21 12:44:49.437820 sonic NOTICE swss#portsyncd: :- onMsg: nlmsg type:16 key:PortChannel184 admin:1 oper:0 addr:04:3f:72:2f:38:00 ifindex:1601 master:1578 type:team Sep 21 12:44:49.440725 sonic NOTICE kernel: [94358.300817] Called for logical port - 1000F status DOWN Sep 21 12:44:49.441977 sonic NOTICE bgp#fpmsyncd: message repeated 2 times: [ :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fe80::/64 :: eth0 ] Sep 21 12:44:49.441977 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: 0.0.0.0/0 192.168.0.1 eth0 Sep 21 12:44:49.441977 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: ::/0 fe80::21b:21ff:febe:2f0 eth0 Sep 21 12:44:49.441977 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fd00::/80 :: docker0 Sep 21 12:44:49.441977 sonic NOTICE swss#orchagent: :- doPortTask: Set port Ethernet184 admin status to down Sep 21 12:44:49.441977 sonic NOTICE swss#orchagent: :- doTask: Get port state change notification id:1000000000408 status:2 Sep 21 12:44:49.441977 sonic NOTICE swss#orchagent: :- updatePortOperStatus: Port Ethernet184 oper state set from up to down Sep 21 12:44:49.441977 sonic NOTICE syncd#SDK: [SAI_UTILS.NOTICE] ./src/mlnx_sai_utils.c[1911]- set_dispatch_attrib_handler: Set OPER_STATUS, key:HOSTIF [OID:0x2E0000000D] [hostif_db[46]], val:false Sep 21 12:44:49.446311 sonic INFO lldp#lldpmgrd[36]: Unable to retrieve description for port 'Ethernet184'. Not adding port description Sep 21 12:44:49.446311 sonic INFO lldp#lldpmgrd[36]: port Ethernet184 is not up, continue Sep 21 12:44:49.446311 sonic INFO lldp#lldpmgrd[36]: Unable to retrieve description for port 'Ethernet184'. Not adding port description Sep 21 12:44:49.452424 sonic INFO lldp#lldpmgrd[36]: port Ethernet184 is not up, continue Sep 21 12:44:49.452651 sonic INFO lldp#lldpmgrd[36]: Unable to retrieve description for port 'Ethernet184'. Not adding port description Sep 21 12:44:49.452827 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fe80::/64 :: eth0 Sep 21 12:44:49.453029 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: 0.0.0.0/0 192.168.0.1 eth0 Sep 21 12:44:49.453185 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: ::/0 fe80::21b:21ff:febe:2f0 eth0 Sep 21 12:44:49.453354 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fd00::/80 :: docker0 Sep 21 12:44:49.461411 sonic NOTICE swss#orchagent: :- setHostIntfsOperStatus: Set operation status DOWN to host interface Ethernet184 Sep 21 12:44:49.461411 sonic NOTICE swss#orchagent: :- publish: EVENT_PUBLISHED: {"sonic-events-swss:if-state":{"ifname":"Ethernet184","status":"down","timestamp":"2023-09-21T12:44:49.450308Z"}} Sep 21 12:44:49.461411 sonic WARNING swss#orchagent: :- flushFDBEntries: Couldn't flush FDB. Bridge port OID: 0x0 bvid:0, Sep 21 12:44:49.461411 sonic INFO lldp#lldpmgrd[36]: port Ethernet184 is not up, continue Sep 21 12:44:49.461411 sonic INFO lldp#lldpmgrd[36]: Unable to retrieve description for port 'Ethernet184'. Not adding port description Sep 21 12:44:49.461411 sonic INFO lldp#lldpmgrd[36]: port Ethernet184 is not up, continue Sep 21 12:44:49.461411 sonic INFO lldp#lldpmgrd[36]: Unable to retrieve description for port 'Ethernet184'. Not adding port description Sep 21 12:44:49.461411 sonic INFO lldp#lldpmgrd[36]: port Ethernet184 is not up, continue Sep 21 12:44:49.461411 sonic NOTICE syncd#SDK: [SAI_UTILS.NOTICE] ./src/mlnx_sai_utils.c[1911]- set_dispatch_attrib_handler: Set EGRESS_DISABLE, key:LAG_MEMBER [OID:0x1000F0001001B] [log_port:0x1000F, lag_log_id:0x1], val:true Sep 21 12:44:49.482347 sonic NOTICE swss#orchagent: :- setDistributionOnLagMember: Disable distribution on LAG member Ethernet184 Sep 21 12:44:49.482347 sonic NOTICE syncd#SDK: [SAI_UTILS.NOTICE] ./src/mlnx_sai_utils.c[1911]- set_dispatch_attrib_handler: Set INGRESS_DISABLE, key:LAG_MEMBER [OID:0x1000F0001001B] [log_port:0x1000F, lag_log_id:0x1], val:true Sep 21 12:44:49.482347 sonic NOTICE swss#orchagent: :- setCollectionOnLagMember: Disable collection on LAG member Ethernet184 Sep 21 12:44:49.489985 sonic NOTICE swss#orchagent: :- updatePortOperStatus: Port PortChannel184 oper state set from up to down Sep 21 12:44:49.489985 sonic NOTICE swss#orchagent: :- flushFdbEntries: flush key: SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000, fields: 2 Sep 21 12:44:49.489985 sonic NOTICE swss#orchagent: :- recordFlushFdbEntries: flush key: SAI_OBJECT_TYPE_FDB_FLUSH:oid:0x21000000000000, fields: 2 Sep 21 12:44:49.508641 sonic NOTICE syncd#SDK: [SAI_FDB.NOTICE] ./src/mlnx_sai_fdb.c[1425]- mlnx_flush_fdb_entries: Create FDB_FLUSH: #0 BRIDGE_PORT_ID=BRIDGE_PORT [OID:0x10000003A] [bridge_ports_db[1]] #1 ENTRY_TYPE=DYNAMIC Sep 21 12:44:49.514402 sonic NOTICE syncd#SDK: :- processFdbFlush: fdb flush succeeded, updating redis database Sep 21 12:44:49.514402 sonic NOTICE syncd#SDK: :- processFlushEvent: received a flush port fdb event, portVid = oid:0x3a0000000009b8, bvId = oid:0x0 Sep 21 12:44:49.514402 sonic NOTICE syncd#SDK: :- processFlushEvent: pattern ASIC_STATE:SAI_OBJECT_TYPE_FDB_ENTRY:*, portStr oid:0x3a0000000009b8 Sep 21 12:44:49.514402 sonic NOTICE swss#orchagent: :- meta_sai_on_fdb_flush_event_consolidated: processing consolidated fdb flush event of type: SAI_FDB_ENTRY_TYPE_DYNAMIC Sep 21 12:44:49.514402 sonic NOTICE swss#orchagent: :- meta_sai_on_fdb_flush_event_consolidated: fdb flush took 0.000468 sec Sep 21 12:44:49.514402 sonic NOTICE syncd#SDK: :- handle_fdb_event: got fdb flush event: [{"fdb_entry":"{\"bvid\":\"oid:0x0\",\"mac\":\"00:00:00:00:00:00\",\"switch_id\":\"oid:0x100000021\"}","fdb_event":"SAI_FDB_EVENT_FLUSHED","list":[{"id":"SAI_FDB_ENTRY_ATTR_BRIDGE_PORT_ID","value":"oid:0x10000003a"},{"id":"SAI_FDB_ENTRY_ATTR_TYPE","value":"SAI_FDB_ENTRY_TYPE_DYNAMIC"},{"id":"SAI_FDB_ENTRY_ATTR_PACKET_ACTION","value":"SAI_PACKET_ACTION_FORWARD"}]}] Sep 21 12:44:49.514402 sonic NOTICE syncd#SDK: :- processFlushEvent: received a flush port fdb event, portVid = oid:0x3a0000000009b8, bvId = oid:0x0 Sep 21 12:44:49.514402 sonic NOTICE syncd#SDK: :- processFlushEvent: pattern ASIC_STATE:SAI_OBJECT_TYPE_FDB_ENTRY:*, portStr oid:0x3a0000000009b8 Sep 21 12:44:49.599313 sonic NOTICE syncd#SDK: [SAI_UTILS.NOTICE] ./src/mlnx_sai_utils.c[1911]- set_dispatch_attrib_handler: Set PACKET_ACTION, key:ROUTE_ENTRY :: ::, val:DROP Sep 21 12:44:49.599313 sonic NOTICE syncd#SDK: [SAI_UTILS.NOTICE] ./src/mlnx_sai_utils.c[1911]- set_dispatch_attrib_handler: Set PACKET_ACTION, key:ROUTE_ENTRY 0.0.0.0 0.0.0.0, val:DROP Sep 21 12:44:49.599313 sonic NOTICE syncd#SDK: [SAI_UTILS.NOTICE] ./src/mlnx_sai_utils.c[2319]- mlnx_bulk_statuses_print: [BULK_SET] 4 Routes: 4 success, 0 not executed, 0 failed Sep 21 12:44:49.608331 sonic NOTICE swss#orchagent: :- meta_sai_on_fdb_flush_event_consolidated: processing consolidated fdb flush event of type: SAI_FDB_ENTRY_TYPE_DYNAMIC Sep 21 12:44:49.608331 sonic NOTICE swss#orchagent: :- meta_sai_on_fdb_flush_event_consolidated: fdb flush took 0.000339 sec Sep 21 12:44:49.636590 sonic NOTICE syncd#SDK: [SAI_UTILS.NOTICE] ./src/mlnx_sai_utils.c[1911]- set_dispatch_attrib_handler: Set TYPE, key:FDB_ENTRY [64:9d:99:17:cb:2c] [BV_ID:0x6400000026], val:STATIC Sep 21 12:44:49.637124 sonic ERR syncd#SDK: [SAI_FDB.ERR] ./src/mlnx_sai_fdb.c[446]- mlnx_get_mac: Failed to get 0 fdb entries Entry Not Found. Sep 21 12:44:49.637483 sonic ERR syncd#SDK: [SAI_UTILS.ERR] ./src/mlnx_sai_utils.c[1914]- set_dispatch_attrib_handler: Failed Set TYPE, key:FDB_ENTRY [64:9d:99:17:cb:2c] [BV_ID:0x6400000026], val:STATIC Sep 21 12:44:49.637787 sonic ERR syncd#SDK: [SAI_UTILS.ERR] ./src/mlnx_sai_utils.c[2134]- sai_set_attribute: Failed to set the attribute. Sep 21 12:44:49.638516 sonic ERR syncd#SDK: :- sendApiResponse: api SAI_COMMON_API_SET failed in syncd mode: SAI_STATUS_ITEM_NOT_FOUND Sep 21 12:44:49.638516 sonic ERR syncd#SDK: :- processQuadEvent: attr: SAI_FDB_ENTRY_ATTR_TYPE: SAI_FDB_ENTRY_TYPE_STATIC Sep 21 12:44:49.639377 sonic ERR swss#orchagent: :- set: set status: SAI_STATUS_ITEM_NOT_FOUND Sep 21 12:44:49.640152 sonic ERR swss#orchagent: :- addFdbEntry: macUpdate-Failed for attr.id=0x0 for FDB 64:9d:99:17:cb:2c in Vlan100 on PortChannel2001, rv:-7 Sep 21 12:44:49.640152 sonic ERR swss#orchagent: :- handleSaiSetStatus: Encountered failure in set operation, exiting orchagent, SAI API: SAI_API_FDB, status: SAI_STATUS_ITEM_NOT_FOUND Sep 21 12:44:49.640152 sonic NOTICE swss#orchagent: :- notifySyncd: sending syncd: SYNCD_INVOKE_DUMP Sep 21 12:44:49.640976 sonic NOTICE syncd#SDK: :- processNotifySyncd: Invoking SAI failure dump Sep 21 12:44:49.678566 sonic INFO syncd#SDK: [LOG.INFO] Initializing SX log with user callback function. Sep 21 12:44:49.690028 sonic NOTICE syncd#SDK: [DBG.NOTICE] Started DEBUG-DUMP Sep 21 12:44:49.690996 sonic NOTICE syncd#SDK: [EMAD.NOTICE] setting emad-timeout to 1000000 Sep 21 12:44:50.050560 sonic NOTICE syncd#SDK: :- threadFunction: time span 409 ms for 'notify:SYNCD_INVOKE_DUMP' Sep 21 12:44:51.050690 sonic NOTICE syncd#SDK: :- threadFunction: time span 1409 ms for 'notify:SYNCD_INVOKE_DUMP' Sep 21 12:44:51.696899 sonic INFO systemd[1]: run-docker-runtime\x2drunc-moby-b08e3c24fd0869e6085e9476afff40e4ae88f12512be34edacf1be374a081168-runc.NTc3tx.mount: Succeeded. Sep 21 12:44:52.050877 sonic NOTICE syncd#SDK: :- threadFunction: time span 2409 ms for 'notify:SYNCD_INVOKE_DUMP' Sep 21 12:44:53.051192 sonic NOTICE syncd#SDK: :- threadFunction: time span 3409 ms for 'notify:SYNCD_INVOKE_DUMP' Sep 21 12:44:53.292297 sonic NOTICE syncd#SDK: [EMAD.NOTICE] setting emad-timeout to 20000000 Sep 21 12:44:53.292498 sonic NOTICE syncd#SDK: [DBG.NOTICE] Finished DEBUG-DUMP Sep 21 12:44:54.051070 sonic NOTICE syncd#SDK: :- threadFunction: time span 4410 ms for 'notify:SYNCD_INVOKE_DUMP' Sep 21 12:44:54.939369 sonic NOTICE syncd#SDK: [COMMAND_IFC.NOTICE] COMMAND IFC: COMMAND IFC INIT DONE Sep 21 12:44:54.951600 sonic NOTICE syncd#SDK: [ACCESS_REG_INIT.NOTICE] ACCESS REG: ACCESS REG INIT DONE Sep 21 12:44:55.027014 sonic NOTICE syncd#SDK: [SX_API_DBG.NOTICE] Generating 3 CR-Space dumps Sep 21 12:44:55.027444 sonic NOTICE syncd#SDK: [SX_API_DBG.NOTICE] FW dump using temporary file (/var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/sdkdump_ext_meta_001-21_09_2023-04_44_55-26845iicobztbn.tmp). Sep 21 12:44:55.027768 sonic NOTICE syncd#SDK: [SX_API_DBG.NOTICE] FW dump using temporary file (/var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/sdkdump_ext_cr_001-21_09_2023-04_44_55-26845.udmprvqwkoqsm.tmp). Sep 21 12:44:55.028048 sonic NOTICE syncd#SDK: [SXD_FW_DBG.NOTICE] sxd_fw_dbg_extract FW instruction structure for type:[8] not yet initialized, trying to initialize it now Sep 21 12:44:55.028406 sonic NOTICE syncd#SDK: [SXD_FW_DBG.NOTICE] file:[Spectrum.csv] lines:[22359] Sep 21 12:44:55.031503 sonic NOTICE syncd#SDK: [SXD_FW_DBG.NOTICE] file:[Spectrum.csv] parsing done Sep 21 12:44:55.051145 sonic NOTICE syncd#SDK: :- threadFunction: time span 5410 ms for 'notify:SYNCD_INVOKE_DUMP' Sep 21 12:44:55.368815 sonic NOTICE syncd#SDK: [SX_API_DBG.NOTICE] SDK extra dump meta file saved to:[/var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/sdkdump_ext_meta_001-21_09_2023-04_44_55-26845] Sep 21 12:44:55.369248 sonic NOTICE syncd#SDK: [SX_API_DBG.NOTICE] SDK extra dump file CR saved to:[/var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/sdkdump_ext_cr_001-21_09_2023-04_44_55-26845.udmp] Sep 21 12:44:55.369566 sonic NOTICE syncd#SDK: [SX_API_DBG.NOTICE] FW dump using temporary file (/var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/sdkdump_ext_meta_001-21_09_2023-04_44_55-368629wcpsgesme.tmp). Sep 21 12:44:55.369830 sonic NOTICE syncd#SDK: [SX_API_DBG.NOTICE] FW dump using temporary file (/var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/sdkdump_ext_cr_001-21_09_2023-04_44_55-368629.udmprprnbarqb.tmp). Sep 21 12:44:55.370084 sonic NOTICE syncd#SDK: [SXD_FW_DBG.NOTICE] sxd_fw_dbg_extract FW instruction structure for type:[8] not yet initialized, trying to initialize it now Sep 21 12:44:55.370329 sonic NOTICE syncd#SDK: [SXD_FW_DBG.NOTICE] file:[Spectrum.csv] lines:[22359] Sep 21 12:44:55.373346 sonic NOTICE syncd#SDK: [SXD_FW_DBG.NOTICE] file:[Spectrum.csv] parsing done Sep 21 12:44:55.716589 sonic NOTICE syncd#SDK: [SX_API_DBG.NOTICE] SDK extra dump meta file saved to:[/var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/sdkdump_ext_meta_001-21_09_2023-04_44_55-368629] Sep 21 12:44:55.717037 sonic NOTICE syncd#SDK: [SX_API_DBG.NOTICE] SDK extra dump file CR saved to:[/var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/sdkdump_ext_cr_001-21_09_2023-04_44_55-368629.udmp] Sep 21 12:44:55.717376 sonic NOTICE syncd#SDK: [SX_API_DBG.NOTICE] FW dump using temporary file (/var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/sdkdump_ext_meta_001-21_09_2023-04_44_55-716450cunjkhyeo.tmp). Sep 21 12:44:55.717687 sonic NOTICE syncd#SDK: [SX_API_DBG.NOTICE] FW dump using temporary file (/var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/sdkdump_ext_cr_001-21_09_2023-04_44_55-716450.udmpajsfiuuba.tmp). Sep 21 12:44:55.717956 sonic NOTICE syncd#SDK: [SXD_FW_DBG.NOTICE] sxd_fw_dbg_extract FW instruction structure for type:[8] not yet initialized, trying to initialize it now Sep 21 12:44:55.718200 sonic NOTICE syncd#SDK: [SXD_FW_DBG.NOTICE] file:[Spectrum.csv] lines:[22359] Sep 21 12:44:55.721153 sonic NOTICE syncd#SDK: [SXD_FW_DBG.NOTICE] file:[Spectrum.csv] parsing done Sep 21 12:44:56.057476 sonic NOTICE syncd#SDK: :- threadFunction: time span 6416 ms for 'notify:SYNCD_INVOKE_DUMP' Sep 21 12:44:56.072584 sonic NOTICE syncd#SDK: [SX_API_DBG.NOTICE] SDK extra dump meta file saved to:[/var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/sdkdump_ext_meta_001-21_09_2023-04_44_55-716450] Sep 21 12:44:56.073155 sonic NOTICE syncd#SDK: [SX_API_DBG.NOTICE] SDK extra dump file CR saved to:[/var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/sdkdump_ext_cr_001-21_09_2023-04_44_55-716450.udmp] Sep 21 12:44:56.073761 sonic NOTICE syncd#SDK: [SX_API_DBG_AMBER.NOTICE] AMBER Dump: Starting dump process to file /var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/amber_dump_001-21_09_2023-04_44_56.hex Sep 21 12:44:56.074267 sonic NOTICE syncd#SDK: [EMAD.NOTICE] setting emad-timeout to 1000000 Sep 21 12:44:57.057652 sonic NOTICE syncd#SDK: :- threadFunction: time span 7416 ms for 'notify:SYNCD_INVOKE_DUMP' Sep 21 12:44:57.245762 sonic NOTICE syncd#SDK: [EMAD.NOTICE] setting emad-timeout to 20000000 Sep 21 12:44:57.248532 sonic NOTICE syncd#SDK: [SAI_SWITCH.NOTICE] ./src/mlnx_sai_switch.c[5444]- mlnx_switch_dfw_thread_func: DFW thread got completed async dump. Sep 21 12:44:57.249065 sonic NOTICE syncd#SDK: :- main: The SAI dump is generated to /var/log/sai_failure_dump/sai_sdk_dump_09_21_2023_04_44_AM/sai_sdk_dump_09_21_2023_04_44_AM . Sep 21 12:44:57.488669 sonic INFO syncd#supervisord 2023-09-21 04:44:57,487 INFO reaped unknown pid 608 (exit status 0) Sep 21 12:44:58.057743 sonic NOTICE syncd#SDK: :- threadFunction: time span 8416 ms for 'notify:SYNCD_INVOKE_DUMP' Sep 21 12:44:58.856789 sonic INFO systemd[1]: run-docker-runtime\x2drunc-moby-b08e3c24fd0869e6085e9476afff40e4ae88f12512be34edacf1be374a081168-runc.bqhVH7.mount: Succeeded. Sep 21 12:44:59.057870 sonic NOTICE syncd#SDK: :- threadFunction: time span 9416 ms for 'notify:SYNCD_INVOKE_DUMP' Sep 21 12:44:59.157233 sonic NOTICE swss#orchagent: :- sai_redis_notify_syncd: invoked DUMP succeeded Sep 21 12:45:01.453593 sonic INFO swss#supervisord 2023-09-21 12:45:01,452 INFO exited: orchagent (terminated by SIGABRT (core dumped); not expected) Sep 21 12:45:02.460844 sonic INFO swss#supervisor-proc-exit-listener: Process 'orchagent' exited unexpectedly. Terminating supervisor 'swss' Sep 21 12:45:02.462097 sonic NOTICE swss#supervisor-proc-exit-listener: :- publish: EVENT_PUBLISHED: {"sonic-events-host:process-exited-unexpectedly":{"ctr_name":"swss","process_name":"orchagent","timestamp":"2023-09-21T12:45:02.461532Z"}} Sep 21 12:45:02.465331 sonic INFO swss#supervisord 2023-09-21 12:45:02,464 WARN received SIGTERM indicating exit request Sep 21 12:45:02.466432 sonic INFO swss#supervisord 2023-09-21 12:45:02,465 INFO waiting for supervisor-proc-exit-listener, rsyslogd, portsyncd, coppmgrd, arp_update, ndppd, neighsyncd, vlanmgrd, intfmgrd, portmgrd, buffermgrd, vrfmgrd, nbrmgrd, vxlanmgrd, fdbsyncd, tunnelmgrd, containercfgd to die Sep 21 12:45:02.509755 sonic INFO swss#supervisord 2023-09-21 12:45:02,508 INFO stopped: containercfgd (exit status 143) Sep 21 12:45:02.512769 sonic INFO swss#supervisord 2023-09-21 12:45:02,511 INFO stopped: tunnelmgrd (terminated by SIGTERM) Sep 21 12:45:02.521225 sonic INFO swss#supervisord 2023-09-21 12:45:02,518 INFO stopped: fdbsyncd (terminated by SIGTERM) Sep 21 12:45:02.528158 sonic INFO swss#supervisord 2023-09-21 12:45:02,526 INFO stopped: vxlanmgrd (terminated by SIGTERM) Sep 21 12:45:02.543398 sonic INFO swss#supervisord 2023-09-21 12:45:02,538 INFO stopped: nbrmgrd (terminated by SIGTERM) Sep 21 12:45:02.549870 sonic INFO swss#supervisord 2023-09-21 12:45:02,547 INFO stopped: vrfmgrd (terminated by SIGTERM) Sep 21 12:45:03.555502 sonic INFO swss#supervisord 2023-09-21 12:45:03,554 INFO stopped: buffermgrd (terminated by SIGTERM) Sep 21 12:45:03.558365 sonic INFO swss#supervisord 2023-09-21 12:45:03,557 INFO stopped: portmgrd (terminated by SIGTERM) Sep 21 12:45:03.569065 sonic INFO swss#supervisord 2023-09-21 12:45:03,566 INFO stopped: intfmgrd (terminated by SIGTERM) Sep 21 12:45:03.573656 sonic INFO swss#supervisord 2023-09-21 12:45:03,572 INFO stopped: vlanmgrd (terminated by SIGTERM) Sep 21 12:45:04.579073 sonic INFO swss#supervisord 2023-09-21 12:45:04,575 INFO stopped: neighsyncd (terminated by SIGTERM) Sep 21 12:45:04.579073 sonic INFO swss#supervisord: orchagent Sep 21 12:45:04.579073 sonic INFO swss#supervisord: ndppd (error) Shutting down... Sep 21 12:45:04.579073 sonic INFO swss#supervisord: ndppd (notice) Bye Sep 21 12:45:04.579073 sonic INFO swss#supervisord 2023-09-21 12:45:04,577 INFO stopped: ndppd (exit status 0) Sep 21 12:45:05.583063 sonic INFO swss#supervisord 2023-09-21 12:45:05,580 INFO stopped: arp_update (terminated by SIGTERM) Sep 21 12:45:05.585256 sonic INFO swss#supervisord 2023-09-21 12:45:05,581 INFO waiting for supervisor-proc-exit-listener, rsyslogd, portsyncd, coppmgrd to die Sep 21 12:45:05.586518 sonic INFO swss#supervisord 2023-09-21 12:45:05,585 INFO stopped: coppmgrd (terminated by SIGTERM) Sep 21 12:45:06.198351 sonic INFO systemd[1]: run-docker-runtime\x2drunc-moby-b08e3c24fd0869e6085e9476afff40e4ae88f12512be34edacf1be374a081168-runc.4EMsr5.mount: Succeeded. Sep 21 12:45:07.591907 sonic INFO swss#supervisord 2023-09-21 12:45:07,591 INFO stopped: portsyncd (terminated by SIGTERM) Sep 21 12:45:08.700391 sonic INFO dockerd[589]: time="2023-09-21T12:45:08.699315085+08:00" level=info msg="ignoring event" container=fe2448156c117fd686f851337f1fd88e58e19fa0f97a22fd20bec3823cd7b4f6 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete" Sep 21 12:45:08.701799 sonic INFO containerd[391]: time="2023-09-21T12:45:08.701079351+08:00" level=info msg="shim disconnected" id=fe2448156c117fd686f851337f1fd88e58e19fa0f97a22fd20bec3823cd7b4f6 Sep 21 12:45:08.702872 sonic INFO containerd[391]: time="2023-09-21T12:45:08.702746606+08:00" level=warning msg="cleaning up after shim disconnected" id=fe2448156c117fd686f851337f1fd88e58e19fa0f97a22fd20bec3823cd7b4f6 namespace=moby Sep 21 12:45:08.704096 sonic INFO containerd[391]: time="2023-09-21T12:45:08.703963317+08:00" level=info msg="cleaning up dead shim" Sep 21 12:45:08.723691 sonic INFO containerd[391]: time="2023-09-21T12:45:08.723543714+08:00" level=warning msg="cleanup warnings time=\"2023-09-21T12:45:08+08:00\" level=info msg=\"starting signal loop\" namespace=moby pid=1180676 runtime=io.containerd.runc.v2\n" Sep 21 12:45:08.778711 sonic INFO systemd[1]: var-lib-docker-overlay2-ac413ee3a30b27f1629731ffae06e1abc8e670180db7345cdf0d02762951ee31-merged.mount: Succeeded. Sep 21 12:45:09.894299 sonic NOTICE root: Stopping swss service... Sep 21 12:45:09.906223 sonic NOTICE root: Locking /tmp/swss-syncd-lock from swss service Sep 21 12:45:09.921137 sonic NOTICE root: Locked /tmp/swss-syncd-lock (10) from swss service Sep 21 12:45:09.969771 sonic NOTICE root: Warm boot flag: swss false. Sep 21 12:45:09.999423 sonic NOTICE root: Fast boot flag: swss false. Sep 21 12:45:10.764508 sonic DEBUG container: container_stop: BEGIN Sep 21 12:45:10.767557 sonic DEBUG container: read_data: config:True feature:swss fields:[('set_owner', 'local'), ('no_fallback_to_local', False), ('state', 'disabled')] val:['local', False, 'enabled'] Sep 21 12:45:10.769056 sonic DEBUG container: read_data: config:False feature:swss fields:[('current_owner', 'none'), ('remote_state', 'none'), ('container_id', '')] val:['none', 'none', ''] Sep 21 12:45:10.770686 sonic DEBUG container: container_stop: swss: set_owner:local current_owner:none remote_state:none docker_id:swss Sep 21 12:45:10.796449 sonic INFO container: docker cmd: stop for swss Sep 21 12:45:10.797910 sonic DEBUG container: container_stop: END Sep 21 12:45:10.904813 sonic NOTICE root: Stopped swss service... Sep 21 12:45:10.938173 sonic INFO swss.sh[1180903]: 1 Sep 21 12:45:10.944386 sonic NOTICE root: Cleared PortInitDone from APPL_DB... Sep 21 12:45:10.959347 sonic NOTICE root: Clearing FAST_RESTART_ENABLE_TABLE flag... Sep 21 12:45:10.998142 sonic INFO swss.sh[1180911]: 0 Sep 21 12:45:11.005403 sonic NOTICE root: Unlocking /tmp/swss-syncd-lock (10) from swss service Sep 21 12:45:11.036538 sonic INFO systemd[1]: Stopping TEAMD container... Sep 21 12:45:11.046041 sonic NOTICE admin: Stopping teamd service... Sep 21 12:45:11.116432 sonic NOTICE admin: Warm boot flag: teamd false. Sep 21 12:45:11.130485 sonic NOTICE admin: Fast boot flag: teamd false. Sep 21 12:45:12.036794 sonic DEBUG container: container_stop: BEGIN Sep 21 12:45:12.039622 sonic DEBUG container: read_data: config:True feature:teamd fields:[('set_owner', 'local'), ('no_fallback_to_local', False), ('state', 'disabled')] val:['local', False, 'enabled'] Sep 21 12:45:12.040500 sonic DEBUG container: read_data: config:False feature:teamd fields:[('current_owner', 'none'), ('remote_state', 'none'), ('container_id', '')] val:['none', 'none', ''] Sep 21 12:45:12.042923 sonic DEBUG container: container_stop: teamd: set_owner:local current_owner:none remote_state:none docker_id:teamd Sep 21 12:45:12.900810 sonic INFO teamd#supervisord 2023-09-21 12:45:12,896 WARN received SIGTERM indicating exit request Sep 21 12:45:12.902232 sonic INFO teamd#supervisord 2023-09-21 12:45:12,901 INFO waiting for supervisor-proc-exit-listener, rsyslogd, teammgrd, teamsyncd, tlm_teamd, containercfgd to die Sep 21 12:45:13.929645 sonic INFO teamd#supervisord 2023-09-21 12:45:13,928 INFO stopped: containercfgd (exit status 143) Sep 21 12:45:14.931958 sonic NOTICE teamd#tlm_teamd: :- main: Exiting Sep 21 12:45:14.933448 sonic NOTICE teamd#tlm_teamd: :- ~TeamdCtlMgr: Exiting. Disconnecting from teamd. LAG 'PortChannel2001' Sep 21 12:45:14.933448 sonic NOTICE teamd#tlm_teamd: :- ~TeamdCtlMgr: Exiting. Disconnecting from teamd. LAG 'PortChannel184' Sep 21 12:45:14.933448 sonic NOTICE teamd#tlm_teamd: :- ~TeamdCtlMgr: Exiting. Disconnecting from teamd. LAG 'PortChannel176' Sep 21 12:45:15.409088 sonic INFO teamd#supervisord 2023-09-21 12:45:15,408 INFO stopped: tlm_teamd (exit status 0) Sep 21 12:45:16.411960 sonic INFO teamd#supervisord 2023-09-21 12:45:16,409 INFO waiting for supervisor-proc-exit-listener, rsyslogd, teammgrd, teamsyncd to die Sep 21 12:45:16.411960 sonic NOTICE teamd#teamsyncd: :- cleanTeamSync: Cleaning up LAG teamd resources ... Sep 21 12:45:16.412416 sonic NOTICE teamd#teamsyncd: :- main: Received SIGTERM Exiting Sep 21 12:45:16.972034 sonic INFO teamd#supervisord 2023-09-21 12:45:16,971 INFO stopped: teamsyncd (exit status 0) Sep 21 12:45:17.973866 sonic NOTICE teamd#teammgrd: :- cleanTeamProcesses: Cleaning up LAGs during shutdown... Sep 21 12:45:17.979810 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_newlink.NOTICE] Update local port PortChannel176 state down Sep 21 12:45:17.980172 sonic NOTICE iccpd#mclagsyncd: :- mclagsyncdSetTrafficDisable: Set traffic disable for PortChannel176 Sep 21 12:45:17.980363 sonic INFO kernel: [94386.842029] Bridge: port 2(PortChannel176) entered disabled state Sep 21 12:45:17.984475 sonic INFO kernel: [94386.843810] device Ethernet176 left promiscuous mode Sep 21 12:45:17.984781 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_newlink.NOTICE] Update local port Ethernet176 state down Sep 21 12:45:17.985264 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_deladdr.NOTICE] l3_proto 0, ifname Ethernet176 index 1687 is_vrf 0 address fe80::526b:4bff:fe9b:82c0 Sep 21 12:45:17.985724 sonic NOTICE iccpd#mclagsyncd: :- setPortIsolate: set port isolate, src port: PortChannel2001, dst port: PortChannel184,Ethernet184,PortChannel176 Sep 21 12:45:17.988320 sonic INFO kernel: [94386.849349] PortChannel176: Port device Ethernet176 removed Sep 21 12:45:17.991548 sonic NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel176 pid 28 Sep 21 12:45:17.994364 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_deladdr.NOTICE] l3_proto 0, ifname PortChannel176 index 1592 is_vrf 0 address fe80::63f:72ff:fe2f:3800 Sep 21 12:45:17.996502 sonic INFO kernel: [94386.857095] Bridge: port 2(PortChannel176) entered disabled state Sep 21 12:45:17.997733 sonic WARNING iccpd#iccpd: [local_if_destroy.WARN] Destroy interface PortChannel176, 1592 Sep 21 12:45:17.998093 sonic NOTICE iccpd#iccpd: [local_if_del_all_vlan.NOTICE] Remove all VLANs from PortChannel176 Sep 21 12:45:17.998300 sonic NOTICE iccpd#mclagsyncd: :- mclagsyncdSetTrafficDisable: Set traffic enable for PortChannel176 Sep 21 12:45:18.000366 sonic INFO kernel: [94386.860007] device PortChannel176 left promiscuous mode Sep 21 12:45:18.000395 sonic INFO kernel: [94386.860034] Bridge: port 2(PortChannel176) entered disabled state Sep 21 12:45:18.004471 sonic NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel184 pid 36 Sep 21 12:45:18.009245 sonic NOTICE teamd#teammgrd: :- cleanTeamProcesses: Sent SIGTERM to port channel PortChannel2001 pid 44 Sep 21 12:45:18.009562 sonic NOTICE teamd#teammgrd: :- cleanTeamProcesses: Waiting for port channel PortChannel2001 pid 44 to stop... Sep 21 12:45:18.030587 sonic INFO kernel: [94386.888742] device Ethernet184 left promiscuous mode Sep 21 12:45:18.030614 sonic INFO kernel: [94386.893048] PortChannel184: Port device Ethernet184 removed Sep 21 12:45:18.030645 sonic NOTICE iccpd#mclagsyncd: :- setPortIsolate: set port isolate, src port: PortChannel2001, dst port: PortChannel184 Sep 21 12:45:18.043298 sonic INFO kernel: [94386.898234] device Ethernet196 left promiscuous mode Sep 21 12:45:18.043323 sonic INFO kernel: [94386.901426] PortChannel2001: Port device Ethernet196 removed Sep 21 12:45:18.043326 sonic INFO kernel: [94386.902726] Bridge: port 4(PortChannel2001) entered disabled state Sep 21 12:45:18.043327 sonic INFO kernel: [94386.904787] device Ethernet192 left promiscuous mode Sep 21 12:45:18.043375 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_newlink.NOTICE] Update local port Ethernet196 state down Sep 21 12:45:18.043375 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_deladdr.NOTICE] l3_proto 0, ifname Ethernet196 index 1692 is_vrf 0 address fe80::526b:4bff:fe9b:82c0 Sep 21 12:45:18.043375 sonic NOTICE iccpd#mclagsyncd: :- setPortMacLearnMode: set port mac learn mode, port: Ethernet196, learn-mode: hardware Sep 21 12:45:18.043375 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_newlink.NOTICE] Update local port PortChannel2001 state down Sep 21 12:45:18.043375 sonic NOTICE iccpd#mclagsyncd: :- setFdbEntry: del fdb entry from ASIC_DB:key =Vlan10:04:3f:72:2f:38:00 Sep 21 12:45:18.043375 sonic NOTICE iccpd#mclagsyncd: :- setFdbEntry: del fdb entry from ASIC_DB:key =Vlan100:04:3f:72:2f:38:00 Sep 21 12:45:18.043375 sonic NOTICE iccpd#mclagsyncd: :- setFdbEntry: del fdb entry from ASIC_DB:key =Vlan100:64:9d:99:17:cb:2c Sep 21 12:45:18.048576 sonic INFO kernel: [94386.910831] PortChannel2001: Port device Ethernet192 removed Sep 21 12:45:18.048626 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_newlink.NOTICE] Update local port Ethernet192 state down Sep 21 12:45:18.048626 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_deladdr.NOTICE] l3_proto 0, ifname Ethernet192 index 1691 is_vrf 0 address fe80::526b:4bff:fe9b:82c0 Sep 21 12:45:18.048626 sonic NOTICE iccpd#mclagsyncd: :- setPortMacLearnMode: set port mac learn mode, port: Ethernet192, learn-mode: hardware Sep 21 12:45:18.087773 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fe80::/64 :: eth0 Sep 21 12:45:18.087773 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: 0.0.0.0/0 192.168.0.1 eth0 Sep 21 12:45:18.087773 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: ::/0 fe80::21b:21ff:febe:2f0 eth0 Sep 21 12:45:18.087773 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fd00::/80 :: docker0 Sep 21 12:45:18.087773 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: 0.0.0.0/0 192.168.0.1 eth0 Sep 21 12:45:18.087773 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: ::/0 fe80::21b:21ff:febe:2f0 eth0 Sep 21 12:45:18.087773 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fd00::/80 :: docker0 Sep 21 12:45:18.087773 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fe80::/64 :: eth0 Sep 21 12:45:18.087773 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: 0.0.0.0/0 192.168.0.1 eth0 Sep 21 12:45:18.087773 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: ::/0 fe80::21b:21ff:febe:2f0 eth0 Sep 21 12:45:18.087773 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fd00::/80 :: docker0 Sep 21 12:45:18.094756 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_deladdr.NOTICE] l3_proto 0, ifname PortChannel184 index 1601 is_vrf 0 address fe80::63f:72ff:fe2f:3800 Sep 21 12:45:18.103547 sonic INFO kernel: [94386.957158] Bridge: port 3(PortChannel184) entered disabled state Sep 21 12:45:18.103576 sonic INFO kernel: [94386.958188] device PortChannel184 left promiscuous mode Sep 21 12:45:18.103580 sonic INFO kernel: [94386.958210] Bridge: port 3(PortChannel184) entered disabled state Sep 21 12:45:18.103623 sonic WARNING iccpd#iccpd: [local_if_destroy.WARN] Destroy interface PortChannel184, 1601 Sep 21 12:45:18.103623 sonic NOTICE iccpd#iccpd: [local_if_del_all_vlan.NOTICE] Remove all VLANs from PortChannel184 Sep 21 12:45:18.103623 sonic NOTICE iccpd#mclagsyncd: :- mclagsyncdSetTrafficDisable: Set traffic enable for PortChannel184 Sep 21 12:45:18.126455 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_deladdr.NOTICE] l3_proto 0, ifname PortChannel2001 index 1608 is_vrf 0 address fe80::526b:4bff:fe9b:82c0 Sep 21 12:45:18.128311 sonic INFO kernel: [94386.989136] Bridge: port 4(PortChannel2001) entered disabled state Sep 21 12:45:18.128338 sonic INFO kernel: [94386.989724] device PortChannel2001 left promiscuous mode Sep 21 12:45:18.128340 sonic INFO kernel: [94386.989745] Bridge: port 4(PortChannel2001) entered disabled state Sep 21 12:45:18.128566 sonic WARNING iccpd#iccpd: [local_if_destroy.WARN] Destroy interface PortChannel2001, 1608 Sep 21 12:45:18.128651 sonic NOTICE iccpd#iccpd: [ICCP_FSM.NOTICE] scheduler session disconnect handler Sep 21 12:45:18.128853 sonic NOTICE iccpd#iccpd: [ICCP_FSM.NOTICE] CSM socket 17 close, location 1 Sep 21 12:45:18.128889 sonic NOTICE iccpd#iccpd: [ICCP_FSM.NOTICE] ICCP session down: warm reboot no, role standby, sync_state MLACP_STATE_EXCHANGE Sep 21 12:45:18.129269 sonic NOTICE iccpd#iccpd: [update_vlan_if_mac_on_iccp_up.NOTICE] lif name PortChannel2001, up 0 Sep 21 12:45:18.129353 sonic NOTICE iccpd#iccpd: [recover_vlan_if_mac_on_standby.NOTICE] remote system_id not initialised. Sep 21 12:45:18.129565 sonic NOTICE iccpd#iccpd: [recover_vlan_if_mac_on_standby.NOTICE] remote system_id not initialised. Sep 21 12:45:18.129565 sonic NOTICE iccpd#iccpd: [ICCP_FDB.NOTICE] mlacp_mac_msg_queue_reinit clear mac_msg_list pointers in existing MAC entries Sep 21 12:45:18.129749 sonic NOTICE iccpd#iccpd: [local_if_del_all_vlan.NOTICE] Remove all VLANs from PortChannel2001 Sep 21 12:45:18.130065 sonic NOTICE iccpd#mclagsyncd: :- setFdbEntry: del fdb entry from ASIC_DB:key =Vlan10:04:3f:72:2f:38:00 Sep 21 12:45:18.130324 sonic NOTICE iccpd#mclagsyncd: :- setFdbEntry: del fdb entry from ASIC_DB:key =Vlan100:04:3f:72:2f:38:00 Sep 21 12:45:18.130548 sonic NOTICE iccpd#mclagsyncd: :- setFdbEntry: del fdb entry from ASIC_DB:key =Vlan100:64:9d:99:17:cb:2c Sep 21 12:45:18.130799 sonic NOTICE iccpd#mclagsyncd: :- mclagsyncdSetIccpState: Set mlag 1 ICCP state to down Sep 21 12:45:18.131063 sonic NOTICE iccpd#mclagsyncd: :- mclagsyncdSetSystemId: Set mlag 1 system mac to 50:6b:4b:9b:82:c0 Sep 21 12:45:18.131277 sonic NOTICE iccpd#mclagsyncd: :- mclagsyncdDelRemoteIfInfo: Delete mlag 1, remote interface PortChannel176 Sep 21 12:45:18.131492 sonic NOTICE iccpd#mclagsyncd: :- mclagsyncdDelRemoteIfInfo: Delete mlag 1, remote interface PortChannel184 Sep 21 12:45:18.163669 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: 0.0.0.0/0 192.168.0.1 eth0 Sep 21 12:45:18.163669 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: ::/0 fe80::21b:21ff:febe:2f0 eth0 Sep 21 12:45:18.163669 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fd00::/80 :: docker0 Sep 21 12:45:18.167363 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: 0.0.0.0/0 192.168.0.1 eth0 Sep 21 12:45:18.177880 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: ::/0 fe80::21b:21ff:febe:2f0 eth0 Sep 21 12:45:18.177880 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fd00::/80 :: docker0 Sep 21 12:45:18.975330 sonic INFO teamd#supervisord 2023-09-21 12:45:18,973 INFO reaped unknown pid 28 (exit status 0) Sep 21 12:45:18.975330 sonic INFO teamd#supervisord 2023-09-21 12:45:18,974 INFO reaped unknown pid 36 (exit status 0) Sep 21 12:45:18.975330 sonic INFO teamd#supervisord 2023-09-21 12:45:18,974 INFO reaped unknown pid 44 (exit status 0) Sep 21 12:45:18.986955 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_newlink.NOTICE] Update local port Vlan10 state down Sep 21 12:45:18.988649 sonic NOTICE iccpd#iccpd: [iccp_event_handler_obj_input_newlink.NOTICE] Update local port Vlan100 state down Sep 21 12:45:18.997110 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fe80::/64 :: eth0 Sep 21 12:45:18.997179 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: 0.0.0.0/0 192.168.0.1 eth0 Sep 21 12:45:18.997397 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: ::/0 fe80::21b:21ff:febe:2f0 eth0 Sep 21 12:45:18.997857 sonic NOTICE bgp#fpmsyncd: :- onRouteMsg: RouteTable del msg for route with only one nh on eth0/docker0: fd00::/80 :: docker0 Sep 21 12:45:19.013241 sonic NOTICE teamd#teammgrd: :- cleanTeamProcesses: Waiting for port channel PortChannel184 pid 36 to stop... Sep 21 12:45:19.020359 sonic NOTICE teamd#teammgrd: :- cleanTeamProcesses: Waiting for port channel PortChannel176 pid 28 to stop... Sep 21 12:45:19.020359 sonic NOTICE teamd#teammgrd: :- cleanTeamProcesses: LAGs cleanup is done Sep 21 12:45:19.020359 sonic NOTICE teamd#teammgrd: :- main: Exiting Sep 21 12:45:19.221554 sonic INFO teamd#supervisord 2023-09-21 12:45:19,212 INFO stopped: teammgrd (exit status 0) Sep 21 12:45:20.355124 sonic INFO containerd[391]: time="2023-09-21T12:45:20.353731751+08:00" level=info msg="shim disconnected" id=27273f07b770a3f406ba4c442239b982fff17cd054d2d9aee8b04310c2249fa7 Sep 21 12:45:20.355256 sonic INFO containerd[391]: time="2023-09-21T12:45:20.353822579+08:00" level=warning msg="cleaning up after shim disconnected" id=27273f07b770a3f406ba4c442239b982fff17cd054d2d9aee8b04310c2249fa7 namespace=moby Sep 21 12:45:20.355337 sonic INFO containerd[391]: time="2023-09-21T12:45:20.353849604+08:00" level=info msg="cleaning up dead shim" Sep 21 12:45:20.356658 sonic INFO dockerd[589]: time="2023-09-21T12:45:20.355608603+08:00" level=info msg="ignoring event" container=27273f07b770a3f406ba4c442239b982fff17cd054d2d9aee8b04310c2249fa7 module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete" Sep 21 12:45:20.395449 sonic INFO containerd[391]: time="2023-09-21T12:45:20.395298825+08:00" level=warning msg="cleanup warnings time=\"2023-09-21T12:45:20+08:00\" level=info msg=\"starting signal loop\" namespace=moby pid=1181807 runtime=io.containerd.runc.v2\n" Sep 21 12:45:20.453520 sonic INFO systemd[1]: var-lib-docker-overlay2-1dc204245c2f4eb4429a5504ff2ef3b996e1596a2e1e3c0c302ea37f9689c9de-merged.mount: Succeeded. Sep 21 12:45:20.515506 sonic INFO container: docker cmd: wait for teamd Sep 21 12:45:20.528427 sonic INFO container: docker cmd: stop for teamd Sep 21 12:45:20.532899 sonic DEBUG container: container_stop: END Sep 21 12:45:20.693741 sonic NOTICE admin: Stopped teamd service... Sep 21 12:45:20.706308 sonic INFO systemd[1]: teamd.service: Succeeded. Sep 21 12:45:20.706454 sonic INFO systemd[1]: Stopped TEAMD container. Sep 21 12:45:20.711028 sonic NOTICE rsyslog_plugin: :- publish: EVENT_PUBLISHED: {"sonic-events-host:event-stopped-ctr":{"ctr_name":"TEAMD","timestamp":"2023-09-21T12:45:20.706783Z"}} Sep 21 12:45:20.731952 sonic INFO systemd[1]: Stopping Router advertiser container... Sep 21 12:45:20.752767 sonic NOTICE admin: Stopping radv service... Sep 21 12:45:20.758480 sonic NOTICE healthd: System is not ready - one or more services are not up Sep 21 12:45:20.858399 sonic NOTICE admin: DEVICE_SUBTYPE: , CONFIG_KNOB: Sep 21 12:45:20.870442 sonic NOTICE admin: Warm boot flag: radv false. Sep 21 12:45:20.882539 sonic NOTICE admin: Fast boot flag: radv false. Sep 21 12:45:21.373791 sonic INFO systemd[1]: run-docker-runtime\x2drunc-moby-b08e3c24fd0869e6085e9476afff40e4ae88f12512be34edacf1be374a081168-runc.YSFl4f.mount: Succeeded. Sep 21 12:45:21.859681 sonic DEBUG container: container_stop: BEGIN Sep 21 12:45:21.863155 sonic DEBUG container: read_data: config:True feature:radv fields:[('set_owner', 'local'), ('no_fallback_to_local', False), ('state', 'disabled')] val:['local', False, 'enabled'] Sep 21 12:45:21.865428 sonic DEBUG container: read_data: config:False feature:radv fields:[('current_owner', 'none'), ('remote_state', 'none'), ('container_id', '')] val:['none', 'none', ''] Sep 21 12:45:21.868155 sonic DEBUG container: container_stop: radv: set_owner:local current_owner:none remote_state:none docker_id:radv Sep 21 12:45:22.265046 sonic INFO radv#supervisord 2023-09-21 12:45:22,262 WARN received SIGTERM indicating exit request Sep 21 12:45:22.265046 sonic INFO radv#supervisord 2023-09-21 12:45:22,262 INFO waiting for supervisor-proc-exit-script, rsyslogd, containercfgd to die Sep 21 12:45:23.290400 sonic INFO radv#supervisord 2023-09-21 12:45:23,289 INFO stopped: containercfgd (exit status 143) Sep 21 12:45:24.385870 sonic INFO containerd[391]: time="2023-09-21T12:45:24.385146831+08:00" level=info msg="shim disconnected" id=da036c5ecf286d7bf932731bd657f618ec8fb6193939179ef7bf5ea90179babd Sep 21 12:45:24.385993 sonic INFO containerd[391]: time="2023-09-21T12:45:24.385241765+08:00" level=warning msg="cleaning up after shim disconnected" id=da036c5ecf286d7bf932731bd657f618ec8fb6193939179ef7bf5ea90179babd namespace=moby Sep 21 12:45:24.386096 sonic INFO containerd[391]: time="2023-09-21T12:45:24.385268266+08:00" level=info msg="cleaning up dead shim" Sep 21 12:45:24.386770 sonic INFO dockerd[589]: time="2023-09-21T12:45:24.385724214+08:00" level=info msg="ignoring event" container=da036c5ecf286d7bf932731bd657f618ec8fb6193939179ef7bf5ea90179babd module=libcontainerd namespace=moby topic=/tasks/delete type="*events.TaskDelete" Sep 21 12:45:24.404119 sonic INFO containerd[391]: time="2023-09-21T12:45:24.403953665+08:00" level=warning msg="cleanup warnings time=\"2023-09-21T12:45:24+08:00\" level=info msg=\"starting signal loop\" namespace=moby pid=1181995 runtime=io.containerd.runc.v2\n" Sep 21 12:45:24.453500 sonic INFO systemd[1]: var-lib-docker-overlay2-d999e1bac7d4a8364665ceffcbf4da5b3161d5b25c43b2aee052f111bef78f78-merged.mount: Succeeded. Sep 21 12:45:24.501561 sonic INFO container: docker cmd: wait for radv Sep 21 12:45:24.506965 sonic INFO container: docker cmd: stop for radv Sep 21 12:45:24.508845 sonic DEBUG container: container_stop: END Sep 21 12:45:24.607124 sonic NOTICE admin: Stopped radv service... Sep 21 12:45:24.638623 sonic INFO systemd[1]: radv.service: Succeeded. Sep 21 12:45:24.638777 sonic INFO systemd[1]: Stopped Router advertiser container. Sep 21 12:45:24.642218 sonic NOTICE rsyslog_plugin: :- publish: EVENT_PUBLISHED: {"sonic-events-host:event-stopped-ctr":{"ctr_name":"Router advertiser","timestamp":"2023-09-21T12:45:24.639696Z"}} Sep 21 12:45:24.656412 sonic INFO systemd[1]: Stopping BGP container... Sep 21 12:45:24.677137 sonic NOTICE admin: Stopping bgp service... Sep 21 12:45:24.744531 sonic NOTICE admin: Warm boot flag: bgp false. Sep 21 12:45:24.755566 sonic NOTICE admin: Fast boot flag: bgp false. Sep 21 12:45:25.305180 sonic INFO dbus-daemon[368]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.24' (uid=0 pid=1182091 comm="systemd-analyze plot ")
I notice there are some issue related to crash like #2332. I think this behavior happen randomly regardless if MAC address is static or dynamic.
I also attach dump log file too.
I faced a similar issue, and after some investigation, I discovered these changes that resolved the issue.
I am trying mclag with 2 mellanox SN2410 switch. I build the sonic firmware using master branch of sonic-buildimage with
INCLUDE_ICCPD = y
.Switch 1 mclag configuration:
Switch 2 mclag configuration:
Runtime information for switch 1:
Runtime information for switch 2
Above states show switch-1 role is Active and switch-2 role is Standby. I assume my mc-lag configuration is correct at this moment.
Next I try to shutdown interface
Ethernet184
of switch-2:After a while, few docker container crash and restarted. The mclag peering connections crash too. Those crash container/service will back online after some time.
This is the log captured during the crashing:
I notice there are some issue related to crash like #2332. I think this behavior happen randomly regardless if MAC address is static or dynamic.
I also attach dump log file too.
sai_sdk_dump_09_21_2023_04_47_AM.tar.gz