sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[chassis][voq] Internal BGP session was not established #11268

Closed ysmanman closed 2 years ago

ysmanman commented 2 years ago

Description

We observed an ibgp issue after running sonic-mgmt tests on a t2 topo with a mix of single-ASIC and multi-ASIC LCs. Initially, all ibgp sessions were established and up. However, after certain sonic-mgmt tests had run, some ibgp sessions went down and could not be re-established.

Steps to reproduce the issue:

  1. Start running all sonic-mgmt tests for topo t0,t2,any
  2. Check the test results periodically; some tests may fail in the pre-test sanity check because an ibgp session is down.

Describe the results you received:

In one particular failure case, we observed that the ibgp session from LC6-ASIC0 (neighbor name: cmp206-6, inband addr: 1.1.1.9) to LC4-ASIC0 (neighbor name: cmp206-4-ASIC0, inband addr: 1.1.1.1) was not established.

cmp206-6:~$ show ip bgp summary

Neighbhor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd NeighborName


1.1.1.1 4 65100 21279 13016 0 0 0 never Idle cmp206-4-ASIC0

cmp206-4:~$ show ip bgp summary

Neighbhor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd NeighborName


1.1.1.9 4 65100 21224 30395 0 0 0 2d05h38m 141 cmp206-6
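The asymmetry between the two summaries (Idle on cmp206-6, 141 prefixes received on cmp206-4) can be checked mechanically. The following is a minimal sketch, assuming the whitespace-separated column layout shown above and that an established session reports a numeric prefix count in the State/PfxRcd column; the function name `down_sessions` is hypothetical and is not part of sonic-mgmt. Multi-word states such as an administratively disabled peer are not handled.

```python
def down_sessions(summary_text):
    """Return (neighbor, state, name) for rows that are not established."""
    down = []
    for line in summary_text.splitlines():
        fields = line.split()
        # Data rows have at least 11 columns and a numeric BGP version in
        # column 2; this skips the header row and blank lines.
        if len(fields) < 11 or not fields[1].isdigit():
            continue
        neighbor, state, name = fields[0], fields[9], fields[-1]
        # Assumption: an established session shows a prefix count here,
        # while anything else (Idle, Active, OpenSent, ...) is an FSM state.
        if not state.isdigit():
            down.append((neighbor, state, name))
    return down


# Rows copied from the two summaries in this report.
CMP206_6 = """Neighbhor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd NeighborName
1.1.1.1 4 65100 21279 13016 0 0 0 never Idle cmp206-4-ASIC0"""

CMP206_4 = """Neighbhor V AS MsgRcvd MsgSent TblVer InQ OutQ Up/Down State/PfxRcd NeighborName
1.1.1.9 4 65100 21224 30395 0 0 0 2d05h38m 141 cmp206-6"""

print(down_sessions(CMP206_6))  # the Idle session towards 1.1.1.1
print(down_sessions(CMP206_4))  # empty: this side reports the peer as up
```

Running the check on both sides of a session, as here, makes this kind of one-sided failure visible.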

Bgp log @ cmp206-6:

cmp206-6:/# egrep '1.1.1.1' /var/log/messages
2022-06-27T18:14:00.100987+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:14:32.102148+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:15:36.103696+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:17:36.104559+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:17:36.104575+00:00 cmp206-6 bgpd[55]: [EC 33554465] 1.1.1.1 [FSM] unexpected packet received in state OpenSent
2022-06-27T18:17:36.105144+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: sent to neighbor 1.1.1.1 5/1 (Neighbor Events Error/Receive Unexpected Message in OpenSent State) 0 bytes
2022-06-27T18:17:37.106438+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:17:43.108877+00:00 cmp206-6 bgpd[55]: message repeated 2 times: [ %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes ]
2022-06-27T18:17:51.109986+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:18:07.111618+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:18:39.112998+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:19:43.114245+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:21:43.115840+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:23:43.117288+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:25:43.118604+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:27:43.119655+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:29:43.120898+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:31:43.122213+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:33:43.123692+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:35:43.125001+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:37:43.125971+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:37:43.125986+00:00 cmp206-6 bgpd[55]: [EC 33554465] 1.1.1.1 [FSM] unexpected packet received in state OpenSent
2022-06-27T18:37:43.126602+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: sent to neighbor 1.1.1.1 5/1 (Neighbor Events Error/Receive Unexpected Message in OpenSent State) 0 bytes
2022-06-27T18:37:44.128073+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:37:50.130675+00:00 cmp206-6 bgpd[55]: message repeated 2 times: [ %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes ]
2022-06-27T18:37:58.132034+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:38:14.133044+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:38:14.133053+00:00 cmp206-6 bgpd[55]: [EC 33554465] 1.1.1.1 [FSM] unexpected packet received in state OpenSent
2022-06-27T18:38:14.133388+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: sent to neighbor 1.1.1.1 5/1 (Neighbor Events Error/Receive Unexpected Message in OpenSent State) 0 bytes
2022-06-27T18:38:15.134966+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:38:45.141547+00:00 cmp206-6 bgpd[55]: message repeated 4 times: [ %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes ]
2022-06-27T18:39:17.142624+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:17.142638+00:00 cmp206-6 bgpd[55]: [EC 33554465] 1.1.1.1 [FSM] unexpected packet received in state OpenSent
2022-06-27T18:39:17.143190+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: sent to neighbor 1.1.1.1 5/1 (Neighbor Events Error/Receive Unexpected Message in OpenSent State) 0 bytes
2022-06-27T18:39:18.144550+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:20.145998+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:32.148664+00:00 cmp206-6 bgpd[55]: message repeated 2 times: [ %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes ]
2022-06-27T18:39:48.149700+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:48.149715+00:00 cmp206-6 bgpd[55]: [EC 33554465] 1.1.1.1 [FSM] unexpected packet received in state OpenSent
2022-06-27T18:39:48.150151+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: sent to neighbor 1.1.1.1 5/1 (Neighbor Events Error/Receive Unexpected Message in OpenSent State) 0 bytes
2022-06-27T18:39:49.151365+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:51.152814+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:55.154374+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:40:03.155693+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:40:19.156872+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:40:19.156888+00:00 cmp206-6 bgpd[55]: [EC 33554465] 1.1.1.1 [FSM] unexpected packet received in state OpenSent
2022-06-27T18:40:19.157530+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: sent to neighbor 1.1.1.1 5/1 (Neighbor Events Error/Receive Unexpected Message in OpenSent State) 0 bytes
2022-06-27T18:40:20.158786+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:40:22.160120+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:40:34.162610+00:00 cmp206-6 bgpd[55]: message repeated 2 times: [ %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes ]
2022-06-27T18:40:50.163802+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:41:22.165134+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes

Bgp log @ cmp206-4-ASIC0:

cmp206-4:/# egrep '1.1.1.9' /var/log/messages
<...>
2022-06-27T18:38:14.088516+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:38:14.088536+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:38:15.089853+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:38:15.089866+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:38:17.091990+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:38:17.092002+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:38:21.093398+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:38:21.093468+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:38:29.094637+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:38:29.095071+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:38:45.095597+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:38:45.095609+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:39:17.096561+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:17.097264+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:39:18.098067+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:18.098440+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:39:20.099496+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:20.099508+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:39:24.100645+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:24.100848+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:39:32.101839+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:32.101890+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:39:48.102726+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:48.102739+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:39:49.104130+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:49.104143+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:39:51.105441+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:51.105505+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:39:55.107032+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:39:55.107049+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:40:03.108152+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:40:03.108533+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:40:19.109266+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:40:19.109829+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:40:20.110842+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:40:20.111252+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:40:22.112022+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:40:22.112083+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:40:26.113196+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:40:26.113206+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:40:34.114242+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:40:34.114294+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:40:50.115019+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:40:50.115031+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
2022-06-27T18:41:22.115584+00:00 cmp206-4 bgpd[63]: %NOTIFICATION: sent to neighbor 1.1.1.9 6/7 (Cease/Connection collision resolution) 0 bytes
2022-06-27T18:41:22.115676+00:00 cmp206-4 bgpd[63]: [EC 33554451] bgp_process_packet: BGP OPEN receipt failed for peer: 1.1.1.9
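To compare the two logs without reading every line, the NOTIFICATION entries can be tallied by direction and code, and the spacing between repeats measured. This is a rough sketch, assuming the syslog line format shown above (ISO 8601 timestamp, hostname, `bgpd[pid]:`); the helper names `parse_notifications` and `gaps_seconds` are hypothetical. It skips syslog's "message repeated N times" summaries, so counts are a lower bound.

```python
import re
from collections import Counter
from datetime import datetime

# Matches bgpd NOTIFICATION lines such as:
#   <ts> <host> bgpd[55]: %NOTIFICATION: received from neighbor 1.1.1.1 6/7 (...)
NOTIF = re.compile(
    r"^(?P<ts>\S+) \S+ bgpd\[\d+\]: %NOTIFICATION: "
    r"(?P<direction>received from|sent to) neighbor (?P<peer>\S+) "
    r"(?P<code>\d+)/(?P<subcode>\d+) "
)


def parse_notifications(lines):
    """Return (timestamp, direction, peer, code, subcode) per NOTIFICATION line."""
    events = []
    for line in lines:
        m = NOTIF.match(line)
        if m:
            events.append((
                datetime.fromisoformat(m["ts"]),
                m["direction"],
                m["peer"],
                int(m["code"]),
                int(m["subcode"]),
            ))
    return events


def gaps_seconds(events, direction):
    """Whole-second gaps between consecutive notifications in one direction."""
    times = [ts for ts, d, *_ in events if d == direction]
    return [round((b - a).total_seconds()) for a, b in zip(times, times[1:])]


# First entries of the cmp206-6 log from this report.
CEASE = "%NOTIFICATION: received from neighbor 1.1.1.1 6/7 (Cease/Connection collision resolution) 0 bytes"
SAMPLE = [
    "2022-06-27T18:14:00.100987+00:00 cmp206-6 bgpd[55]: " + CEASE,
    "2022-06-27T18:14:32.102148+00:00 cmp206-6 bgpd[55]: " + CEASE,
    "2022-06-27T18:15:36.103696+00:00 cmp206-6 bgpd[55]: " + CEASE,
    "2022-06-27T18:17:36.104559+00:00 cmp206-6 bgpd[55]: " + CEASE,
    "2022-06-27T18:17:36.104575+00:00 cmp206-6 bgpd[55]: [EC 33554465] 1.1.1.1 [FSM] unexpected packet received in state OpenSent",
    "2022-06-27T18:17:36.105144+00:00 cmp206-6 bgpd[55]: %NOTIFICATION: sent to neighbor 1.1.1.1 5/1 "
    "(Neighbor Events Error/Receive Unexpected Message in OpenSent State) 0 bytes",
]

events = parse_notifications(SAMPLE)
counts = Counter((d, code, sub) for _, d, _, code, sub in events)
print(counts)
print(gaps_seconds(events, "received from"))  # [32, 64, 120]
```

On this sample the gaps between received Cease notifications grow from about 32 s to 64 s to 120 s, which matches the timestamps above; the sketch only reports the spacing and does not establish its cause.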

Describe the results you expected:

Output of show version:

We were testing our internal build image that had synced from master till 1e2e493da.

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

zhangyanzhao commented 2 years ago

Per investigation, it is a testbed setup issue. @ysmanman can you please double check and close this issue? Thanks.

ysmanman commented 2 years ago

> Per investigation, it is a testbed setup issue. @ysmanman can you please double check and close this issue? Thanks.

Hi @zhangyanzhao, yes, we haven't seen the issue anymore in our recent testing. I will close it for now.