sonic-net / sonic-mgmt

Configuration management examples for SONiC

[Bug]: loganalyzer error during test decap/test_decap.py #14707

Open congh-nvidia opened 2 months ago

congh-nvidia commented 2 months ago

Issue Description

In the decap test (decap/test_decap.py), when the parameter vxlan=set_unset is used, the test runs /usr/local/bin/neighbor_advertiser to set up the vxlan tunnel. Sometimes neighbor_advertiser fails and logs syslog errors, which are then caught by loganalyzer:

"2024 Sep 24 06:07:45.719646 r-sn2700-40 ERR neighbor_advertiser: The request failed, vip: 10.215.30.182, error: HTTPSConnectionPool(host='10.215.30.182', port=448): Max retries exceeded with url: /Ferret/NeighborAdvertiser/Slices/r-sn2700-40 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fba715ad890>: Failed to establish a new connection: [Errno 113] No route to host'))"
"2024 Sep 24 06:07:45.719768 r-sn2700-40 ERR neighbor_advertiser: Failed to set up neighbor advertiser slice, tried all vips in 10.215.30.182"

The test fails because of these loganalyzer errors. This is the line that calls neighbor_advertiser: https://github.com/sonic-net/sonic-mgmt/blob/master/tests/decap/test_decap.py#L188 In PR https://github.com/sonic-net/sonic-mgmt/pull/7993, "module_ignore_errors = True" was added to this line, which implies that the failure is sometimes expected and can be ignored. I don't quite understand this, because PR https://github.com/sonic-net/sonic-mgmt/pull/5834 suggests that neighbor_advertiser must run successfully for the test gap to be covered.

My question is: should we just ignore the syslog errors when neighbor_advertiser fails, or should we fix the test or neighbor_advertiser so that it always succeeds?
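If the answer turned out to be "ignore", one way to express it would be a pair of ignore patterns for the two messages quoted above. The following is only a minimal sketch: the names `NEIGHBOR_ADVERTISER_IGNORE` and `is_ignored` are hypothetical, the patterns assume the messages always have the shape shown in this issue, and wiring them into the test's loganalyzer setup is not shown.

```python
import re

# Hypothetical ignore patterns derived from the two syslog lines quoted above.
NEIGHBOR_ADVERTISER_IGNORE = [
    r".*ERR neighbor_advertiser: The request failed, vip: .*",
    r".*ERR neighbor_advertiser: Failed to set up neighbor advertiser slice, "
    r"tried all vips in .*",
]

_COMPILED = [re.compile(pattern) for pattern in NEIGHBOR_ADVERTISER_IGNORE]


def is_ignored(syslog_line):
    """Return True if the syslog line matches one of the ignore patterns."""
    return any(pattern.match(syslog_line) for pattern in _COMPILED)


# Second syslog line from the issue description.
sample = (
    "2024 Sep 24 06:07:45.719768 r-sn2700-40 ERR neighbor_advertiser: "
    "Failed to set up neighbor advertiser slice, tried all vips in 10.215.30.182"
)
print(is_ignored(sample))
```

Keeping the patterns this narrow means any other neighbor_advertiser error would still be reported.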

Results you see

I did some further analysis of the neighbor_advertiser failure. First, the test starts a ferret daemon in the PTF, listening on port 448. Then neighbor_advertiser posts an HTTPS request to the ferret daemon and expects a good response. When neighbor_advertiser fails, the TCP connection is not established. This is the tcpdump on the DUT; 10.210.25.49 is the DUT switch and 10.215.30.182 is the PTF:

09:38:36.867275 IP (tos 0x0, ttl 64, id 32106, offset 0, flags [DF], proto TCP (6), length 60)    ----------------> SYN
    10.210.25.49.46098 > 10.215.30.182.448: Flags [S], cksum 0x4dbe (incorrect -> 0xdbc3), seq 426849427, win 64240, options [mss 1460,sackOK,TS val 2278135833 ecr 0,nop,wscale 9], length 0
    0x0000:  0000 5e00 0101 a088 c286 3ebc 0800 4500
    0x0010:  003c 7d6a 4000 4006 6fc2 0ad2 1931 0ad7
    0x0020:  1eb6 b412 01c0 1971 3493 0000 0000 a002
    0x0030:  faf0 4dbe 0000 0204 05b4 0402 080a 87c9
    0x0040:  9819 0000 0000 0103 0309
09:38:36.867616 IP (tos 0x0, ttl 62, id 0, offset 0, flags [DF], proto TCP (6), length 60)             ----------------> SYN, ACK
    10.215.30.182.448 > 10.210.25.49.46098: Flags [S.], cksum 0xdb45 (correct), seq 2117789546, ack 426849428, win 65160, options [mss 1460,sackOK,TS val 117018674 ecr 2278135833,nop,wscale 13], length 0
    0x0000:  a088 c286 3ebc 8851 fb34 ad00 0800 4500
    0x0010:  003c 0000 4000 3e06 ef2c 0ad7 1eb6 0ad2
    0x0020:  1931 01c0 b412 7e3a e76a 1971 3494 a012
    0x0030:  fe88 db45 0000 0204 05b4 0402 080a 06f9
    0x0040:  9032 87c9 9819 0103 030d
09:38:36.867649 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 40)             ----------------> RST
    10.210.25.49.46098 > 10.215.30.182.448: Flags [R], cksum 0x5e79 (correct), seq 426849428, win 0, length 0
    0x0000:  0000 5e00 0101 a088 c286 3ebc 0800 4500
    0x0010:  0028 0000 4000 4006 ed40 0ad2 1931 0ad7
    0x0020:  1eb6 b412 01c0 1971 3494 0000 0000 5004
    0x0030:  0000 5e79 0000

Here the DUT resets the connection right after it receives the SYN-ACK from the ferret daemon.
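To spot this pattern across many captures, the flag sequence can be checked mechanically. This is a rough sketch, not part of the test: `handshake_outcome` is a hypothetical helper, it only looks at the `Flags [...]` field of the first three packets and ignores packet direction, and the sample lines are abbreviated from the capture above.

```python
import re

# Matches the flag field that tcpdump prints, e.g. "Flags [S.]".
FLAGS_RE = re.compile(r"Flags \[([A-Z.]+)\]")


def handshake_outcome(tcpdump_lines):
    """Classify a capture by the flags of its first three TCP packets.

    Returns "established" for SYN, SYN-ACK, ACK; "reset" when an RST
    follows the SYN-ACK; "unknown" for anything else.
    """
    flags = []
    for line in tcpdump_lines:
        match = FLAGS_RE.search(line)
        if match:
            flags.append(match.group(1))
    first = flags[:3]
    if first == ["S", "S.", "."]:
        return "established"
    if len(first) == 3 and first[:2] == ["S", "S."] and "R" in first[2]:
        return "reset"
    return "unknown"


# Abbreviated lines from the failing capture.
failed = [
    "10.210.25.49.46098 > 10.215.30.182.448: Flags [S], seq 426849427, length 0",
    "10.215.30.182.448 > 10.210.25.49.46098: Flags [S.], seq 2117789546, ack 426849428, length 0",
    "10.210.25.49.46098 > 10.215.30.182.448: Flags [R], seq 426849428, win 0, length 0",
]
print(handshake_outcome(failed))
```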

This is the tcpdump from a successful request:

09:38:41.524028 IP (tos 0x0, ttl 64, id 41664, offset 0, flags [DF], proto TCP (6), length 60)
    10.210.25.49.40066 > 10.215.30.182.448: Flags [S], cksum 0x4dbe (incorrect -> 0x947e), seq 610694722, win 64240, options [mss 1460,sackOK,TS val 2278140490 ecr 0,nop,wscale 9], length 0
    0x0000:  0000 5e00 0101 a088 c286 3ebc 0800 4500
    0x0010:  003c a2c0 4000 4006 4a6c 0ad2 1931 0ad7
    0x0020:  1eb6 9c82 01c0 2466 7642 0000 0000 a002
    0x0030:  faf0 4dbe 0000 0204 05b4 0402 080a 87c9
    0x0040:  aa4a 0000 0000 0103 0309
09:38:41.524365 IP (tos 0x0, ttl 62, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    10.215.30.182.448 > 10.210.25.49.40066: Flags [S.], cksum 0x9e41 (correct), seq 1648092919, ack 610694723, win 65160, options [mss 1460,sackOK,TS val 117023331 ecr 2278140490,nop,wscale 13], length 0
    0x0000:  a088 c286 3ebc 8851 fb34 ad00 0800 4500
    0x0010:  003c 0000 4000 3e06 ef2c 0ad7 1eb6 0ad2
    0x0020:  1931 01c0 9c82 623b e6f7 2466 7643 a012
    0x0030:  fe88 9e41 0000 0204 05b4 0402 080a 06f9
    0x0040:  a263 87c9 aa4a 0103 030d
09:38:41.524431 IP (tos 0x0, ttl 64, id 41665, offset 0, flags [DF], proto TCP (6), length 52)
    10.210.25.49.40066 > 10.215.30.182.448: Flags [.], cksum 0x4db6 (incorrect -> 0xcb1e), seq 1, ack 1, win 126, options [nop,nop,TS val 2278140490 ecr 117023331], length 0
    0x0000:  0000 5e00 0101 a088 c286 3ebc 0800 4500
    0x0010:  0034 a2c1 4000 4006 4a73 0ad2 1931 0ad7
    0x0020:  1eb6 9c82 01c0 2466 7643 623b e6f8 8010
    0x0030:  007e 4db6 0000 0101 080a 87c9 aa4a 06f9
    0x0040:  a263
09:38:41.571516 IP (tos 0x0, ttl 64, id 41666, offset 0, flags [DF], proto TCP (6), length 569)
    10.210.25.49.40066 > 10.215.30.182.448: Flags [P.], cksum 0x4fbb (incorrect -> 0x05ba), seq 1:518, ack 1, win 126, options [nop,nop,TS val 2278140537 ecr 117023331], length 517
    0x0000:  0000 5e00 0101 a088 c286 3ebc 0800 4500
    0x0010:  0239 a2c2 4000 4006 486d 0ad2 1931 0ad7
    0x0020:  1eb6 9c82 01c0 2466 7643 623b e6f8 8018
    0x0030:  007e 4fbb 0000 0101 080a 87c9 aa79 06f9
    0x0040:  a263 1603 0102 0001 0001 fc03 03d0 c967
    0x0050:  40d7 aa5d dbb5 374a 57e8 7165 b8ba c6cf
    0x0060:  149a 77c7 3b27 0095 0034 13b0 1120 25d5
    0x0070:  2e94 988a e918 3eb9 a47b 95e7 b22d cf24
    0x0080:  8993 26c9 f490 d744 ddee fa32 b9a8 003e
    0x0090:  1302 1303 1301 c02c c030 009f cca9 cca8
    0x00a0:  ccaa c02b c02f 009e c024 c028 006b c023
    0x00b0:  c027 0067 c00a c014 0039 c009 c013 0033
    0x00c0:  009d 009c 003d 003c 0035 002f 00ff 0100
    0x00d0:  0175 000b 0004 0300 0102 000a 0016 0014
    0x00e0:  001d 0017 001e 0019 0018 0100 0101 0102
    0x00f0:  0103 0104 0010 000b 0009 0868 7474 702f
    0x0100:  312e 3100 1600 0000 1700 0000 3100 0000
    0x0110:  0d00 2a00 2804 0305 0306 0308 0708 0808
    0x0120:  0908 0a08 0b08 0408 0508 0604 0105 0106
    0x0130:  0103 0303 0103 0204 0205 0206 0200 2b00
    0x0140:  0504 0304 0303 002d 0002 0101 0033 0026
    0x0150:  0024 001d 0020 c285 d9ed f7b4 d771 5a53
    0x0160:  5cd5 7d20 589b a54f ab72 3243 2d73 122a
    0x0170:  fb7a 5ef7 6e55 0015 00cd 0000 0000 0000
    0x0180:  0000 0000 0000 0000 0000 0000 0000 0000
    0x0190:  0000 0000 0000 0000 0000 0000 0000 0000
    0x01a0:  0000 0000 0000 0000 0000 0000 0000 0000
    0x01b0:  0000 0000 0000 0000 0000 0000 0000 0000
    0x01c0:  0000 0000 0000 0000 0000 0000 0000 0000
    0x01d0:  0000 0000 0000 0000 0000 0000 0000 0000
    0x01e0:  0000 0000 0000 0000 0000 0000 0000 0000
    0x01f0:  0000 0000 0000 0000 0000 0000 0000 0000
    0x0200:  0000 0000 0000 0000 0000 0000 0000 0000
    0x0210:  0000 0000 0000 0000 0000 0000 0000 0000
    0x0220:  0000 0000 0000 0000 0000 0000 0000 0000
    0x0230:  0000 0000 0000 0000 0000 0000 0000 0000
    0x0240:  0000 0000 0000 00
09:38:41.571804 IP (tos 0x0, ttl 62, id 30805, offset 0, flags [DF], proto TCP (6), length 52)
    10.215.30.182.448 > 10.210.25.49.40066: Flags [.], cksum 0xc930 (correct), seq 1, ack 518, win 8, options [nop,nop,TS val 117023379 ecr 2278140537], length 0
    0x0000:  a088 c286 3ebc 8851 fb34 ad00 0800 4500
    0x0010:  0034 7855 4000 3e06 76df 0ad7 1eb6 0ad2
    0x0020:  1931 01c0 9c82 623b e6f8 2466 7848 8010
    0x0030:  0008 c930 0000 0101 080a 06f9 a293 87c9
    0x0040:  aa79
09:38:41.574786 IP (tos 0x0, ttl 62, id 30806, offset 0, flags [DF], proto TCP (6), length 1533)
    10.215.30.182.448 > 10.210.25.49.40066: Flags [P.], cksum 0x537f (incorrect -> 0x2b03), seq 1:1482, ack 518, win 8, options [nop,nop,TS val 117023382 ecr 2278140537], length 1481
    0x0000:  a088 c286 3ebc 8851 fb34 ad00 0800 4500
    0x0010:  05fd 7856 4000 3e06 7115 0ad7 1eb6 0ad2
    0x0020:  1931 01c0 9c82 623b e6f8 2466 7848 8018
    0x0030:  0008 537f 0000 0101 080a 06f9 a296 87c9
    0x0040:  aa79 1603 0300 7a02 0000 7603 0307 5b1e
    0x0050:  e12d 02b4 cec8 3d5e f272 51fe 2e26 32df
    0x0060:  6aa3 26fa 7d77 6075 ef11 a5f9 6a20 25d5
    0x0070:  2e94 988a e918 3eb9 a47b 95e7 b22d cf24
    0x0080:  8993 26c9 f490 d744 ddee fa32 b9a8 1302
    0x0090:  0000 2e00 2b00 0203 0400 3300 2400 1d00
    0x00a0:  2014 a0f2 3df5 9d5f 107f 8025 dbc8 1836
    0x00b0:  2b6b 4e79 b939 a064 d475 a6fe 4cdb 182c
    0x00c0:  3314 0303 0001 0117 0303 0017 1ed9 72c7
    0x00d0:  fe1e 96cc 9d0c bdb7 95f1 c1aa b63f a2cc
    0x00e0:  e24b 7c17 0303 03bb 9b0f 5972 e070 d24d
    0x00f0:  fcca a4ad 0039 8916 9090 2664 fab5 a0b7
    0x0100:  e06b cc54 cbf3 14ee 4bba 380f a9a5 61cf
    0x0110:  411c 08d9 f860 fc8a 9407 cddd 240d 17a9
    0x0120:  fc5e 76fb 4d1e 0734 1d02 e003 c18f c4ac
    0x0130:  59f5 bae1 dceb 82a5 de2f e9c2 47d0 440d
    0x0140:  94bd 658d a1a7 7d52 4d33 a151 91e6 e83b
    0x0150:  9ff7 2a10 a2da 4ad8 9a65 117c 860d 4dfe
    0x0160:  1ceb 7807 9660 bbb9 b61c b9e1 c183 401d
    0x0170:  c5d2 0a8c c294 092f ebbe 5416 7e45 958f
    0x0180:  c0dd 0f32 51fb 3dc3 a8a6 daaf a6d5 81ef
    0x0190:  d0b1 ed1f 0296 bda3 77cd 58de 2161 053a
    0x01a0:  fc02 9ea2 19e2 396a e300 f1d3 b96b c62e
    0x01b0:  59bb 45cf 6d43 48a1 b139 96c8 64b5 0999
    0x01c0:  43ef ea3d c5d1 b784 88d7 4622 7be0 0a6d
    0x01d0:  0854 308a 2e54 4f6b 549d df3e 9562 790f
    0x01e0:  e326 bf75 05e7 4beb 3d18 61b8 6660 72af
    0x01f0:  705e dc26 4493 7418 acaf 0d60 4bee 8d98
    0x0200:  2b9c bfd5 ed83 1264 adac b197 7375 f06a
    0x0210:  f1ff 1ae3 b18d c6a8 8e32 3ddf b826 c176
    0x0220:  b910 7561 b5cd 1c7a dd22 5c19 9a41 b4e1
    0x0230:  c11e 3f2c 3fc3 6dbe b06e c837 a94e 49a1
    0x0240:  b218 68b9 e4b1 6f4a a2b5 3739 ed40 e58a
    0x0250:  7749 856a 1ecc 75ad fa9a e414 d66d 76e2
    0x0260:  d981 e8a0 5537 704b be4c 202e 0e44 e628
    0x0270:  f7d1 89ec 1133 2f5b 67ee 45a5 0dbd b45e
    0x0280:  4678 4ff9 b658 3540 b7da ee99 1d9b 5742
    0x0290:  9e8e 266f 68ad 0544 2365 5423 8d44 68af
    0x02a0:  2be1 fbd0 5852 a90e 6e6f 1aff 91fd ed76
    0x02b0:  67ea 5f60 d2ca fcb4 af6b 68b1 8e42 b23a
    0x02c0:  504b 22af d67b 4863 2484 973d 80f7 ee86
    0x02d0:  9a00 3bde c22b 9407 648d 109d da61 33d1
    0x02e0:  80a1 119b 0952 5b0a bd28 ce04 eb40 ac70
    0x02f0:  3d25 7757 6952 37d5 503d 26be f63c f730
    0x0300:  1f45 6935 1025 6458 3c0b df01 4ccf f26c
    0x0310:  4c6b 3fe4 e540 00f2 8bb4 7a8a 9346 01f1
    0x0320:  0ce5 36f6 1a5a 3722 c586 dd5c 63ea c489
    0x0330:  1c40 6b6f 6353 2a6e ba9e 226c c39e a52b
    0x0340:  2ffe a799 f2b2 c464 6fd6 4917 26d5 d9be
    0x0350:  7243 53df c2e9 dfe9 5a54 e02c 53ad e962
    0x0360:  5553 e097 212e df76 1c56 73ca ff9e da33
    0x0370:  b752 0a61 4bcc e340 f8b6 9669 ad5d 6f40
    0x0380:  a2dc ce95 906a a62d 2708 3371 577c 8156
    0x0390:  18c7 e68b f921 b1cd 360e 7ae9 7560 0a6e
    0x03a0:  b2ca 5330 a7bc 8730 5efd 04f2 ae64 0ac5
    0x03b0:  4eeb e2fe de08 8b43 1778 fa4d 7609 b966
    0x03c0:  0dad cdcb 649d 92f7 110a 33c9 81ff f547
    0x03d0:  0f4d d9e8 6ac4 344c 13ea e9ed f692 fd9b
    0x03e0:  f10f 6c44 b3db eac5 94ef baa3 56d6 3599
    0x03f0:  1e53 3713 447c 5049 f9fc 8bf9 0d8a 7a30
    0x0400:  8733 874e 58c5 cb3b 1f60 83f2 5212 1e60
    0x0410:  ac6f 2306 3e36 8048 1ab2 e2ae c3e2 7194
    0x0420:  448c 6848 d74a a21e 4503 2508 f34e c972
    0x0430:  efa9 790b c168 4259 4fc7 4792 cb28 644f
    0x0440:  0c04 5c4c e485 a95c b171 1390 0ba8 3d18
    0x0450:  9674 89c4 7e25 28b6 21be 84e4 76fd a6b9
    0x0460:  80f6 e409 1475 837b 5dd8 d42d 1fb7 a804
    0x0470:  b981 957c dc42 b15d b7ef 3648 47de 3942
    0x0480:  692d 6574 0071 01d0 629d 7249 dfc9 76ce
    0x0490:  e62d 6e60 dd5d e9f6 b5b5 f5f0 e944 e41b
    0x04a0:  9214 3c17 0303 0119 957b e8a4 8a2a c9b5
    0x04b0:  e1f1 323c c59b cae2 1859 13f1 4024 14f2
    0x04c0:  2676 8a0e a62e 36d0 1bd7 2788 7ed9 49ac
    0x04d0:  365b abc3 cbd0 a66c ca37 8835 b34e 658f
    0x04e0:  bcd6 14be 3d9d 0c10 7e8a 3fe2 7b52 7eed
    0x04f0:  1b38 7975 bf4c ef3f 0a07 514a 9d53 1d6b
    0x0500:  d321 40b2 9f1d 3a3d 9480 28c8 54c4 1c37
    0x0510:  0cc0 3ceb d8b1 6284 54e0 6b8b 66ef 7f8d
    0x0520:  3874 434f 120d b06b 23af 6eae 6af2 43d7
    0x0530:  6a1a b040 b163 45a3 7fd1 52c5 1339 f20e
    0x0540:  d4d6 2542 4a73 770a dfb6 2a92 1284 adbf
    0x0550:  4200 433e 97a8 d611 9bb3 0eec 333f 0c0a
    0x0560:  573c 6916 d8b5 96a8 385e 3a5f 6d14 e99c
    0x0570:  84ca e008 140b 8ff6 ace1 014a 56db 6410
    0x0580:  9273 1db7 904d f443 4c7d dbe3 8426 6295
    0x0590:  db59 f689 9468 060b ee3b 3ec7 68f9 6a6d
    0x05a0:  249c eb6b 68d6 f4b0 0bd3 6120 e0d7 377b
    0x05b0:  3266 f7e5 f69c 28d4 f862 ed7f 6369 8aba
    0x05c0:  6817 0303 0045 e327 deda b3b0 ada4 89d4
    0x05d0:  fd77 06f4 d181 155d 3792 918d 1b9d 303e
    0x05e0:  6c67 b791 e0d9 dd83 b267 0fb4 4995 7a45
    0x05f0:  9e2a 6c17 bae8 41d6 b603 e94a eaa7 3e93
    0x0600:  d467 09b3 e862 6b33 46c5 18
09:38:41.574831 IP (tos 0x0, ttl 64, id 41667, offset 0, flags [DF], proto TCP (6), length 52)
    10.210.25.49.40066 > 10.215.30.182.448: Flags [.], cksum 0x4db6 (incorrect -> 0xc2ed), seq 518, ack 1482, win 124, options [nop,nop,TS val 2278140540 ecr 117023382], length 0
    0x0000:  0000 5e00 0101 a088 c286 3ebc 0800 4500
    0x0010:  0034 a2c3 4000 4006 4a71 0ad2 1931 0ad7
    0x0020:  1eb6 9c82 01c0 2466 7848 623b ecc1 8010
    0x0030:  007c 4db6 0000 0101 080a 87c9 aa7c 06f9
    0x0040:  a296
09:38:41.576420 IP (tos 0x0, ttl 64, id 41668, offset 0, flags [DF], proto TCP (6), length 132)
    10.210.25.49.40066 > 10.215.30.182.448: Flags [P.], cksum 0x4e06 (incorrect -> 0xf161), seq 518:598, ack 1482, win 126, options [nop,nop,TS val 2278140542 ecr 117023382], length 80
    0x0000:  0000 5e00 0101 a088 c286 3ebc 0800 4500
    0x0010:  0084 a2c4 4000 4006 4a20 0ad2 1931 0ad7
    0x0020:  1eb6 9c82 01c0 2466 7848 623b ecc1 8018
    0x0030:  007e 4e06 0000 0101 080a 87c9 aa7e 06f9
    0x0040:  a296 1403 0300 0101 1703 0300 457f ad6a
    0x0050:  81c0 44a1 e655 95aa 0564 4263 2ab5 6ca5
    0x0060:  3491 5713 7517 bbf3 0b3b deeb 1b18 f224
    0x0070:  2c4b 1634 67ae 3541 50bb f433 437b d959
    0x0080:  7448 e163 59da 61af d313 0f07 10df 98f7
    0x0090:  f84b
09:38:41.576741 IP (tos 0x0, ttl 64, id 41669, offset 0, flags [DF], proto TCP (6), length 321)
    10.210.25.49.40066 > 10.215.30.182.448: Flags [P.], cksum 0x4ec3 (incorrect -> 0x065b), seq 598:867, ack 1482, win 126, options [nop,nop,TS val 2278140542 ecr 117023382], length 269
    0x0000:  0000 5e00 0101 a088 c286 3ebc 0800 4500
    0x0010:  0141 a2c5 4000 4006 4962 0ad2 1931 0ad7
    0x0020:  1eb6 9c82 01c0 2466 7898 623b ecc1 8018
    0x0030:  007e 4ec3 0000 0101 080a 87c9 aa7e 06f9
    0x0040:  a296 1703 0301 084a 8452 59be e031 cef3
    0x0050:  a8e6 e3e5 2db1 4bdb f13f 3f26 9928 617e
    0x0060:  c048 21dc d25d 22d6 7d11 8132 7bec 08d5
    0x0070:  a436 18fe b2d0 d6ba 7ca8 b459 ab49 cce0
    0x0080:  0bcd 2e61 ab4f af52 a5f7 b212 b0aa c764
    0x0090:  f018 9eea 574a bd8c 4c19 6bd4 9343 5b0b
    0x00a0:  be82 b084 a178 49c9 d3bf 5e61 813e 9b2a
    0x00b0:  40c5 8003 e689 c225 ebe4 ea96 8378 43d9
    0x00c0:  c383 3aee 32a7 8e8e 9bef 6c3e 548c 75e8
    0x00d0:  c069 1a1a 84af 24f5 678b ba56 9678 af89
    0x00e0:  d316 d4c8 8a90 5c36 f350 7544 9187 3aa2
    0x00f0:  fe76 cebd 8b33 c1f7 e95e 8670 d637 989f
    0x0100:  b9fb 60ed 6dd7 5b28 a3dc 27f3 dc64 86c5
    0x0110:  4ffd 0948 dedc b843 c877 b0ab 1f66 0594
    0x0120:  6c82 f385 71e2 63e0 3894 c928 8e12 70e8
    0x0130:  c2a2 1677 e6ac fb31 d3c4 c47a 89ac 142f
    0x0140:  d372 f188 e552 a4a7 708a ebd7 58bd 29
09:38:41.576777 IP (tos 0x0, ttl 64, id 41670, offset 0, flags [DF], proto TCP (6), length 607)
    10.210.25.49.40066 > 10.215.30.182.448: Flags [P.], cksum 0x4fe1 (incorrect -> 0x9561), seq 867:1422, ack 1482, win 126, options [nop,nop,TS val 2278140542 ecr 117023382], length 555
    0x0000:  0000 5e00 0101 a088 c286 3ebc 0800 4500
    0x0010:  025f a2c6 4000 4006 4843 0ad2 1931 0ad7
    0x0020:  1eb6 9c82 01c0 2466 79a5 623b ecc1 8018
    0x0030:  007e 4fe1 0000 0101 080a 87c9 aa7e 06f9
    0x0040:  a296 1703 0302 2674 dd1b ffd6 4542 4e8f
    0x0050:  75ec dd9c d513 2b29 c07b 790e 3495 f159
    0x0060:  b312 9420 6544 8f82 8b84 eeca 7a7f 9637
    0x0070:  617d 84c4 2574 89d2 a115 8031 a95a 3da1
    0x0080:  6792 ef6e f142 907d 49f8 be81 f610 8db8
    0x0090:  5c7e 8885 840c 85fb 6185 0b47 a02e 76b4
    0x00a0:  c5ca 9a79 7716 2413 7c81 f109 e93e a24e
    0x00b0:  524f 09fe 3de0 0bb6 a888 e09e f1c1 5dc9
    0x00c0:  e08a d03e c8a2 5c96 1487 8e61 bdc9 fd36
    0x00d0:  0742 7d77 d3d7 c0d3 1917 3462 707d 861b
    0x00e0:  8440 396d 420d aef1 f9c7 09ab fa26 d672
    0x00f0:  c7a7 ac60 2f77 a5b0 2afb 78c6 5ed3 7f33
    0x0100:  b443 a685 a5d4 655a d532 5353 f91e b812
    0x0110:  4cfd 80ca 0607 fe59 6f1d 7ceb 975f cf40
    0x0120:  10f6 f23e 562a a006 4283 3cbf 8581 eb37
    0x0130:  e61b 6a24 863b 52c9 742d 7aa6 8599 784e
    0x0140:  3480 d766 79a1 5b99 240a 3b43 6cb7 f15b
    0x0150:  93ff bf2c 39b0 00db c16f 613a 2f18 7fd9
    0x0160:  fdc8 af2f 3a10 e157 aa0b 9ddc 37e4 aca0
    0x0170:  0ae4 0ef9 a24a 6bbe f447 e7ce 955e 7179
    0x0180:  e068 2225 a57b 2e65 2b83 884c d82c 25a6
    0x0190:  9077 5698 ac2e 5ae6 277f 1536 e9d8 162d
    0x01a0:  16e0 1bd0 0984 ca44 c250 3bc2 38ce a6b0
    0x01b0:  9a52 31a0 4750 db36 f76f 99ea 2679 2a17
    0x01c0:  85ef 420f 7139 d0f0 c737 6526 b3be 4657
    0x01d0:  fe3a 685d b2d5 cc89 32fa 1788 41c5 0ee3
    0x01e0:  33fe b0fa baaf 3051 58bd a478 bd6e aa79
    0x01f0:  8155 609e 41f0 29c6 f6d0 a554 3e21 8a76
    0x0200:  c0ce 606b f0ea 738c 5545 db27 4c5e f688
    0x0210:  9a80 3a84 166d 1a92 0128 fb53 05d0 c9d0
    0x0220:  0bf4 0c7c 5de1 facb b103 d934 0cad 8c96
    0x0230:  efd0 8ede 27f5 7973 55a4 0afb 5bb6 b2e5
    0x0240:  1c77 2797 238e bd49 2c06 fb24 a107 875e
    0x0250:  0dee c7b5 1a93 91f1 b3e8 8894 8f92 7d7a
    0x0260:  8dd3 e818 7847 25bb 1566 4b27 31
09:38:41.576874 IP (tos 0x0, ttl 62, id 30808, offset 0, flags [DF], proto TCP (6), length 307)
    10.215.30.182.448 > 10.210.25.49.40066: Flags [P.], cksum 0x911d (correct), seq 1482:1737, ack 598, win 8, options [nop,nop,TS val 117023384 ecr 2278140542], length 255
    0x0000:  a088 c286 3ebc 8851 fb34 ad00 0800 4500
    0x0010:  0133 7858 4000 3e06 75dd 0ad7 1eb6 0ad2
    0x0020:  1931 01c0 9c82 623b ecc1 2466 7898 8018
    0x0030:  0008 911d 0000 0101 080a 06f9 a298 87c9
    0x0040:  aa7e 1703 0300 fa0c 115a b27f 0d95 df6a
    0x0050:  5267 9064 c878 31e6 3904 37ad 1fcb 508a
    0x0060:  9129 c060 c601 4259 9a4b 1845 3b07 dd0a
    0x0070:  9537 b26e 94d6 8192 8c49 20f9 6720 9ed5
    0x0080:  09a3 ce10 1ab4 0db0 db55 b7cc 2dd4 e759
    0x0090:  2070 e900 7d9a 93db 1a41 337c 8391 6c60
    0x00a0:  3a52 f8c1 e860 7ebc 3d55 93fe 0e61 64e8
    0x00b0:  c6d9 86db 2b27 2aa1 5ec7 387f 6747 37c9
    0x00c0:  e29a e529 4f4d b8ae 80ee 4248 ada6 428f
    0x00d0:  84f6 27f4 c2ad e54f f1e6 fee1 4a0a 7af1
    0x00e0:  df19 bbb8 a970 c88f 0db0 f3d5 8d31 7de5
    0x00f0:  e7a3 805b 28cd ace6 6043 439f 0d6d c652
    0x0100:  4442 c956 694c 8589 4d0b 3322 4457 e7da
    0x0110:  95fa 6376 effe 19e2 1128 0f18 38c6 61d9
    0x0120:  da36 d5d3 8227 0c4e 52f2 adde fd5e 19ad
    0x0130:  95fe c14a 6c0f cedb 2dc8 0e0f 08a9 c1a7
    0x0140:  0c
09:38:41.576940 IP (tos 0x0, ttl 62, id 30809, offset 0, flags [DF], proto TCP (6), length 52)
    10.215.30.182.448 > 10.210.25.49.40066: Flags [.], cksum 0xbed6 (correct), seq 1737, ack 1422, win 8, options [nop,nop,TS val 117023384 ecr 2278140542], length 0
    0x0000:  a088 c286 3ebc 8851 fb34 ad00 0800 4500
    0x0010:  0034 7859 4000 3e06 76db 0ad7 1eb6 0ad2
    0x0020:  1931 01c0 9c82 623b edc0 2466 7bd0 8010
    0x0030:  0008 bed6 0000 0101 080a 06f9 a298 87c9
    0x0040:  aa7e
09:38:41.578115 IP (tos 0x0, ttl 62, id 30810, offset 0, flags [DF], proto TCP (6), length 569)
    10.215.30.182.448 > 10.210.25.49.40066: Flags [FP.], cksum 0x9ad7 (correct), seq 1737:2254, ack 1422, win 8, options [nop,nop,TS val 117023385 ecr 2278140542], length 517
    0x0000:  a088 c286 3ebc 8851 fb34 ad00 0800 4500
    0x0010:  0239 785a 4000 3e06 74d5 0ad7 1eb6 0ad2
    0x0020:  1931 01c0 9c82 623b edc0 2466 7bd0 8019
    0x0030:  0008 9ad7 0000 0101 080a 06f9 a299 87c9
    0x0040:  aa7e 1703 0300 fa30 02f7 a3a9 cfa0 ac64
    0x0050:  a160 5020 759e 53ae bb8f f584 a5fb 6b2a
    0x0060:  148c 2747 127e 8127 73fb 631b cb2b 165e
    0x0070:  f889 92b9 55a3 2e5a ba05 6e56 23ac db93
    0x0080:  4578 cfe6 eefd 2233 07a8 35fe ffd8 6b5d
    0x0090:  ce8b fabd 6228 845f 2ba2 4416 c936 ca82
    0x00a0:  de8a 1460 fad1 e555 a64f 1457 3c78 830f
    0x00b0:  93b0 7719 d64d f1d7 1846 bccc 13e1 0dd9
    0x00c0:  3884 d8e1 8bf2 fa1c 6f4b a70f 1c33 0b5f
    0x00d0:  8056 6ad8 23e6 a4ee 571a 5d73 66d0 91b6
    0x00e0:  a157 be41 17e5 4bda 107d c9dc 30e9 d289
    0x00f0:  2e72 4f83 1d74 90d8 e34d 4739 a6ab 5a03
    0x0100:  8e90 243d 06f4 9cae d312 9a78 771c 3f75
    0x0110:  2d80 40fb 5c86 cf96 18bb c394 5d48 5880
    0x0120:  bfc5 d2db bbdd e724 45a9 ad82 b28d 28bc
    0x0130:  e31c e878 0d66 28ec e654 27a7 54a1 f335
    0x0140:  a817 0303 00d0 c1ad 9818 8efb 50d5 3beb
    0x0150:  ba53 b3b4 635b 760a fa86 4f2a 3bf8 d0ce
    0x0160:  197e fe6d a017 dafd 9db5 adc5 a95e c5d3
    0x0170:  7fe7 f5c1 50d5 88a6 3ced bbff 0f2e 087a
    0x0180:  196e c945 a04a 3151 15b3 494f b837 543b
    0x0190:  85ed 4990 044c 6daa 7765 6d76 e602 c745
    0x01a0:  de67 e507 f584 131c 963f 5b0c f9b5 8860
    0x01b0:  3fef cc06 5215 fe0a a06d 3976 17fa 4203
    0x01c0:  9941 f031 325a dd5b 5f1f 3658 dcab 4c55
    0x01d0:  5f13 dfac 309d b5fb 5283 50c7 f397 ccf6
    0x01e0:  f2cd ec2c 5a0d a58c 733a 882d 9f9b 226f
    0x01f0:  496d 8761 f473 6e92 062d 176c d9cf 1a2b
    0x0200:  8351 49bd 44a7 ef2f 31d4 d25c a841 1f76
    0x0210:  378f bd92 8fa4 1703 0300 2c8e 5253 ac60
    0x0220:  90bc 8238 04cf d7eb da96 5ccf 68cc e7b3
    0x0230:  b4ac 480f f660 7d8d bbb6 47c2 2c6e a8bf
    0x0240:  1504 4fdc cccb 68
09:38:41.578788 IP (tos 0x0, ttl 64, id 41671, offset 0, flags [DF], proto TCP (6), length 52)
    10.210.25.49.40066 > 10.215.30.182.448: Flags [.], cksum 0x4db6 (incorrect -> 0xbc58), seq 1422, ack 2255, win 126, options [nop,nop,TS val 2278140544 ecr 117023384], length 0
    0x0000:  0000 5e00 0101 a088 c286 3ebc 0800 4500
    0x0010:  0034 a2c7 4000 4006 4a6d 0ad2 1931 0ad7
    0x0020:  1eb6 9c82 01c0 2466 7bd0 623b efc6 8010
    0x0030:  007e 4db6 0000 0101 080a 87c9 aa80 06f9
    0x0040:  a298
09:38:41.578861 IP (tos 0x0, ttl 64, id 41672, offset 0, flags [DF], proto TCP (6), length 52)
    10.210.25.49.40066 > 10.215.30.182.448: Flags [F.], cksum 0x4db6 (incorrect -> 0xbc57), seq 1422, ack 2255, win 126, options [nop,nop,TS val 2278140544 ecr 117023384], length 0
    0x0000:  0000 5e00 0101 a088 c286 3ebc 0800 4500
    0x0010:  0034 a2c8 4000 4006 4a6c 0ad2 1931 0ad7
    0x0020:  1eb6 9c82 01c0 2466 7bd0 623b efc6 8011
    0x0030:  007e 4db6 0000 0101 080a 87c9 aa80 06f9
    0x0040:  a298
09:38:41.579038 IP (tos 0x0, ttl 62, id 0, offset 0, flags [DF], proto TCP (6), length 52)
    10.215.30.182.448 > 10.210.25.49.40066: Flags [.], cksum 0xbccb (correct), seq 2255, ack 1423, win 8, options [nop,nop,TS val 117023386 ecr 2278140544], length 0
    0x0000:  a088 c286 3ebc 8851 fb34 ad00 0800 4500
    0x0010:  0034 0000 4000 3e06 ef34 0ad7 1eb6 0ad2
    0x0020:  1931 01c0 9c82 623b efc6 2466 7bd1 8010
    0x0030:  0008 bccb 0000 0101 080a 06f9 a29a 87c9
    0x0040:  aa80

Results you expected to see

Either the syslog errors should be ignored, or neighbor_advertiser should always succeed.
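If the goal is for neighbor_advertiser to always succeed, one possible precaution is to confirm that the ferret port accepts TCP connections before invoking it. The sketch below is an assumption, not something the test does today: `wait_for_port` is a hypothetical helper, and to reflect the DUT-to-PTF path it would have to run on the DUT.

```python
import socket
import time


def wait_for_port(host, port, attempts=5, delay=1.0, timeout=2.0):
    """Return True once a TCP connection to host:port succeeds, else False."""
    for attempt in range(attempts):
        try:
            # create_connection completes the TCP handshake or raises OSError.
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            # Refused, unreachable, or timed out; pause before the next try.
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

For example, `wait_for_port("10.215.30.182", 448)` would probe the ferret daemon address used in this issue.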

Is it platform specific

generic

Relevant log output

No response

Output of show version

SONiC Software Version: SONiC.202405_RC.19-5356bf349_Internal
SONiC OS Version: 12
Distribution: Debian 12.7
Kernel: 6.1.0-22-2-amd64
Build commit: 5356bf349
Build date: Mon Sep 23 05:58:44 UTC 2024
Built by: sw-r2d2-bot@r-build-sonic-ci03-242

Platform: x86_64-mlnx_msn2700a1-r0
HwSKU: Mellanox-SN2700-A1-D48C8
ASIC: mellanox
ASIC Count: 1
Serial Number: MT2339XZ0GL2
Model Number: MSN2700-CS2R2OS
Hardware Revision: A2
Uptime: 11:25:20 up  8:53,  1 user,  load average: 0.27, 0.59, 0.62
Date: Tue 24 Sep 2024 11:25:20

Attach files (if any)

No response

congh-nvidia commented 2 months ago

@XuChen-MSFT Could you please review this issue? Thanks.

bingwang-ms commented 1 month ago

@vaibhavhd @ryanzhu706 Can you please help take a look?

vaibhavhd commented 1 month ago

My question is: should we just ignore the syslog errors when neighbor_advertiser fails, or should we fix the test or neighbor_advertiser so that it always succeeds?

We should not ignore the errors from neighbor_advertiser that you posted in the description.

Please reach out to Xu to understand why errors were ignored.

congh-nvidia commented 1 month ago

Hi @XuChen-MSFT, could you please explain why the errors were ignored in your PR (https://github.com/sonic-net/sonic-mgmt/pull/7993)?

bingwang-ms commented 1 month ago

@XuChen-MSFT Do we see a similar failure in our tests? From what I saw, it looks like a testbed/network issue to me.

XuChen-MSFT commented 3 weeks ago

@congh-nvidia, previously it failed occasionally, and I didn't find any impact after checking, so I considered it a flaky error and ignored it. There was no special reason for ignoring it.

congh-nvidia commented 2 weeks ago

Hi @XuChen-MSFT, from my understanding we should not ignore the neighbor_advertiser error, because neighbor_advertiser is exactly what this test case is meant to validate. This is the comment in the test:

# checking decap after vxlan set/unset is to make sure that deletion of vxlan
# tunnel and CPA ACLs won't negatively impact ipinip tunnel & decap mechanism
# Hence a new decap config is not applied to the device in this case. This is
# to avoid creating new tables and test ipinip decap with default loaded config

So it looks like we should stop ignoring the errors and investigate why neighbor_advertiser sometimes fails. @vaibhavhd please correct me if I'm wrong.
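As a starting point for that change, the call could retry a few times and then fail loudly instead of swallowing the error. This is only a sketch under stated assumptions: `run_until_success` is a hypothetical helper, `run` stands in for whatever executes the command on the DUT, and it is assumed to return a dict with an "rc" key.

```python
import time


def run_until_success(run, command, attempts=3, delay=1.0):
    """Call run(command) until it reports rc == 0; raise if it never does.

    `run` is a hypothetical callable returning a dict with an "rc" key.
    """
    last = None
    for attempt in range(attempts):
        last = run(command)
        if last.get("rc") == 0:
            return last
        # Pause before retrying, but not after the final attempt.
        if attempt < attempts - 1:
            time.sleep(delay)
    raise RuntimeError(
        "command failed after {} attempts: {!r}".format(attempts, last)
    )
```

With this shape, a persistent failure surfaces as a test error at the call site, while a single transient failure is absorbed by the retry. Any syslog errors from the failed attempts would still need to be handled separately.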

bingwang-ms commented 2 weeks ago

@vaibhavhd Can you please comment?