sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[TD3] Egress packet drop in test_vlan #6290

Closed bingwang-ms closed 3 years ago

bingwang-ms commented 3 years ago

Description: The test case test_vlan fails consistently on the 7050cx3. I investigated and found that packets are dropped on the egress side. The test case does the following:

  1. Create two new VLANs (100 and 200) for testing and add some interfaces to them:
    +-----------+------------------+-----------------+----------------+-----------------------+-------------+
    |   VLAN ID | IP Address       | Ports           | Port Tagging   | DHCP Helper Address   | Proxy ARP   |
    +===========+==================+=================+================+=======================+=============+
    |       100 | 192.168.100.1/24 | Ethernet4       | tagged         |                       | disabled    |
    |           |                  | Ethernet8       | untagged       |                       |             |
    |           |                  | Ethernet92      | tagged         |                       |             |
    |           |                  | Ethernet96      | untagged       |                       |             |
    |           |                  | PortChannel0001 | untagged       |                       |             |
    |           |                  | PortChannel0003 | tagged         |                       |             |
    +-----------+------------------+-----------------+----------------+-----------------------+-------------+
    |       200 | 192.168.200.1/24 | Ethernet4       | untagged       |                       | disabled    |
    |           |                  | Ethernet8       | tagged         |                       |             |
    |           |                  | Ethernet92      | untagged       |                       |             |
    |           |                  | Ethernet96      | tagged         |                       |             |
    |           |                  | PortChannel0001 | tagged         |                       |             |
    |           |                  | PortChannel0003 | untagged       |                       |             |
    +-----------+------------------+-----------------+----------------+-----------------------+-------------+
  2. Inject broadcast ICMP packets (tagged and untagged) from the PTF into one of the PortChannels (Ethernet112).
  3. Capture traffic on the PTF and check that the packets carry the expected tags.
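
To make the pass criterion in step 3 concrete, here is a minimal sketch of which ports should see the flooded broadcast and with which tagging, based only on the VLAN table above. The function names are hypothetical and not part of the test. The model is simplified: it assumes a tagged frame is admitted when the ingress port is a member of that VLAN in any mode, and an untagged frame is mapped to the VLAN in which the ingress port is an untagged member. The thread does not say which PortChannel Ethernet112 belongs to, so `PortChannel0001` is used purely as an example.

```python
# VLAN membership copied from the table above: {vlan_id: {port: tagging_mode}}
VLAN_MEMBERS = {
    100: {
        "Ethernet4": "tagged",
        "Ethernet8": "untagged",
        "Ethernet92": "tagged",
        "Ethernet96": "untagged",
        "PortChannel0001": "untagged",
        "PortChannel0003": "tagged",
    },
    200: {
        "Ethernet4": "untagged",
        "Ethernet8": "tagged",
        "Ethernet92": "untagged",
        "Ethernet96": "tagged",
        "PortChannel0001": "tagged",
        "PortChannel0003": "untagged",
    },
}


def classify_vlan(ingress_port, vlan_tag=None):
    """Return the VLAN a frame is assigned to on ingress, or None if not admitted.

    Simplified model (an assumption, not taken from the switch configuration).
    """
    if vlan_tag is not None:
        # Tagged frame: admitted if the port is a member of that VLAN at all.
        if ingress_port in VLAN_MEMBERS.get(vlan_tag, {}):
            return vlan_tag
        return None
    # Untagged frame: use the VLAN where the port is an untagged member.
    for vlan_id, members in VLAN_MEMBERS.items():
        if members.get(ingress_port) == "untagged":
            return vlan_id
    return None


def expected_egress(ingress_port, vlan_tag=None):
    """Map each port that should receive the flooded frame to its tagging mode."""
    vlan_id = classify_vlan(ingress_port, vlan_tag)
    if vlan_id is None:
        return {}
    return {
        port: mode
        for port, mode in VLAN_MEMBERS[vlan_id].items()
        if port != ingress_port  # a broadcast is not flooded back to its source
    }


if __name__ == "__main__":
    # Untagged broadcast entering PortChannel0001 (example ingress port).
    print(expected_egress("PortChannel0001"))
    # Broadcast tagged with VLAN 200 entering the same port.
    print(expected_egress("PortChannel0001", vlan_tag=200))
```

Under this model every other member of the VLAN should transmit a copy, so a TX drop on all of them means none of the expected packets can reach the PTF.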

However, I found that the packets are dropped on the egress side of the DUT.

root@str2-7050cx3-acs-01:~# portstat 
Last cached time was 2020-12-24 09:19:25.176804
      IFACE    STATE    RX_OK       RX_BPS    RX_UTIL    RX_ERR    RX_DRP    RX_OVR    TX_OK     TX_BPS    TX_UTIL    TX_ERR    TX_DRP    TX_OVR
-----------  -------  -------  -----------  ---------  --------  --------  --------  -------  ---------  ---------  --------  --------  --------
  Ethernet4        U        0     0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0       100         0
  Ethernet8        U        0     0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0       100         0
 Ethernet92        U        0     0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0       100         0
 Ethernet96        U        0     0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0       100         0
Ethernet112        U      100  1229.90 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0         0         0
Ethernet120        U        0     0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0       100         0

To make this more obvious, I injected 100 packets into Ethernet112. The TX_DRP column shows that all 100 packets were dropped on egress at every VLAN member interface.

Test plan: https://github.com/Azure/sonic-mgmt/blob/662511e2f95cd873e92af9898a1c05aede0d4dab/docs/testplan/VLAN-trunk-test-plan.md
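
For anyone reproducing this, the check on the `portstat` output above can be scripted. The following is a rough sketch, assuming the column layout shown in this issue (fields separated by two or more spaces); `parse_portstat` and `ports_with_tx_drops` are hypothetical helper names, not SONiC utilities.

```python
import re

# Output copied from the portstat run above.
PORTSTAT_OUTPUT = """\
Last cached time was 2020-12-24 09:19:25.176804
      IFACE    STATE    RX_OK       RX_BPS    RX_UTIL    RX_ERR    RX_DRP    RX_OVR    TX_OK     TX_BPS    TX_UTIL    TX_ERR    TX_DRP    TX_OVR
-----------  -------  -------  -----------  ---------  --------  --------  --------  -------  ---------  ---------  --------  --------  --------
  Ethernet4        U        0     0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0       100         0
  Ethernet8        U        0     0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0       100         0
 Ethernet92        U        0     0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0       100         0
 Ethernet96        U        0     0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0       100         0
Ethernet112        U      100  1229.90 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0         0         0
Ethernet120        U        0     0.00 B/s      0.00%         0         0         0        0   0.00 B/s      0.00%         0       100         0
"""


def parse_portstat(text):
    """Parse portstat text into {iface: {column_name: raw_string_value}}."""
    rows = {}
    header = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("-"):
            continue  # skip blank lines and the dashed separator
        # Split on runs of 2+ spaces so "0.00 B/s" stays a single field.
        fields = re.split(r"\s{2,}", stripped)
        if fields[0] == "IFACE":
            header = fields
            continue
        if header is None or len(fields) != len(header):
            continue  # e.g. the "Last cached time" line
        rows[fields[0]] = dict(zip(header[1:], fields[1:]))
    return rows


def to_int(value):
    """Convert a counter string to int, tolerating thousands separators."""
    return int(value.replace(",", ""))


def ports_with_tx_drops(rows):
    """Return {iface: TX_DRP} for every interface with a non-zero TX_DRP."""
    drops = {port: to_int(cols["TX_DRP"]) for port, cols in rows.items()}
    return {port: count for port, count in drops.items() if count > 0}


if __name__ == "__main__":
    stats = parse_portstat(PORTSTAT_OUTPUT)
    print("RX_OK on Ethernet112:", to_int(stats["Ethernet112"]["RX_OK"]))
    for port, count in ports_with_tx_drops(stats).items():
        print(f"{port}: TX_DRP={count}")
```

Comparing RX_OK on the ingress port with TX_DRP on the other ports makes the one-to-one relationship between injected and dropped packets easy to confirm after each run.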

Steps to reproduce the issue:

  1. Run test_vlan

Describe the results you received: The test case failed because packets are dropped.

Describe the results you expected: The test case should pass.

Additional information you deem important (e.g. issue happens only occasionally):

**Output of `show version`:**
SONiC Software Version: SONiC.HEAD.169-c146eeaa
Distribution: Debian 10.7
Kernel: 4.19.0-9-2-amd64
Build commit: c146eeaa
Build date: Tue Dec 22 14:07:33 UTC 2020
Built by: johnar@jenkins-worker-22

Platform: x86_64-arista_7050cx3_32s
HwSKU: Arista-7050CX3-32S-C32
ASIC: broadcom
ASIC Count: 1
Serial Number: JPE20033081
Uptime: 09:45:22 up  6:20,  2 users,  load average: 0.43, 1.83, 2.18

Docker images:
REPOSITORY                    TAG                 IMAGE ID            SIZE
docker-sonic-telemetry        HEAD.169-c146eeaa   d6f316245613        555MB
docker-sonic-telemetry        latest              d6f316245613        555MB
docker-syncd-brcm             HEAD.169-c146eeaa   6cf635032824        726MB
docker-syncd-brcm             latest              6cf635032824        726MB
docker-snmp                   HEAD.169-c146eeaa   e76565656569        525MB
docker-snmp                   latest              e76565656569        525MB
docker-teamd                  HEAD.169-c146eeaa   da0d0d5b3a44        524MB
docker-teamd                  latest              da0d0d5b3a44        524MB
docker-sonic-mgmt-framework   HEAD.169-c146eeaa   55718ad9e66b        641MB
docker-sonic-mgmt-framework   latest              55718ad9e66b        641MB
docker-router-advertiser      HEAD.169-c146eeaa   2558e1230a8d        480MB
docker-router-advertiser      latest              2558e1230a8d        480MB
docker-platform-monitor       HEAD.169-c146eeaa   ddd90e632b2f        605MB
docker-platform-monitor       latest              ddd90e632b2f        605MB
docker-lldp                   HEAD.169-c146eeaa   9d9b758fff1f        520MB
docker-lldp                   latest              9d9b758fff1f        520MB
docker-dhcp-relay             HEAD.169-c146eeaa   ee7e2f914bfa        487MB
docker-dhcp-relay             latest              ee7e2f914bfa        487MB
docker-database               HEAD.169-c146eeaa   cf96413b03e7        480MB
docker-database               latest              cf96413b03e7        480MB
docker-orchagent              HEAD.169-c146eeaa   425d0ad7b0da        554MB
docker-orchagent              latest              425d0ad7b0da        554MB
docker-nat                    HEAD.169-c146eeaa   1ef25954b9f9        526MB
docker-nat                    latest              1ef25954b9f9        526MB
docker-fpm-frr                HEAD.169-c146eeaa   2f134a3a81ce        539MB
docker-fpm-frr                latest              2f134a3a81ce        539MB
docker-sflow                  HEAD.169-c146eeaa   cc3ad45aa34c        523MB
docker-sflow                  latest              cc3ad45aa34c        523MB
**Attach debug file `sudo generate_dump`:**

```
(paste your output here)
```
prsunny commented 3 years ago

Could this be due to an incorrect MMU setting, @lguohan? The configs look correct to me.

daall commented 3 years ago

Can we check the queue counters on the interfaces that are seeing drops?

bingwang-ms commented 3 years ago

The queue counters are:

admin@str2-7050cx3-acs-01:~$ show queue counters Ethernet4
     Port    TxQ    Counter/pkts    Counter/bytes    Drop/pkts    Drop/bytes
---------  -----  --------------  ---------------  -----------  ------------
Ethernet4    UC0             419            78729            0             0
Ethernet4    UC1               1              126            0             0
Ethernet4    UC2               0                0            0             0
Ethernet4    UC3               0                0            0             0
Ethernet4    UC4               0                0            0             0
Ethernet4    UC5               0                0            0             0
Ethernet4    UC6               0                0            0             0
Ethernet4    UC7               0                0            0             0
Ethernet4    UC8               0                0            0             0
Ethernet4    UC9               0                0            0             0
Ethernet4   MC10               0                0           48          3072
Ethernet4   MC11               0                0          260         26992
Ethernet4   MC12               0                0            0             0
Ethernet4   MC13               0                0            0             0
Ethernet4   MC14               0                0            0             0
Ethernet4   MC15               0                0            0             0
Ethernet4   MC16               0                0            0             0
Ethernet4   MC17               0                0            0             0
Ethernet4   MC18               0                0            0             0
Ethernet4   MC19               0                0            0             0

admin@str2-7050cx3-acs-01:~$ show queue counters Ethernet8
     Port    TxQ    Counter/pkts    Counter/bytes    Drop/pkts    Drop/bytes
---------  -----  --------------  ---------------  -----------  ------------
Ethernet8    UC0             436            80350            0             0
Ethernet8    UC1               1              126            0             0
Ethernet8    UC2               0                0            0             0
Ethernet8    UC3               0                0            0             0
Ethernet8    UC4               0                0            0             0
Ethernet8    UC5               0                0            0             0
Ethernet8    UC6               0                0            0             0
Ethernet8    UC7               0                0            0             0
Ethernet8    UC8               0                0            0             0
Ethernet8    UC9               0                0            0             0
Ethernet8   MC10               0                0           54          3456
Ethernet8   MC11               0                0          273         28182
Ethernet8   MC12               0                0            0             0
Ethernet8   MC13               0                0            0             0
Ethernet8   MC14               0                0            0             0
Ethernet8   MC15               0                0            0             0
Ethernet8   MC16               0                0            0             0
Ethernet8   MC17               0                0            0             0
Ethernet8   MC18               0                0            0             0
Ethernet8   MC19               0                0            0             0

admin@str2-7050cx3-acs-01:~$ show queue counters Ethernet68
      Port    TxQ    Counter/pkts    Counter/bytes    Drop/pkts    Drop/bytes
----------  -----  --------------  ---------------  -----------  ------------
Ethernet68    UC0             437            81128            0             0
Ethernet68    UC1               1              126            0             0
Ethernet68    UC2               0                0            0             0
Ethernet68    UC3               0                0            0             0
Ethernet68    UC4               0                0            0             0
Ethernet68    UC5               0                0            0             0
Ethernet68    UC6               0                0            0             0
Ethernet68    UC7               0                0            0             0
Ethernet68    UC8               0                0            0             0
Ethernet68    UC9               0                0            0             0
Ethernet68   MC10               0                0           54          3456
Ethernet68   MC11               0                0          273         28182
Ethernet68   MC12               0                0            0             0
Ethernet68   MC13               0                0            0             0
Ethernet68   MC14               0                0            0             0
Ethernet68   MC15               0                0            0             0
Ethernet68   MC16               0                0            0             0
Ethernet68   MC17               0                0            0             0
Ethernet68   MC18               0                0            0             0
Ethernet68   MC19               0                0            0             0

admin@str2-7050cx3-acs-01:~$ show queue counters Ethernet96
      Port    TxQ    Counter/pkts    Counter/bytes    Drop/pkts    Drop/bytes
----------  -----  --------------  ---------------  -----------  ------------
Ethernet96    UC0             436            81038            0             0
Ethernet96    UC1               1              126            0             0
Ethernet96    UC2               0                0            0             0
Ethernet96    UC3               0                0            0             0
Ethernet96    UC4               0                0            0             0
Ethernet96    UC5               0                0            0             0
Ethernet96    UC6               0                0            0             0
Ethernet96    UC7               0                0            0             0
Ethernet96    UC8               0                0            0             0
Ethernet96    UC9               0                0            0             0
Ethernet96   MC10               0                0           54          3456
Ethernet96   MC11               0                0          273         28182
Ethernet96   MC12               0                0            0             0
Ethernet96   MC13               0                0            0             0
Ethernet96   MC14               0                0            0             0
Ethernet96   MC15               0                0            0             0
Ethernet96   MC16               0                0            0             0
Ethernet96   MC17               0                0            0             0
Ethernet96   MC18               0                0            0             0
Ethernet96   MC19               0                0            0             0

admin@str2-7050cx3-acs-01:~$ show queue counters Ethernet120
       Port    TxQ    Counter/pkts    Counter/bytes    Drop/pkts    Drop/bytes
-----------  -----  --------------  ---------------  -----------  ------------
Ethernet120    UC0           11978          2032524            0             0
Ethernet120    UC1               1              126            0             0
Ethernet120    UC2               0                0            0             0
Ethernet120    UC3               0                0            0             0
Ethernet120    UC4               0                0            0             0
Ethernet120    UC5               0                0            0             0
Ethernet120    UC6               0                0            0             0
Ethernet120    UC7               0                0            0             0
Ethernet120    UC8               0                0            0             0
Ethernet120    UC9               0                0            0             0
Ethernet120   MC10               0                0           26          1664
Ethernet120   MC11               0                0          243         25468
Ethernet120   MC12               0                0            0             0
Ethernet120   MC13               0                0            0             0
Ethernet120   MC14               0                0            0             0
Ethernet120   MC15               0                0            0             0
Ethernet120   MC16               0                0            0             0
Ethernet120   MC17               0                0            0             0
Ethernet120   MC18               0                0            0             0
Ethernet120   MC19               0                0            0             0
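
In the counters above, the unicast queues (UC0, UC1) transmit without drops, while every drop lands on the multicast queues MC10 and MC11. A small sketch for summarizing this per queue type is shown below. It assumes the `show queue counters` layout pasted in this thread with purely numeric counter values, and `summarize_queue_drops` is a hypothetical helper name. Only the non-zero Ethernet4 rows are included as sample input.

```python
import re
from collections import defaultdict

# Non-zero rows copied from the Ethernet4 output above.
SAMPLE = """\
     Port    TxQ    Counter/pkts    Counter/bytes    Drop/pkts    Drop/bytes
---------  -----  --------------  ---------------  -----------  ------------
Ethernet4    UC0             419            78729            0             0
Ethernet4    UC1               1              126            0             0
Ethernet4   MC10               0                0           48          3072
Ethernet4   MC11               0                0          260         26992
"""


def summarize_queue_drops(text):
    """Sum transmitted and dropped packets per queue type ("UC" or "MC")."""
    totals = defaultdict(lambda: {"tx_pkts": 0, "drop_pkts": 0})
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # not a counter row
        match = re.fullmatch(r"(UC|MC)\d+", fields[1])
        if not match:
            continue  # header or dashed separator line
        queue_type = match.group(1)
        # Assumes numeric counters; commas are stripped in case they appear.
        totals[queue_type]["tx_pkts"] += int(fields[2].replace(",", ""))
        totals[queue_type]["drop_pkts"] += int(fields[4].replace(",", ""))
    return dict(totals)


if __name__ == "__main__":
    for queue_type, counts in summarize_queue_drops(SAMPLE).items():
        print(queue_type, counts)
```

Running the same summary on each port's full output should show the same split: traffic on UC queues is transmitted, and flooded traffic on MC queues is dropped.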