sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC

[Nokia-BRCM-DNX]: CLI show dropcounter counts - retains the stats after clearing #19861

Open amitpawar12 opened 3 months ago

amitpawar12 commented 3 months ago

Description

Although the device reports that drop counters are NOT supported, I see that the drop counters do increment.

admin@ixre-egl-board71:~$ show dropcounters capabilities
Current device does not support drop counters

After clearing the drop counter stats for a given ASIC, if the stats of another ASIC are then cleared, the stats for the first ASIC come back.

Steps to reproduce the issue:

  1. Find the drop counter stats for a given ASIC. These drops are brought about by oversubscribing lossy priority traffic.
    admin@ixre-egl-board71:~$ date;sudo ip netns exec asic0 show dropcounters counts | grep -E "IFACE|Ethernet0 |Ethernet8 "
    Thu 08 Aug 2024 07:30:36 PM UTC
        IFACE    STATE    RX_ERR     RX_DROPS    TX_ERR    TX_DROPS
    Ethernet0        U         0  13707840988         0           0
    Ethernet8        U         0  13703485502         0           0
  2. Clear the drop counter stats for ASIC0 and check the stats.
    
    admin@ixre-egl-board71:~$ date;sudo ip netns exec asic0 sonic-clear dropcounters
    Thu 08 Aug 2024 07:30:53 PM UTC
    Cleared drop counters

    admin@ixre-egl-board71:~$ date;sudo ip netns exec asic0 show dropcounters counts | grep -E "IFACE|Ethernet0 |Ethernet8 "
    Thu 08 Aug 2024 07:30:56 PM UTC
        IFACE    STATE    RX_ERR     RX_DROPS    TX_ERR    TX_DROPS
    Ethernet0        U         0            0         0           0
    Ethernet8        U         0            0         0           0
  3. Clear the stats for ASIC1.
    admin@ixre-egl-board71:~$ date;sudo ip netns exec asic1 sonic-clear dropcounters
    Thu 08 Aug 2024 07:31:05 PM UTC
    Cleared drop counters
  4. Check the stats for ASIC0.
    admin@ixre-egl-board71:~$ date;sudo ip netns exec asic1 sonic-clear dropcounters
    Thu 08 Aug 2024 07:31:31 PM UTC
    Cleared drop counters

    admin@ixre-egl-board71:~$ date;sudo ip netns exec asic0 show dropcounters counts | grep -E "IFACE|Ethernet0 |Ethernet8 "
    Thu 08 Aug 2024 07:31:59 PM UTC
        IFACE    STATE    RX_ERR     RX_DROPS    TX_ERR    TX_DROPS
    Ethernet0        U         0  13707840988         0           0
    Ethernet8        U         0  13703485502         0           0

  5. The drop counter stats are identical to the values seen before the clear, and no traffic was flowing through the box during this time.
  6. The same happens if the roles of the two ASICs are swapped.

#### Describe the results you received:
On clearing the stats for ASIC0 and then checking them, the stats are reset. However, if we then clear the stats for the other ASIC, the stats on the first ASIC come back.

#### Describe the results you expected:
Clearing the stats on one ASIC should not have any impact on the stats of a different ASIC.

zjswhhh commented 3 months ago

@saksarav-nokia - can you please triage internally first? Thanks!

saksarav-nokia commented 3 months ago

It is the same root cause as https://github.com/sonic-net/sonic-buildimage/issues/19779

This seems to be a very fundamental issue when the user issues sonic-clear and then show on a multi-asic switch. If you issue sonic-clear for asic0, then sonic-clear for asic1, and then do show for asic0 (or sonic-clear for asic1, sonic-clear for asic0, and then show for asic1), you will see this issue. The reason is that the sonic-clear command creates its files in /tmp/cache/dropstat each time you run a clear command, so when you clear asic0 first and then asic1, the second clear overwrites asic0's file (or asic1's file, depending on the order in which you issue sonic-clear). Then when you do show for asic0, it reads the file created for asic1, looks up asic0's port counters (which don't exist in that file), and subtracts the result from the counter value in COUNTERS_DB. So it prints the raw value from COUNTERS_DB. Since this is a very basic issue that applies to all the clear commands on multi-asic, we need to discuss the fix with the SONiC community.
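To make the sequence above concrete, here is a minimal Python sketch of the overwrite. It is not the sonic-utilities code: `COUNTERS_DB` is a plain dict standing in for the real database, the asic1 port names and values are made up, and `clear`, `show`, and `reproduce` are hypothetical helpers that only mimic "snapshot on clear, subtract on show" with a single shared cache file.

```python
import json
import os
import tempfile

# Hypothetical stand-in for COUNTERS_DB: one RX_DROPS table per ASIC namespace.
# The asic0 values come from the report; the asic1 ports/values are invented.
COUNTERS_DB = {
    "asic0": {"Ethernet0": 13707840988, "Ethernet8": 13703485502},
    "asic1": {"Ethernet144": 500, "Ethernet152": 700},
}


def clear(cache_file, namespace):
    """Snapshot the namespace's current counters into the shared cache file."""
    with open(cache_file, "w") as f:
        json.dump(COUNTERS_DB[namespace], f)


def show(cache_file, namespace):
    """Return current counters minus the cached snapshot, per port."""
    baseline = {}
    if os.path.exists(cache_file):
        with open(cache_file) as f:
            baseline = json.load(f)
    # A port missing from the snapshot gets a baseline of 0,
    # so its raw COUNTERS_DB value is reported.
    return {
        port: value - baseline.get(port, 0)
        for port, value in COUNTERS_DB[namespace].items()
    }


def reproduce():
    """Clear asic0, show asic0, clear asic1, show asic0 again."""
    with tempfile.TemporaryDirectory() as cache_dir:
        shared = os.path.join(cache_dir, "dropstat")  # same file for every ASIC
        clear(shared, "asic0")
        after_own_clear = show(shared, "asic0")
        clear(shared, "asic1")  # overwrites asic0's snapshot
        after_other_clear = show(shared, "asic0")
    return after_own_clear, after_other_clear
```

Calling `reproduce()` returns two dicts: the first shows asic0 zeroed right after its own clear, the second shows the original asic0 values again once asic1's clear has replaced the snapshot.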

saksarav-nokia commented 3 months ago

If we add multi-asic support to the sonic-clear command for the pg drop counter, then the cache file can be created with the namespace as a prefix, and the show command can read the corresponding history from the cache.
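A rough Python sketch of that idea, under the same assumptions as any toy model: `COUNTERS_DB` is a dict standing in for the real database, the asic1 ports and values are invented, and the `<namespace>-dropstat` file name is a hypothetical naming scheme, not what sonic-utilities actually uses.

```python
import json
import os
import tempfile

# Hypothetical stand-in for COUNTERS_DB, one counter table per namespace.
COUNTERS_DB = {
    "asic0": {"Ethernet0": 13707840988, "Ethernet8": 13703485502},
    "asic1": {"Ethernet144": 500, "Ethernet152": 700},
}


def cache_path(cache_dir, namespace):
    """Hypothetical naming scheme: one snapshot file per namespace."""
    return os.path.join(cache_dir, f"{namespace}-dropstat")


def clear(cache_dir, namespace):
    """Snapshot this namespace's counters into its own cache file."""
    with open(cache_path(cache_dir, namespace), "w") as f:
        json.dump(COUNTERS_DB[namespace], f)


def show(cache_dir, namespace):
    """Return current counters minus this namespace's own snapshot."""
    path = cache_path(cache_dir, namespace)
    baseline = {}
    if os.path.exists(path):
        with open(path) as f:
            baseline = json.load(f)
    return {
        port: value - baseline.get(port, 0)
        for port, value in COUNTERS_DB[namespace].items()
    }


def clear_both_then_show():
    """Clear asic0, then asic1, then show both namespaces."""
    with tempfile.TemporaryDirectory() as cache_dir:
        clear(cache_dir, "asic0")
        clear(cache_dir, "asic1")  # writes a separate file, asic0's survives
        return show(cache_dir, "asic0"), show(cache_dir, "asic1")
```

With per-namespace files, clearing asic1 no longer touches asic0's snapshot, so both namespaces stay at zero after being cleared in either order.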

vmittal-msft commented 3 months ago

@kenneth-arista can you/team please check this? This is related to multi-asic support for QoS commands.

kenneth-arista commented 3 months ago

Copying my comments from issue 19779:

The solution is to not use ip netns exec before running CLI commands related to "priority-group drop counters", because native multi-ASIC support has recently been added to this family of commands. Instead, use the built-in -n argument. See https://github.com/sonic-net/sonic-utilities/pull/3058 for further details.

The reason is that ip netns exec ... limits the Linux network namespace, which conflicts with the default use of the multi_asic decorator for adding multi-asic support to existing commands. I believe historically folks have been using ip netns exec as a hack to get around old commands that haven't been taught about multi-asic. But we are putting effort into enhancing all QoS commands to natively support multi-asic. Tracking issue: https://github.com/sonic-net/sonic-buildimage/issues/15148.

amitpawar12 commented 3 months ago

@kenneth-arista:

Does the drop-counter CLI support a namespace option or not?

I went ahead and checked whether the drop counter commands have a namespace option. I did not find one. Checked with a 202205 build.

=======================================
-- Clearing the counters
=======================================
admin@ixre-egl-board71:~$ sudo sonic-clear dropcounters -h
Usage: sonic-clear dropcounters [OPTIONS]

  Clear drop counters

Options:
  -h, -?, --help  Show this message and exit.

admin@ixre-egl-board71:~$ sonic-clear dropcounters -h
Usage: sonic-clear dropcounters [OPTIONS]

  Clear drop counters

Options:
  -h, -?, --help  Show this message and exit.

=======================================
-- Check the counters in sudo mode.
=======================================   
admin@ixre-egl-board71:~$ sudo show dropcounters counts -h
Usage: show dropcounters counts [OPTIONS]

  Show drop counts

Options:
  -g, --group TEXT
  -t, --counter_type TEXT
  --verbose                Enable verbose output
  -h, -?, --help           Show this message and exit.
admin@ixre-egl-board71:~$ sudo show dropcounters counts -n asic0
Usage: show dropcounters counts [OPTIONS]
Try "show dropcounters counts -h" for help.

Error: no such option: -n
admin@ixre-egl-board71:~$ sudo show dropcounters -n asic0 counts
Usage: show dropcounters [OPTIONS] COMMAND [ARGS]...
Try "show dropcounters -h" for help.

Error: no such option: -n

=======================================
-- CLI in non-sudo mode.
=======================================
admin@ixre-egl-board71:~$ show dropcounters counts --namespace asic0
Usage: show dropcounters counts [OPTIONS]
Try "show dropcounters counts -h" for help.

Error: no such option: --namespace

Please let me know.

Thanks, -A

kenneth-arista commented 3 months ago

The changes to support multi-asic were merged to master, not 202205. 202205 is virtually frozen and is not accepting any feature changes.

kenneth-arista commented 3 months ago

Please close this issue as I don't have permission to do so.

saksarav-nokia commented 1 week ago

@kenneth-arista, did Arista also add namespace support for the sonic-clear commands? How do you clear the counters? I don't see -n support in the 202405 image.