sonic-net / sonic-mgmt

Configuration management examples for SONiC

[voq] route/test_route_flap fails when grepping for the inband port in the output of 'show ip int' #10811

Closed ysmanman closed 2 months ago

ysmanman commented 10 months ago

Description

We noticed the following failure with T2 202205 sonic-mgmt testing:

"message": "RunAnsibleModuleFail: run module shell failed, Ansible Results => {"changed": true, "cmd": "show ip int | grep -w Ethernet-IB0", "delta": "0:00:01.762650", "end": "2023-11-15 11:50:33.686520", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2023-11-15 11:50:31.923870", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}",
    "messagetext": "duthosts = [<MultiAsicSonicHost cmp314-3>, <MultiAsicSonicHost cmp314-4>, <MultiAsicSonicHost cmp314>]                                                                                                                  
tbinfo = {'comment': 'Tests Arista Arista-7804R3-FM', 'conf-name': 'ardut', 'duts': ['cmp314-3', 'cmp314-4', 'cmp314'], 'duts_map': {'cmp314': 2, 'cmp314-3': 0, 'cmp314-4': 1}, ...}
ptfhost = <tests.common.devices.ptf.PTFHost object at 0x7fea36794850>                                                                                                                                                                       
ptfadapter = <tests.common.plugins.ptfadapter.ptfadapter.PtfTestAdapter testMethod=runTest>
get_function_conpleteness_level = None, announce_default_routes = None                                                
enum_rand_one_per_hwsku_frontend_hostname = 'cmp314-3'                                                                
enum_rand_one_frontend_asic_index = 1                                                                                 

    def test_route_flap(duthosts, tbinfo, ptfhost, ptfadapter,                                                        
                        get_function_conpleteness_level, announce_default_routes,              
                        enum_rand_one_per_hwsku_frontend_hostname, enum_rand_one_frontend_asic_index):
        ptf_ip = tbinfo['ptf_ip']                                                                                     
        common_config = tbinfo['topo']['properties']['configuration_properties'].get('common', {})                    
        nexthop = common_config.get('nhipv4', NHIPV4)                                                                 
        duthost = duthosts[enum_rand_one_per_hwsku_frontend_hostname]                                                                                                                                                                       
        asichost = duthost.asic_instance(enum_rand_one_frontend_asic_index)                                                                                                                                                                 
        # On dual-tor, unicast upstream l3 packet destination mac should be vlan mac                                  
        # After routing, output packet source mac will be replaced with port-channel mac (same as dut_mac)
        # On dual-tor, vlan mac is different with dut_mac. U0/L0 use same vlan mac for AR response                    
        # On single tor, vlan mac (if exists) is same as dut_mac                                        
        dut_mac = duthost.facts['router_mac']                                                                         
        vlan_mac = ""                                                                                                 
        if is_dualtor(tbinfo):                        
            # Just let it crash if missing vlan configs on dual-tor                                                                                                                                                                         
            vlan_cfgs = tbinfo['topo']['properties']['topology']['DUT']['vlan_configs']                               

            if vlan_cfgs and 'default_vlan_config' in vlan_cfgs:                                                                                                                             
                default_vlan_name = vlan_cfgs['default_vlan_config']                                                  
                if default_vlan_name:                                                                                 
                    for vlan in vlan_cfgs[default_vlan_name].values():                                                
                        if 'mac' in vlan and vlan['mac']:                                                             
                            vlan_mac = vlan['mac']                                                                                                                                                                                          
                            break                                                                                     
            pytest_assert(vlan_mac, 'dual-tor without vlan mac !!!')                                                  
        else:                                                                                                                                                                                                                               
           vlan_mac = dut_mac                                                                                                                                                                                                               

        # get dst_prefix_set and aspath                    
        route_prefix_len = get_route_prefix_len(tbinfo, common_config)                                                                                                                       
        routes = namedtuple('routes', ['route', 'aspath'])                                                            
        filtered_iproute_info = get_filtered_iproute_info(duthost, route_prefix_len)                                                                                                         
        if not filtered_iproute_info:                                                                                 
            pytest.skip("Skip this test for current topo.\                                                            
                        At least 1 multipath route coming from ebgp is needed!")                                      

        dst_prefix_set = set()                             
        for route_prefix in filtered_iproute_info:
            # multi-asics can have more than 1 routes in iproute_info[route_prefix], even single-asics have only 1                                                                                                                          
            for route_per_prefix in filtered_iproute_info[route_prefix]:                                          
                out = route_per_prefix.get('path').split(' ')           
                aspath = out[1:]                                                                                      
                entry = routes(route_prefix, ' '.join(aspath))                                                        
                dst_prefix_set.add(entry)                                                                             
        pytest_assert(dst_prefix_set, "dst_prefix_set is empty")                                                                                                                             

>       dev_port, route_to_ping = get_dev_port_and_route(duthost, asichost, dst_prefix_set)                                                                                                  

announce_default_routes = None                                                                
asichost   = <SonicAsic 1>                                         
aspath     = ['64754', '65900']                                    
common_config = {'dut_asn': 65100, 'dut_type': 'SpineRouter', 'max_tor_subnet_number': 32, 'nhipv4': '10.10.246.254', ...}                                                                                                                                                                                                                                                                 
dst_prefix_set = set([routes(route=u'192.171.0.0/25', aspath=u'64603 65900'), routes(route=u'192.171.0.128/25', aspath=u'64603 65900'),...es(route=u'192.171.112.0/25', aspath=u'64603 65900'), routes(route=u'192.171.112.128/25', aspath=u'64603 65900'), ...])             
dut_mac    = '94:8e:d3:5e:8b:d6'                                                                                                                                                                                                                                              
duthost    = <MultiAsicSonicHost cmp314-3>                         
duthosts   = [<MultiAsicSonicHost cmp314-3>, <MultiAsicSonicHost cmp314-4>, <MultiAsicSonicHost cmp314>]                                                                                     
entry      = routes(route=u'193.66.195.128/25', aspath=u'64754 65900')                                                                 
enum_rand_one_frontend_asic_index = 1                                                                                                  
enum_rand_one_per_hwsku_frontend_hostname = 'cmp314-3'                        
filtered_iproute_info = {'192.171.0.0/25': [{'multipath': True, 'network': '192.171.0.0/25', 'nexthops': [{'afi': 'ipv4', 'ip': '10.0.0.5', 'u...twork': '192.171.1.128/25', 'nexthops': [{'afi': 'ipv4', 'ip': '10.0.0.1', 'used': True}], 'origin': 'IGP', ...}], ...}                                                                                                                   
get_function_conpleteness_level = None                                                                                                                                                                                                                                                                                     
nexthop    = '10.10.246.254'                                                  
out        = ['65200', '64754', '65900']                                      
ptf_ip     = '10.244.160.16'                                                  
ptfadapter = <tests.common.plugins.ptfadapter.ptfadapter.PtfTestAdapter testMethod=runTest>                                                                                                  
ptfhost    = <tests.common.devices.ptf.PTFHost object at 0x7fea36794850>                                                                                     
route_per_prefix = {'bestpath': True, 'network': '193.66.195.128/25', 'nexthops': [{'afi': 'ipv4', 'ip': '10.0.0.11', 'used': True}], 'origin': 'IGP', ...}                                                                                                                                                                                                                                
route_prefix = '193.66.195.128/25'                                                                                                                                                                                                                                                                                         
route_prefix_len = 25                                                         
routes     = <class 'tests.route.test_route_flap.routes'>                     
tbinfo     = {'comment': 'Tests Arista Arista-7804R3-FM', 'conf-name': 'ardut', 'duts': ['cmp314-3', 'cmp314-4', 'cmp314'], 'duts_map': {'cmp314': 2, 'cmp314-3': 0, 'cmp314-4': 1}, ...}                                                                                                                                                                                                  
vlan_mac   = '94:8e:d3:5e:8b:d6'                                                              

route/test_route_flap.py:316:                                                                 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _                                                                                                              
route/test_route_flap.py:247: in get_dev_port_and_route                                       
    neigh = duthost.shell("show ip int | grep -w {}".format(port))['stdout']                                                                                                                 
common/devices/multi_asic.py:122: in _run_on_asics                                            
    return getattr(self.sonichost, self.multi_asic_attr)(*module_args, **complex_args)                                                                                                       
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _                                                                                                              

self = <SonicHost cmp314-3>                                                                   
module_args = ('show ip int | grep -w Ethernet-IB0',), complex_args = {}                                                                                                                     
previous_frame = <frame object at 0x7fea30001710>                                             
filename = '/data/tests/common/devices/multi_asic.py', line_number = 122                                                                                                                     
function_name = '_run_on_asics'                                                               
lines = ['            return getattr(self.sonichost, self.multi_asic_attr)(*module_args, **complex_args)\n']                                                                                 
index = 0, verbose = True, module_ignore_errors = False, module_async = False                                                                                                                

    def _run(self, *module_args, **complex_args):                                                                     

        previous_frame = inspect.currentframe().f_back                                                                
        filename, line_number, function_name, lines, index = inspect.getframeinfo(previous_frame)

        verbose = complex_args.pop('verbose', True)                                                                   

        if verbose:                                                                                                   
            logging.debug("{}::{}#{}: [{}] AnsibleModule::{}, args={}, kwargs={}"\
                .format(filename, function_name, line_number, self.hostname,
                        self.module_name, json.dumps(module_args), json.dumps(complex_args)))
        else:                                                                                                         
            logging.debug("{}::{}#{}: [{}] AnsibleModule::{} executing..."\
                .format(filename, function_name, line_number, self.hostname, self.module_name))

        module_ignore_errors = complex_args.pop('module_ignore_errors', False)
        module_async = complex_args.pop('module_async', False)

        if module_async:                                                                                              
            def run_module(module_args, complex_args):                                                                
                return self.module(*module_args, **complex_args)[self.hostname]
            pool = ThreadPool()                                                                                       
            result = pool.apply_async(run_module, (module_args, complex_args))
            return pool, result                                                                                       

        res = self.module(*module_args, **complex_args)[self.hostname]                                                                                       

        if verbose:                                                                                                                                          
            logging.debug("{}::{}#{}: [{}] AnsibleModule::{} Result => {}"\                                                                                  
                .format(filename, function_name, line_number, self.hostname, self.module_name, json.dumps(res)))                                             
        else:                                                                                                                                                
            logging.debug("{}::{}#{}: [{}] AnsibleModule::{} done, is_failed={}, rc={}"\                                                                     
                .format(filename, function_name, line_number, self.hostname, self.module_name, \                                                             
                        res.is_failed, res.get('rc', None)))                                                                                                 

        if (res.is_failed or 'exception' in res) and not module_ignore_errors:                                                                               
>           raise RunAnsibleModuleFail("run module {} failed".format(self.module_name), res)                                                                 
E           RunAnsibleModuleFail: run module shell failed, Ansible Results =>                                                                                
E           {"changed": true, "cmd": "show ip int | grep -w Ethernet-IB0", "delta": "0:00:01.762650", "end": "2023-11-15 11:50:33.686520", "failed": true, "msg": "non-zero return code", "rc": 1, "start": "2023-11-15 11:50:31.923870", "stderr": "", "stderr_lines": [], "stdout": "", "stdout_lines": []}

complex_args = {}                                                                                                                                            
filename   = '/data/tests/common/devices/multi_asic.py'                                                                                                      
function_name = '_run_on_asics'                                                                                                                              
index      = 0                                                                                                                                               
line_number = 122                                                                                                                                                                                                                           
lines      = ['            return getattr(self.sonichost, self.multi_asic_attr)(*module_args, **complex_args)\n']                                                                                                                           
module_args = ('show ip int | grep -w Ethernet-IB0',)                                                                                                                                                                                       
module_async = False                                                                                                                                                                                                                        
module_ignore_errors = False                                                                                                                                                                                                                
previous_frame = <frame object at 0x7fea30001710>                                                                                                                                                                                           
res        = {'stderr_lines': [], u'changed': True, u'end': u'2023-11-15 11:50:33.686520', ...: [], u'start': u'2023-11-15 11:50:31.923870', u'msg': u'non-zero return code'}
self       = <SonicHost cmp314-3>                                                                                                                                                                                                           
verbose    = True                                                                                                                                                              
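
The `rc: 1` with empty `stdout` and `stderr` in the result above is what `grep` reports when no line matches, and the traceback shows `_run` raising because `module_ignore_errors` defaults to `False`. One possible way to avoid the exception is to run `show ip int` without the pipe and do the whole-word match in Python, so that a missing port yields an empty list. This is only a sketch: the helper name `match_whole_word` and the sample output are hypothetical and are not taken from sonic-mgmt.

```python
import re


def match_whole_word(output, word):
    """Return the lines of `output` that contain `word` as a whole word.

    Approximates `grep -w`: the match must not be adjacent to a letter,
    digit, or underscore on either side.
    """
    pattern = re.compile(r"(?<!\w)" + re.escape(word) + r"(?!\w)")
    return [line for line in output.splitlines() if pattern.search(line)]


# Hypothetical `show ip int` output without Ethernet-IB0, for illustration only.
sample_output = """\
Interface      IPv4 address/mask    Admin/Oper
Ethernet0      10.0.0.0/31          up/up
Ethernet-IB1   3.3.3.1/32           up/up
"""

lines = match_whole_word(sample_output, "Ethernet-IB0")
if not lines:
    # An empty result can be handled by the caller instead of raising.
    print("Ethernet-IB0 not present in the output")
```

With this approach the caller decides whether an absent port is an error, rather than the shell exit status deciding it.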

Steps to reproduce the issue:

  1. Run route/test_route_flap on T2 testbed with 202205 image.

ysmanman commented 10 months ago

Adding @arlakshm @kenneth-arista @wenyiz2021 for visibility.

wenyiz2021 commented 10 months ago

Hi @ysmanman, could you also share what `internal_intfs` is returned by `get_internal_interfaces` in this error case? I would expect Ethernet-IB0 to already be filtered out by: https://github.com/wenyiz2021/wenyi-mgmt/blob/78d34607deca429ce9201ea261007e5342b621ec/tests/common/devices/multi_asic.py#L665

"VOQ_INBAND_INTERFACE": {
        "Ethernet-IB0": {
            "inband_type": "port"
        },
        "Ethernet-IB0|x": {},
        "Ethernet-IB0|x": {}

kenneth-arista commented 10 months ago

Referencing the PR that potentially introduced this issue: https://github.com/sonic-net/sonic-mgmt/pull/10206
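
The `VOQ_INBAND_INTERFACE` keys posted earlier in the thread come in two shapes: a bare port name and `port|x` entries. A rough sketch of collecting the inband port names from such a table, so they can be excluded before a port is looked up in `show ip int`, might look like the following. The function name `inband_ports` and the `candidate_ports` list are hypothetical; this is not the `get_internal_interfaces` implementation, and the `|x` suffix is kept exactly as posted.

```python
def inband_ports(voq_inband_table):
    """Collect unique port names from VOQ_INBAND_INTERFACE-style keys.

    Keys are either "<port>" or "<port>|<suffix>"; only the part before
    the first "|" is treated as the port name.
    """
    return sorted({key.split("|", 1)[0] for key in voq_inband_table})


# Table shaped like the snippet posted in the thread. A Python dict cannot
# hold the duplicate "Ethernet-IB0|x" key, so it appears once here.
table = {
    "Ethernet-IB0": {"inband_type": "port"},
    "Ethernet-IB0|x": {},
}

# Hypothetical list of ports a test might choose from.
candidate_ports = ["Ethernet0", "Ethernet-IB0", "Ethernet8"]
skip = set(inband_ports(table))
front_panel = [port for port in candidate_ports if port not in skip]
print(front_panel)
```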

wenyiz2021 commented 2 months ago

Closing, as the PR has been merged.