Closed vadymhlushko-mlnx closed 3 years ago
sudo generate_dump
sonic_dump_arc-switch1004_20201124_130354.tar.gz
Hi @zhenggen-xu , @praveen-li - Could you please look into this issue.
Thanks.
@zhenggen-xu @samaity: I recall that aliases are defined in the mlnx platform.json, as displayed. Do we need to do anything here, or does platform.json need the changes?
@vadymhlushko-mlnx Please include the platform.json of the platform, and also specify your expectations.
I am not sure what the issue is and what you expect here.
When you break out the port into 2, you get eth2a and eth2c, which sounds good.
Then you un-break-out the port and expect the alias to be the same as it was before the breakout. But if I understand the provided output correctly, I see eth2a instead of eth2. Vadym Hlushko, can you confirm?
OK, if that is the case, then it works as designed. We only use the aliases defined in platform.json when breaking out to any mode (including the one-port mode in the "un-breakout" case). I guess the question is: where did the original alias eth2 come from?
If we want to support different aliases for different breakout modes, we need to change the platform.json format and the parsing logic for this new requirement.
It is strange that before and after have two different aliases. But I would suggest waiting for Vadym to clarify whether my understanding is correct.
@zhenggen-xu, @liat-grozovik The problem is:
1. Before the breakout, the alias was etp2:
  Interface            Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin    Type    Asym PFC
-----------  ---------------  -------  -----  -----  -------  ------  ------  -------  ------  ----------
  Ethernet4          4,5,6,7     100G   9100    N/A     etp2  routed    down       up     N/A         N/A
2. config interface breakout Ethernet4 2x50G -y
Aliases are etp2a and etp2c. The first question is: why not etp2a and etp2b?
root@arc-switch1004:~# show inter sta
  Interface            Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin    Type    Asym PFC
-----------  ---------------  -------  -----  -----  -------  ------  ------  -------  ------  ----------
  Ethernet4              4,5      50G   9100    N/A    etp2a  routed    down       up     N/A         N/A
  Ethernet6              6,7      50G   9100    N/A    etp2c  routed    down       up     N/A         N/A
3. config interface breakout Ethernet4 1x100G[40G] -y
Alias is etp2a. The second question is: why is the alias not etp2, as it was before the breakout (step 1)?
  Interface            Lanes    Speed    MTU    FEC    Alias    Vlan    Oper    Admin    Type    Asym PFC
-----------  ---------------  -------  -----  -----  -------  ------  ------  -------  ------  ----------
  Ethernet4          4,5,6,7     100G   9100    N/A    etp2a  routed    down       up     N/A         N/A
@vadymhlushko-mlnx As mentioned, I am not sure where you got etp2 from. Was it inherited from an old configuration? If so, this is expected: the old configuration was not compatible with platform.json, and in the DPB design we only use the aliases from platform.json. If we expect different alias names for different breakout modes of the same interface, say Ethernet4, we would need to add this capability to platform.json and the parsing logic.
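One reading that is consistent with the outputs reported above (etp2a and etp2c for 2x50G, etp2a for 1x100G) is that each child port takes the alias listed at the position of its first lane. The sketch below illustrates that reading only. The function name, the etp2 alias list, and the lane-indexed selection rule are assumptions made for illustration, not code taken from portconfig.py.

```python
from typing import List, Tuple


def aliases_for_breakout(alias_at_lanes: str, lanes: str, num_ports: int) -> List[Tuple[str, str]]:
    """Return (alias, lanes) for each child port of a port group.

    Assumption: aliases are indexed by lane position, and each child port
    takes the alias found at the position of its first lane.
    """
    aliases = [a.strip() for a in alias_at_lanes.split(",")]
    lane_list = [lane.strip() for lane in lanes.split(",")]
    if num_ports <= 0 or len(lane_list) % num_ports != 0:
        raise ValueError("lane count must be divisible by the number of ports")
    if len(aliases) < len(lane_list):
        raise ValueError("need at least one alias per lane")
    step = len(lane_list) // num_ports  # lanes per child port
    return [
        (aliases[i * step], ",".join(lane_list[i * step:(i + 1) * step]))
        for i in range(num_ports)
    ]


# Hypothetical platform.json values for the port group behind Ethernet4.
ETP2_ALIASES = "etp2a, etp2b, etp2c, etp2d"
ETP2_LANES = "4,5,6,7"

two_by_50g = aliases_for_breakout(ETP2_ALIASES, ETP2_LANES, 2)
one_by_100g = aliases_for_breakout(ETP2_ALIASES, ETP2_LANES, 1)
print(two_by_50g)   # [('etp2a', '4,5'), ('etp2c', '6,7')]
print(one_by_100g)  # [('etp2a', '4,5,6,7')]
```

Under this assumption, etp2b would only appear in a 4-way breakout, where every lane becomes its own port.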
@praveen-li, @zhenggen-xu, @samaity
As it turned out recently, this alias naming issue is not only a cosmetic problem that you see when you execute the show interfaces status command.
In SONiC we have a flow that generates /etc/sonic/config_db.json from /etc/sonic/minigraph.xml by using the command config load_minigraph.
/etc/sonic/minigraph.xml is generated by the sonic-mgmt script ./testbed-cli.sh gen-mg ${SWITCH}-${TOPOLOGY} lab vault
minigraph.xml has aliases for interfaces without the 'a' postfix, for example etp1, etp2.
Somewhere in the middle of the config load_minigraph logic, before /etc/sonic/config_db.json is applied to the switch, the DPB logic (introduced in PR-3909) changes the aliases for interfaces (e.g. etp1 -> etp1a). After that, /etc/sonic/config_db.json is misconfigured: part of the file has aliases without the a postfix (etp1), and the other part has aliases with the a postfix (etp1a).
After that, show interfaces status has the following output:
root@arc-switch1038:/home/admin# show interfaces status
Traceback (most recent call last):
  File "/usr/local/bin/intfutil", line 521, in <module>
    main()
  File "/usr/local/bin/intfutil", line 513, in main
    interface_stat.display_intf_status()
  File "/usr/local/bin/intfutil", line 354, in display_intf_status
    self.get_intf_status()
  File "/usr/local/lib/python3.7/dist-packages/utilities_common/multi_asic.py", line 137, in wrapped_run_on_all_asics
    func(self, *args, **kwargs)
  File "/usr/local/bin/intfutil", line 435, in get_intf_status
    self.portchannel_speed_dict = po_speed_dict(self.po_int_dict, self.db)
  File "/usr/local/bin/intfutil", line 249, in po_speed_dict
    interface_speed = '{}G'.format(interface_speed[:-3])
TypeError: 'NoneType' object is not subscriptable
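The last frame fails because interface_speed is None at the moment it is sliced. Below is a minimal sketch of that failure mode with one defensive alternative. It assumes the speed is stored as a string in Mb/s (for example "100000") and that looking up a name that is not a port name returns None. The helper name and the sample table are hypothetical, and this is not the actual intfutil code.

```python
from typing import Dict, Optional

# Hypothetical port table keyed by port name; speeds are strings in Mb/s.
PORT_SPEEDS: Dict[str, str] = {"Ethernet0": "100000", "Ethernet4": "50000"}


def format_speed(port_name: str) -> str:
    """Return the speed as e.g. '100G', or 'N/A' when the port is unknown."""
    speed: Optional[str] = PORT_SPEEDS.get(port_name)
    if speed is None:
        # An alias such as 'etp1' is not a key here, so the lookup misses.
        # Slicing None is what raises "'NoneType' object is not subscriptable".
        return "N/A"
    return "{}G".format(speed[:-3])


print(format_speed("Ethernet0"))  # 100G
print(format_speed("etp1"))       # N/A
```

A guard like this would only hide the symptom in the display path; the mismatched names in config_db.json would still be there.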
In APP_DB we have a misconfiguration:
root@arc-switch1038:/home/admin# redis-cli -n 0 KEYS *PORT_TABLE*
1) "_PORT_TABLE:Ethernet52"
2) "_PORT_TABLE:Ethernet56"
3) "PORT_TABLE:Ethernet40"
4) "PORT_TABLE:Ethernet68"
5) "_PORT_TABLE:Ethernet0"
6) "PORT_TABLE:Ethernet84"
7) "PORT_TABLE:Ethernet24"
8) "_PORT_TABLE:Ethernet44"
9) "PORT_TABLE:Ethernet120"
10) "PORT_TABLE:Ethernet96"
11) "_PORT_TABLE:Ethernet40"
12) "PORT_TABLE:etp31"
13) "_PORT_TABLE:Ethernet8"
14) "_PORT_TABLE:Ethernet4"
15) "PORT_TABLE:Ethernet12"
16) "PORT_TABLE:Ethernet20"
17) "PORT_TABLE:Ethernet28"
18) "_PORT_TABLE:Ethernet60"
19) "PORT_TABLE:Ethernet112"
20) "PORT_TABLE:Ethernet16"
21) "PORT_TABLE:etp29"
22) "PORT_TABLE:Ethernet4"
23) "PORT_TABLE:Ethernet76"
24) "PORT_TABLE:Ethernet92"
25) "PORT_TABLE:Ethernet116"
26) "_PORT_TABLE:Ethernet12"
27) "PORT_TABLE:Ethernet104"
28) "PORT_TABLE:Ethernet88"
29) "_PORT_TABLE:Ethernet48"
30) "PORT_TABLE:Ethernet48"
31) "PORT_TABLE:Ethernet108"
32) "PORT_TABLE:Ethernet124"
33) "_PORT_TABLE:Ethernet28"
34) "PORT_TABLE:Ethernet60"
35) "PORT_TABLE:Ethernet52"
36) "PORT_TABLE:Ethernet44"
37) "PORT_TABLE:Ethernet0"
38) "_PORT_TABLE:Ethernet32"
39) "_PORT_TABLE:Ethernet64"
40) "PORT_TABLE:Ethernet32"
41) "PORT_TABLE:Ethernet100"
42) "_PORT_TABLE:Ethernet24"
43) "PORT_TABLE:Ethernet80"
44) "PORT_TABLE:Ethernet56"
45) "PORT_TABLE:PortConfigDone"
46) "PORT_TABLE:Ethernet36"
47) "_PORT_TABLE:Ethernet36"
48) "_PORT_TABLE:Ethernet20"
49) "PORT_TABLE:Ethernet8"
50) "PORT_TABLE:etp30"
51) "PORT_TABLE:etp32"
52) "_PORT_TABLE:Ethernet16"
53) "PORT_TABLE:Ethernet72"
54) "PORT_TABLE_KEY_SET"
55) "PORT_TABLE:Ethernet64"
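The entries that stand out in this listing are the ones keyed by an alias (etp29 to etp32) instead of a port name. Here is a small sketch for picking such keys out of a key list, assuming that valid port names always start with "Ethernet" and that PortConfigDone is a bookkeeping marker; the function name is hypothetical.

```python
from typing import Iterable, List

# Entries that are markers rather than ports (seen in the listing above).
NON_PORT_ENTRIES = {"PortConfigDone"}


def find_alias_keys(keys: Iterable[str]) -> List[str]:
    """Return PORT_TABLE keys whose name part does not look like a port name.

    Assumption: valid port names start with 'Ethernet'.
    """
    suspicious = []
    for key in keys:
        table, sep, name = key.partition(":")
        if not sep or table.lstrip("_") != "PORT_TABLE":
            continue  # e.g. PORT_TABLE_KEY_SET has no ':' separator
        if name in NON_PORT_ENTRIES or name.startswith("Ethernet"):
            continue
        suspicious.append(key)
    return sorted(suspicious)


sample = [
    "_PORT_TABLE:Ethernet52",
    "PORT_TABLE:etp31",
    "PORT_TABLE:Ethernet4",
    "PORT_TABLE:PortConfigDone",
    "PORT_TABLE_KEY_SET",
    "PORT_TABLE:etp29",
]
print(find_alias_keys(sample))  # ['PORT_TABLE:etp29', 'PORT_TABLE:etp31']
```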
root@arc-switch1038:/home/admin# show version
SONiC Software Version: SONiC.SONIC.master.64-46b3bd5_Internal
Distribution: Debian 10.7
Kernel: 4.19.0-9-2-amd64
Build commit: 46b3bd55
Build date: Sun Jan 24 12:16:00 UTC 2021
Built by: sw-r2d2-bot@r-build-sonic-ci02
Platform: x86_64-mlnx_msn3700-r0
HwSKU: ACS-MSN3700
ASIC: mellanox
ASIC Count: 1
Serial Number: MT1851X02961
Uptime: 15:02:33 up 37 min, 1 user, load average: 5.11, 2.69, 1.70
Docker images:
REPOSITORY TAG IMAGE ID SIZE
docker-syncd-mlnx SONIC.master.64-46b3bd5_Internal 095d22775945 539MB
docker-syncd-mlnx latest 095d22775945 539MB
docker-snmp SONIC.master.64-46b3bd5_Internal 09cab211ff77 435MB
docker-snmp latest 09cab211ff77 435MB
docker-sonic-mgmt-framework SONIC.master.64-46b3bd5_Internal f48021f6d2cc 612MB
docker-sonic-mgmt-framework latest f48021f6d2cc 612MB
docker-nat SONIC.master.64-46b3bd5_Internal e2a9d271e4a8 407MB
docker-nat latest e2a9d271e4a8 407MB
docker-router-advertiser SONIC.master.64-46b3bd5_Internal c3ffe156b7cd 394MB
docker-router-advertiser latest c3ffe156b7cd 394MB
docker-platform-monitor SONIC.master.64-46b3bd5_Internal 70bd029ed33a 684MB
docker-platform-monitor latest 70bd029ed33a 684MB
docker-lldp SONIC.master.64-46b3bd5_Internal 9a625f8b36e9 434MB
docker-lldp latest 9a625f8b36e9 434MB
docker-database SONIC.master.64-46b3bd5_Internal d88f98696c3b 394MB
docker-database latest d88f98696c3b 394MB
docker-orchagent SONIC.master.64-46b3bd5_Internal b20616415289 423MB
docker-orchagent latest b20616415289 423MB
docker-macsec SONIC.master.64-46b3bd5_Internal c26332217a3f 408MB
docker-macsec latest c26332217a3f 408MB
docker-dhcp-relay SONIC.master.64-46b3bd5_Internal e4cd8c21addf 401MB
docker-dhcp-relay latest e4cd8c21addf 401MB
docker-sonic-telemetry SONIC.master.64-46b3bd5_Internal cdfcf055317a 469MB
docker-sonic-telemetry latest cdfcf055317a 469MB
docker-teamd SONIC.master.64-46b3bd5_Internal 35c2ae9ebd57 405MB
docker-teamd latest 35c2ae9ebd57 405MB
docker-fpm-frr SONIC.master.64-46b3bd5_Internal 819524ded687 422MB
docker-fpm-frr latest 819524ded687 422MB
docker-sflow SONIC.master.64-46b3bd5_Internal f83e8d6069ab 406MB
docker-sflow latest f83e8d6069ab 406MB
show techsupport
sonic_dump_arc-switch1038_20210126_150103.tar.gz
+Sangita to double-check whether 3909 is adding the suffix 'a'.
I just checked the code. It is not adding any particular suffix to the alias. We only use the aliases from the platform.json in the logic.
Something in the flow that uses the minigraph command to create the config is doing so. I think you should go over the exact flow described in detail and look in the DPB logic for what causes it, as it causes a mismatch and a non-functional system.
@praveen-li, @samaity, @zhenggen-xu
I'm sorry, maybe the previous comment was not clear. Let me clarify the postfix a (e.g. etp1a):
Aliases like etp1a, etp2a are taken from the platform.json file and go into /etc/sonic/config_db.json when the user runs the command config load_minigraph.
For example, if we change the "alias_at_lanes" field for every interface (e.g. Ethernet0, Ethernet8) in the platform.json file:
from
"alias_at_lanes": "etp1a, etp1b, etp1c, etp1d, etp1e, etp1f, etp1g, etp1h",
to
"alias_at_lanes": "etp1, etp1b, etp1c, etp1d, etp1e, etp1f, etp1g, etp1h",
and then run the config load_minigraph command, everything will be okay and the switch will have a correct /etc/sonic/config_db.json.
@vadymhlushko-mlnx Still not sure what the problem is. Do you want etp1 for Ethernet0 all the time? If so, configure it that way in platform.json. Or do you want the alias to be etp1 for Ethernet0 in the 100G case but etp1a for Ethernet0 in the breakout case? If that is the case, the current DPB does not support it; we would have to enhance the design and implement it.
Again, if you have etp1a in platform.json, we will always generate etp1a in config_db.json, so I am not sure where that etp1 came from. If it came from old configurations (generated outside, inherited by upgrade, or whatever), then that configuration was not compatible with the current platform.json, and you have to start with a compatible configuration from the beginning.
etp1 is for 100G Ethernet0; etp1a is for 50G Ethernet0 or 25G Ethernet0.
We need to enhance that.
@vadymhlushko-mlnx If this is the case, for simplicity I would suggest we introduce one more field in platform.json, like "alias_root", which specifies the alias for the root port when we don't break out that port group (i.e. single port), e.g. we use etp1 for Ethernet0. We can then incorporate the changes in the parsing logic to use that alias. Changes need to be made in https://github.com/Azure/sonic-buildimage/blob/master/src/sonic-config-engine/portconfig.py
If we want different aliases for each breakout mode, it would be lengthier, since we would need an alias list for each breakout mode.
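A rough sketch of how the parsing side of this suggestion might look. "alias_root" is the field proposed in this comment and is not an existing platform.json key. The function name, the sample entry, and the lane-indexed fallback are likewise assumptions for illustration, not the portconfig.py implementation.

```python
from typing import Dict, List


def pick_aliases(port_entry: Dict[str, str], num_ports: int) -> List[str]:
    """Choose aliases for a port group, preferring 'alias_root' when not broken out.

    'alias_root' is the proposed (hypothetical) field; when it is absent the
    function falls back to the alias at each child port's first lane.
    """
    aliases = [a.strip() for a in port_entry["alias_at_lanes"].split(",")]
    num_lanes = len(port_entry["lanes"].split(","))
    if num_ports <= 0 or num_lanes % num_ports != 0:
        raise ValueError("lane count must be divisible by the number of ports")
    if num_ports == 1 and port_entry.get("alias_root"):
        return [port_entry["alias_root"]]
    step = num_lanes // num_ports  # lanes per child port
    return [aliases[i * step] for i in range(num_ports)]


# Hypothetical platform.json entry for Ethernet0.
entry = {
    "lanes": "0,1,2,3",
    "alias_at_lanes": "etp1a, etp1b, etp1c, etp1d",
    "alias_root": "etp1",
}
print(pick_aliases(entry, 1))  # ['etp1']
print(pick_aliases(entry, 2))  # ['etp1a', 'etp1c']
```

Keeping the fallback means a platform.json without the new field would behave as it does today.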
An additional alias_root field sounds great. Thanks!
Would you implement this? We will review it.
After the last DPB subgroup meeting, where we looked at the example config_db.json, it was noted that some tables, such as the PORT_CHANNEL table, have members with invalid names. After checking the code, the problem turned out to be in the minigraph to configDB translation flow. Minigraph takes the alias as input, in this case etp1, and tries to translate it to a PORT_NAME in config_db.json. Since platform.json does not have etp1 as an alias, the translation fails, which leaves the alias in config_db.json and causes the backend to fail.
To really support flexible aliases for different vendors and for different breakout modes, I raised a PR for the design change: https://github.com/Azure/SONiC/pull/749. With this new format, we need to change some logic in portconfig.py. @vadymhlushko-mlnx Let me know if you want to go ahead and implement that logic in portconfig.py. Thanks!
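The translation failure described above can be pictured with a small sketch: a lookup that falls back to the input when the alias is unknown leaves the alias in the output. The mapping, the member list, and the function name are hypothetical, and the pass-through behaviour is an assumption based on the description in this comment.

```python
from typing import Dict, List


def translate_members(members: List[str], alias_to_name: Dict[str, str]) -> List[str]:
    """Translate aliases to port names; unknown aliases pass through unchanged."""
    return [alias_to_name.get(member, member) for member in members]


# Hypothetical mapping derived from a platform.json that only knows 'etp1a'.
alias_to_name = {"etp1a": "Ethernet0", "etp2a": "Ethernet4"}

# Hypothetical minigraph-side member list that still uses the unsuffixed alias.
members = ["etp1", "etp2a"]
translated = translate_members(members, alias_to_name)
print(translated)  # ['etp1', 'Ethernet4']

# Anything that is not a known port name after translation is a leftover alias.
untranslated = [m for m in translated if m not in alias_to_name.values()]
print(untranslated)  # ['etp1']
```

A check like the last two lines, run before the configuration is written, would surface the mismatch earlier than the backend failure does.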
@samaity Will work on the changes.
Description Interface has a wrong alias after successful dynamic port breakout command
Steps to reproduce the issue:
show interfaces status
config interface breakout Ethernet4 2x50G
show interfaces status
Describe the results you received:
Describe the results you expected:
Additional information you deem important (e.g. issue happens only occasionally):