Open bofish-arista opened 1 day ago
Hi:
f23185d5 (Rajkumar-Marvell 2024-10-07 11:55:21 +0530 2627) SAI_PORT_ATTR_SELECTIVE_COUNTER_LIST,
that attribute was added recently, on 2024-10-07 (see the blame line above)
SAI_PORT_ATTR_SELECTIVE_COUNTER_LIST (on SAI_OBJECT_TYPE_PORT RID oid:0x20100000000) got value oid:0x559ac8beda80 objectTypeQuery returned NULL object type
this attribute is a LIST of counters, and it returned an OID value 0x559ac8beda80 for which the object type query returned NULL. According to saiport.h in the SAI headers, the type should be SAI_OBJECT_TYPE_COUNTER; if the query returns NULL, that's a vendor bug
currently syncd crashes on such an error, since we got an OID but we don't know what type it is, and that is invalid
what should happen when we have newer SAI headers in sairedis/syncd and an older vendor SAI: all unsupported attributes should return "not implemented" or "not supported" instead of success. It seems the vendor is returning success together with an invalid value for this attribute
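The behaviour described above can be modelled with a small standalone sketch. This is not the code in SaiDiscovery.cpp: `Status`, `ObjectType`, `GetOidList`, `TypeQuery` and `discoverAttr` are hypothetical stand-ins, chosen only to show why a "not supported" status is harmless while "success plus an OID of unknown type" is fatal.

```cpp
#include <cstdint>
#include <functional>
#include <stdexcept>
#include <vector>

// Hypothetical stand-ins for SAI status codes and object types.
enum class Status { Success, NotSupported, NotImplemented };
enum class ObjectType { Null, Port, Counter };

using Oid = std::uint64_t;

// Hypothetical vendor callbacks: fetch an OID-list attribute,
// and map an OID to its object type.
using GetOidList = std::function<Status(std::vector<Oid>&)>;
using TypeQuery = std::function<ObjectType(Oid)>;

// Returns the discovered OIDs. Throws if the vendor reports success
// but hands back an OID whose type cannot be resolved.
std::vector<Oid> discoverAttr(const GetOidList& get, const TypeQuery& typeOf)
{
    std::vector<Oid> oids;
    const Status st = get(oids);
    if (st != Status::Success) {
        // Unsupported or unimplemented attribute: nothing to discover.
        return {};
    }
    for (Oid oid : oids) {
        if (typeOf(oid) == ObjectType::Null) {
            throw std::runtime_error("objectTypeQuery returned NULL object type");
        }
    }
    return oids;
}
```

With a getter that returns `Status::NotSupported`, `discoverAttr` yields an empty list and discovery carries on. With a getter that returns `Status::Success` and an OID the type query cannot resolve, it throws, which corresponds to the runtime error reported in this issue.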
error comes from here: https://github.com/sonic-net/sonic-sairedis/blob/master/syncd/SaiDiscovery.cpp#L237
the error could also be caused by DASH extensions (if the vendor supports DASH), since the extensions range changed and that change is not backward compatible: https://github.com/opencomputeproject/SAI/pull/2028
so what I expect is happening: on the older version of the SAI headers, the vendor has a custom/private attribute after SAI_PORT_ATTR_END with the same enum value as SAI_PORT_ATTR_SELECTIVE_COUNTER_LIST. This makes syncd think that SAI_PORT_ATTR_SELECTIVE_COUNTER_LIST is implemented when it is actually a private internal attribute
@bofish-arista can you confirm?
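The suspected enum collision can be illustrated with two heavily shortened, hypothetical attribute enums. The names and values below are made up for the sketch (the real saiport.h has many more entries, and the vendor's private attribute is not known); the point is only that a value placed at the old `..._END` coincides with the first attribute added to newer headers.

```cpp
#include <cstdint>

// Hypothetical, shortened view of the headers the vendor library was built against.
namespace old_headers {
enum PortAttr : std::int32_t {
    PORT_ATTR_SPEED,
    PORT_ATTR_END,                            // first value past the public range
    PORT_ATTR_VENDOR_PRIVATE = PORT_ATTR_END  // hypothetical private attribute
};
}

// Hypothetical, shortened view of the newer headers used by sairedis/syncd.
namespace new_headers {
enum PortAttr : std::int32_t {
    PORT_ATTR_SPEED,
    PORT_ATTR_SELECTIVE_COUNTER_LIST,         // newly added public attribute
    PORT_ATTR_END
};
}

// Both names map to the same integer, so a library built against the old
// headers cannot tell the new public attribute from its private one.
constexpr bool idsCollide()
{
    return static_cast<std::int32_t>(old_headers::PORT_ATTR_VENDOR_PRIVATE) ==
           static_cast<std::int32_t>(new_headers::PORT_ATTR_SELECTIVE_COUNTER_LIST);
}

static_assert(idsCollide(), "private and new public attribute share an id");
```

In this model, a get request for the new attribute reaches the vendor library as its private attribute, so the library can return success with data that was never meant to be a list of counter OIDs.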
Description
sonic-buildimage PR #20540 pulled in the latest sairedis, which in turn brings in SAI changes, including SAI_PORT_ATTR_SELECTIVE_COUNTER_LIST, which does not appear to be supported yet by Broadcom.
As a result of this change, orchagent exits early in the startup process.
Relevant links: https://github.com/sonic-net/sonic-buildimage/pull/20540 https://github.com/sonic-net/sonic-sairedis/pull/1431 https://github.com/opencomputeproject/SAI/pull/1941
Steps to reproduce the issue:
The issue is seen during initialization of a standalone device.
Describe the results you received:
Orchagent exited with a runtime error, logged as shown below:
2024 Nov 4 17:44:07.612509 up500 ERR syncd#syncd: :- run: Runtime error: :- discover: when query SAI_PORT_ATTR_SELECTIVE_COUNTER_LIST (on SAI_OBJECT_TYPE_PORT RID oid:0x100000001) got value oid:0x7ffc4a7b45f0 objectTypeQuery returned NULL object type
Describe the results you expected:
Output of show version:
root@up322:~# show version
SONiC Software Version: SONiC.branch.master-ars.7cd2518e-buildimage.origin.master-nightly-slim-2024.10.31.20.12
SONiC OS Version: 12
Distribution: Debian 12.7
Kernel: 6.1.0-22-2-amd64
Build commit: 7f44814d7
Build date: Fri Nov 1 00:59:18 UTC 2024
Built by: jenkins@jenkins-arsonic-k8s-1-vfqtx
Platform: x86_64-arista_7060_cx32s
HwSKU: Arista-7060CX-32S-C32
ASIC: broadcom
ASIC Count: 1
Serial Number: SGD20254417
Model Number: DCS-7060CX-32S
Hardware Revision: 03.00
Uptime: 06:20:23 up 9 min, 2 users, load average: 0.26, 0.98, 0.75
Date: Sat 02 Nov 2024 06:20:23
Output of show techsupport:
Additional information you deem important (e.g. issue happens only occasionally):