sonic-net / sonic-buildimage

Scripts which perform an installable binary image build for SONiC
Other
741 stars 1.44k forks source link

RPC builds are broken #20322

Open jon-nokia opened 2 months ago

jon-nokia commented 2 months ago

Description

https://github.com/sonic-net/sonic-sairedis/pull/1409 contained in latest HEAD is causing RPC build failure.

make ENABLE_SYNCD_RPC=y target/docker-syncd-brcm-dnx-rpc.gz

mkdir -p ./src/obj g++ -I/usr/include/sai -I. -I../../experimental -std=c++11 -DFORCE_BOOST_SMART_PTR -c src/switch_sai_rpc_server.cpp -o src/obj/switch_sai_rpc_server.o -I/usr/include/sai -I. -I../../experimental -std=c++11 -DFORCE_BOOST_SMART_PTR -I./src/gen-cpp In file included from ../../experimental/saiexperimentaldashvip.h:30, from /usr/include/sai/saiobject.h:40, from /usr/include/sai/sai.h:50, from src/switch_sai_rpc_server.cpp:48: ../../experimental/saitypesextensions.h:37:46: error: 'SAI_OBJECT_TYPE_EXTENSIONS_RANGE_BASE' was not declared in this scope 37 | SAI_OBJECT_TYPE_EXTENSIONS_RANGE_START = SAI_OBJECT_TYPE_EXTENSIONS_RANGE_BASE,

Steps to reproduce the issue:

1. 2. 3.

Describe the results you received:

Describe the results you expected:

Output of show version:

(paste your output here)

Output of show techsupport:

(paste your output here or download and attach the file here )

Additional information you deem important (e.g. issue happens only occasionally):

saksarav-nokia commented 2 months ago

@kcudnik for viz. I think your PR https://github.com/opencomputeproject/SAI/pull/2028 is causing this. We are trying to build RPC image for broadcom dnx platform with SAI 11.2.7.1

bingwang-ms commented 1 month ago

Issue is seen in master branch when building syncd with rpc enabled. @kcudnik @liushilongbuaa Can you check if buildteam can help address this issue?

liushilongbuaa commented 1 month ago

@theasianpianist , please also help to check.

arlakshm commented 1 month ago

@saksarav-nokia, @jon-nokia Is this issue seen in 202405?

jon-nokia commented 1 month ago

@saksarav-nokia, @jon-nokia Is this issue seen in 202405?

No. I believe 202405 src/sonic-sairedis is still at d3fbec017ef, so has not had update since 5/27. src/sonic-sairedis commit, 433d039a4 brings in new SAI submodule which causes break.

kcudnik commented 1 month ago

not sure how this is cauing RPC error, seems like your headers are not updated, in my change you pointed https://github.com/opencomputeproject/SAI/pull/2028/files#diff-444febffce442e4277a1ced22d1ab94df045dfaa10a1b671ee35a7e5c2d9f683R305 this line implements SAI_OBJECT_TYPE_EXTENSIONS_RANGE_BASE, SAI headers build is passing on girhub and locally, also this is passing on sairedis repo where build is also passing, not sure how your build is including headers that are not including correct saitypes.h

kcudnik commented 1 month ago

can i reporoduce this locally without building entire image ?

saksarav-nokia commented 1 month ago

@kcudnik , i believe the libsai package installs the sai header files in /usr/include/sai and sai-thrift includes these header files for build. Since the vendor SAI header files don't have your changes, the build is failing.

kcudnik commented 1 month ago

so seems like vendor headers needs to be updated, or we need to update rpc build to use local headers, since originally rpc build code is not my code, if i remember correctly they copy /usr/include/sai headers to build rpc instead of local SAI repo headers, thats why it could fail

any suggestions which way we should go ?

kcudnik commented 1 month ago

problem is here: https://github.com/opencomputeproject/SAI/blob/master/test/saithrift/Makefile#L6 sai headers are taken form /usr/include/sai which are vendor headers, and experimental headers are taken from -I../../experimental as local headers to compile

but im not sure changing

SAI_PREFIX = /usr
SAI_HEADER_DIR ?= $(SAI_PREFIX)/include/sai
SAI_HEADERS = $(SAI_HEADER_DIR)/sai*.h
CFLAGS = -I$(SAI_HEADER_DIR) -I. -I../../experimental -std=c++11

will cause some other issues, since SAI_HEADER_DIR is with "?" which means i cloud be override externally before calling makefile, im not sure if DASH is utilizing this feature.

not sure what would be easy fix here, where should we patch +@lguohan

this was added here: https://github.com/opencomputeproject/SAI/commit/410a5bd5ca883d51a332bb88efaaaaa13f8db56c for that specific reason

saksarav-nokia commented 1 month ago

@kcudnik , I am able to build RPC image for BCM DNX if i change the sai-thrift Makefile to use the experimental header files from /usr/include/sai/experimental. Not sure if all SAI vendors include experimental header files in the libsai package.

kcudnik commented 1 month ago

@kcudnik , I am able to build RPC image for BCM DNX if i change the sai-thrift Makefile to use the experimental header files from /usr/include/sai/experimental. Not sure if all SAI vendors include experimental header files in the libsai package.

im not sure that ether, but my proposal was to actually point to SAI/inc since experimental is pointed in

CFLAGS = -I$(SAI_HEADER_DIR) -I. -I../../experimental -std=c++11

and the issue is with regular includes that are pulled of from /usr/include/sai, so if we would change that SAI_HEADER_DIR to ../inc during build make then it should be fine

saksarav-nokia commented 1 month ago

@kcudnik , I am able to build RPC image for BCM DNX if i change the sai-thrift Makefile to use the experimental header files from /usr/include/sai/experimental. Not sure if all SAI vendors include experimental header files in the libsai package.

im not sure that ether, but my proposal was to actually point to SAI/inc since experimental is pointed in

CFLAGS = -I$(SAI_HEADER_DIR) -I. -I../../experimental -std=c++11

and the issue is with regular includes that are pulled of from /usr/include/sai, so if we would change that SAI_HEADER_DIR to ../inc during build make then it should be fine

Yes. In our local build, we unblocked the build failure by using ../../inc instead of /usr/include/sai

kcudnik commented 1 month ago

great, did you changed that in Makefile in SAI repo or in build process of this package in build image?

saksarav-nokia commented 1 month ago

i changed in sai-thrift Makefile https://github.com/saksarav-nokia/SAI/commit/2cbd247500264a9c85c148597ad1658478106dc3

kcudnik commented 1 month ago

yea, but this will limit functionality, will be be able to change the code that is actually calling this Makefile? and pass SAI_HEADER_DIR=../../inc in command line ?

liushilongbuaa commented 1 month ago

@saksarav-nokia , @kcudnik What's the status now?

kcudnik commented 1 month ago

asked @saksarav-nokia whether he can make change in build https://github.com/sonic-net/sonic-buildimage/issues/20322#issuecomment-2402541011

saksarav-nokia commented 1 month ago

asked @saksarav-nokia whether he can make change in build #20322 (comment)

Will there be any functionality impact. We have tested with only BCM dnx SAI and it is working fine. How about other SAI vendors.

kcudnik commented 1 month ago

its hard to say on other vendors, originally this code was added by mlnx

abdosi commented 4 weeks ago

@kcudnik what should be next steps to fix for master ?

kcudnik commented 4 weeks ago

i proposes solution here: https://github.com/sonic-net/sonic-buildimage/issues/20322#issuecomment-2402541011

liushilongbuaa commented 4 weeks ago

@kcudnik , who can do the code change? @saksarav-nokia ? Maybe we can create a PR for test.

kcudnik commented 4 weeks ago

guys here already mentioned that they made local change and it works ? so can you raise PR ?

liushilongbuaa commented 3 weeks ago

i proposes solution here: #20322 (comment)

@kcudnik , so we are waiting for @saksarav-nokia create PR to nokia repo?

kcudnik commented 2 weeks ago

@saksarav-nokia please create PR if you already have the fix for this

saksarav-nokia commented 2 weeks ago

@saksarav-nokia please create PR if you already have the fix for this

Yes. I will create the PR

saksarav-nokia commented 2 weeks ago

@kcudnik , I have created the PR. Pls review it

kcudnik commented 2 weeks ago

@kcudnik , I have created the PR. Pls review it

Approved, but please provide more explanation in that pr description why this fix is made

saksarav-nokia commented 2 weeks ago

@kcudnik , I have created the PR. Pls review it

Approved, but please provide more explanation in that pr description why this fix is made

@kcudnik , Added more explanation . Pls check

kcudnik commented 2 weeks ago

merged

adyeung commented 1 week ago

@kcudnik when can the Makefile fix from @saksarav-nokia made into SONiC? Latest SONiC broadcom daily build still failing for the same, let me know if there is anything I missed

https://dev.azure.com/mssonic/build/_build/results?buildId=694058&view=logs&j=88ce9a53-729c-5fa9-7b6e-3d98f2488e3f&t=b71397ab-4d88-53c1-4db7-b4fc5ec36f6f

kcudnik commented 1 week ago

@saksarav-nokia have you made fix?

saksarav-nokia commented 1 week ago

Yes, https://github.com/opencomputeproject/SAI/pull/2097

adyeung commented 1 week ago

@kcudnik can someone help pickup the fix for sonic-sairedis? Broadcom daily build on master branch has been failing, we need to fix this before 202411 branch off

adyeung commented 1 week ago

FYI @zhangyanzhao @yuezhoujk @tjchadaga

kcudnik commented 1 week ago

so 2020405 is using SAI v1.14, and that change is on top on v1.15 branch so probably we need to add v1.14.1 version withc that change cherrypicked @tjchadaga @rlhui will that be ok ?

tjchadaga commented 1 week ago

so 2020405 is using SAI v1.14, and that change is on top on v1.15 branch so probably we need to add v1.14.1 version withc that change cherrypicked @tjchadaga @rlhui will that be ok ?

@kcudnik, currently, the build is only broken for master. 202405 which is using v1.14 is building fine. Do we still need this PR cherry-picked to v1.14?

adyeung commented 6 days ago

@kcudnik @tjchadaga @rlhui is there a decision? This is needed in master branch to fix broadcom daily build

adyeung commented 6 days ago

FYI @yuazhe @zhangyanzhao @eddieruan-alibaba 202411 release team

kcudnik commented 6 days ago

@kcudnik @tjchadaga @rlhui is there a decision? This is needed in master branch to fix broadcom daily build

Yes we added submodule update in sairedis and will update master branch here it's not merged yet latest pr in sairedis repo

kcudnik commented 4 days ago

https://github.com/sonic-net/sonic-sairedis/pull/1462 merged updated submodule on sairedis, i will prepare update on buildimage master

kcudnik commented 4 days ago

i noticed that buildimage master is using sairedis 202405 branch instead of siredis/master, i will try to cherrypick SAI v1.15.1 from sairedis master to sairedis/202405, but 202405 is too far behind, v1.14 branch and it have many many changes, which could be channanging to cherypick this single change, will post shortly on that

kcudnik commented 4 days ago

i think it will be safer to create new tag on v1.14.1 SAI that will cherrypick taht RCP change

kcudnik commented 4 days ago

I added v1.14.1 SAI version with included rpc fix https://github.com/opencomputeproject/SAI/commits/v1.14/

kcudnik commented 4 days ago

updated sairedis: https://github.com/sonic-net/sonic-sairedis/pull/1464, as soon as this passes, i will update buildimage

kcudnik commented 3 days ago

Added 2 fixes: https://github.com/sonic-net/sonic-buildimage/pull/20842 https://github.com/sonic-net/sonic-buildimage/pull/20843

hope this will help

kcudnik commented 3 days ago

@jon-nokia are those the branches you see this issues? i posted 2 updates for master and 202405 on buildimage, please confirm

saksarav-nokia commented 3 days ago

@kcudnik , The RPC build was failing only in master and not in 202405

kcudnik commented 3 days ago

intresting , but i updated 2 XD, event with my change its failing on master https://github.com/sonic-net/sonic-buildimage/pull/20842, but not sure if its the same failure or other