sonic-net / sonic-sairedis

SAI object interface to Redis database, as used in the SONiC project

[build]sonic-sairedis build fails sometimes while building metadata #1294

Open dgsudharsan opened 9 months ago

dgsudharsan commented 9 months ago

Description

The sonic-sairedis build fails intermittently with the logs below.

[2023-09-26T21:00:05.873Z] make[4]: Leaving directory '/sonic/src/sonic-sairedis/SAI/meta'
[2023-09-26T21:00:05.873Z] make  install-am
[2023-09-26T21:00:05.873Z] make[4]: Entering directory '/sonic/src/sonic-sairedis/meta'
[2023-09-26T21:00:05.873Z] make -C ../SAI/meta saimetadata.c
[2023-09-26T21:00:05.873Z] make[5]: Entering directory '/sonic/src/sonic-sairedis/SAI/meta'
[2023-09-26T21:00:05.873Z] make[5]: 'saimetadata.c' is up to date.
[2023-09-26T21:00:05.873Z] make[5]: Leaving directory '/sonic/src/sonic-sairedis/SAI/meta'
[2023-09-26T21:00:05.874Z] /bin/bash ../libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H -I. -I..   -Wdate-time -D_FORTIFY_SOURCE=2 -g -I../SAI/inc -I../SAI/experimental -I../SAI/meta -ansi  -g -O2 -ffile-prefix-map=/sonic/src/sonic-sairedis=. -fstack-protector-strong -Wformat -Werror=format-security -c -o ../SAI/meta/libsaimetadata_la-saimetadata.lo `test -f '../SAI/meta/saimetadata.c' || echo './'`../SAI/meta/saimetadata.c
[2023-09-26T21:00:05.874Z] libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I.. -Wdate-time -D_FORTIFY_SOURCE=2 -g -I../SAI/inc -I../SAI/experimental -I../SAI/meta -ansi -g -O2 -ffile-prefix-map=/sonic/src/sonic-sairedis=. -fstack-protector-strong -Wformat -Werror=format-security -c ../SAI/meta/saimetadata.c  -fPIC -DPIC -o ../SAI/meta/.libs/libsaimetadata_la-saimetadata.o
[2023-09-26T21:00:05.875Z] libtool: compile:  gcc -DHAVE_CONFIG_H -I. -I.. -Wdate-time -D_FORTIFY_SOURCE=2 -g -I../SAI/inc -I../SAI/experimental -I../SAI/meta -ansi -g -O2 -ffile-prefix-map=/sonic/src/sonic-sairedis=. -fstack-protector-strong -Wformat -Werror=format-security -c ../SAI/meta/saimetadata.c -o ../SAI/meta/libsaimetadata_la-saimetadata.o >/dev/null 2>&1
[2023-09-26T21:00:05.875Z] /bin/bash ../libtool  --tag=CC   --mode=link gcc -g -I../SAI/inc -I../SAI/experimental -I../SAI/meta -ansi  -g -O2 -ffile-prefix-map=/sonic/src/sonic-sairedis=. -fstack-protector-strong -Wformat -Werror=format-security  -Wl,-z,relro -o libsaimetadata.la -rpath /usr/lib/x86_64-linux-gnu ../SAI/meta/libsaimetadata_la-saimetadata.lo ../SAI/meta/libsaimetadata_la-saimetadatautils.lo ../SAI/meta/libsaimetadata_la-saiserialize.lo  
[2023-09-26T21:00:05.875Z] libtool: link: rm -fr  .libs/libsaimetadata.a .libs/libsaimetadata.la .libs/libsaimetadata.lai .libs/libsaimetadata.so .libs/libsaimetadata.so.0 .libs/libsaimetadata.so.0.0.0
[2023-09-26T21:00:05.876Z] libtool: link: gcc -shared  -fPIC -DPIC  ../SAI/meta/.libs/libsaimetadata_la-saimetadata.o ../SAI/meta/.libs/libsaimetadata_la-saimetadatautils.o ../SAI/meta/.libs/libsaimetadata_la-saiserialize.o    -g -g -O2 -fstack-protector-strong -Wl,-z -Wl,relro   -Wl,-soname -Wl,libsaimetadata.so.0 -o .libs/libsaimetadata.so.0.0.0
[2023-09-26T21:00:05.876Z] /usr/bin/ld: cannot find ../SAI/meta/.libs/libsaimetadata_la-saimetadata.o: file format not recognized
[2023-09-26T21:00:05.876Z] collect2: error: ld returned 1 exit status
[2023-09-26T21:00:05.876Z] make[4]: *** [Makefile:577: libsaimetadata.la] Error 1
[2023-09-26T21:00:05.876Z] make[4]: Leaving directory '/sonic/src/sonic-sairedis/meta'
[2023-09-26T21:00:05.876Z] make[3]: *** [Makefile:975: install] Error 2
[2023-09-26T21:00:05.876Z] make[3]: Leaving directory '/sonic/src/sonic-sairedis/meta'
[2023-09-26T21:00:05.876Z] make[2]: *** [Makefile:446: install-recursive] Error 1
[2023-09-26T21:00:05.876Z] make[2]: Leaving directory '/sonic/src/sonic-sairedis'
[2023-09-26T21:00:05.876Z] dh_auto_install: error: make -j1 install DESTDIR=/sonic/src/sonic-sairedis/debian/tmp AM_UPDATE_INFO_DIR=no returned exit code 2
[2023-09-26T21:00:05.876Z] make[1]: *** [debian/rules:51: binary-sairedis] Error 25
[2023-09-26T21:00:05.876Z] make[1]: Leaving directory '/sonic/src/sonic-sairedis'
[2023-09-26T21:00:05.877Z] dpkg-buildpackage: error: fake******** debian/rules binary-sairedis subprocess returned exit status 2
[2023-09-26T21:00:05.877Z] [  FAIL LOG END  ] [ target/debs/bullseye/libsairedis_1.0.0_amd64.deb ]
[2023-09-26T21:00:05.877Z] make: *** [slave.mk:724: target/debs/bullseye/libsairedis_1.0.0_amd64.deb] Error 1
[2023-09-26T21:00:05.877Z] make: *** Waiting for unfinished jobs....

Attaching the full sonic-sairedis logs with and without cache: sairedis_buildfail.log and sairedis_buildfail_without_cache.log

Steps to reproduce the issue:

  1. Build sairedis with cache enabled

Describe the results you received:

Building sairedis fails

Describe the results you expected:

No build failure is expected

Output of show version:

Built on 202305 branch.

judyjoseph commented 9 months ago

@kcudnik could you check this on the 202305 branch?

kcudnik commented 9 months ago

The build fails sometimes? And sometimes works?

kcudnik commented 9 months ago

Which branch does it fail on?

dgsudharsan commented 9 months ago

This is 202305. I believe it fails sometimes.

kcudnik commented 9 months ago

Is this built on a pipeline or locally?

dgsudharsan commented 9 months ago

@kcudnik This is built through the pipeline.

kcudnik commented 9 months ago

If it fails only sometimes, then there is a race condition or some instability in the pipeline.

kcudnik commented 9 months ago

What does it mean to build sairedis with cache enabled? What cache?

dgsudharsan commented 9 months ago

We have a way to build with the cache enabled. We pass these parameters to make: SONIC_DPKG_CACHE_SOURCE=/path/to/cache SONIC_DPKG_CACHE_METHOD=cache

kcudnik commented 9 months ago

Maybe not everything is cached, if it sometimes passes.

dgsudharsan commented 9 months ago

One more thing: it is sometimes seen without the cache too; I have shared those logs. I believe this issue is caused by the parallel build. Maybe we need to set some ordering to avoid it. As you can see, the linker cannot find libsaimetadata_la-saimetadata.o. That target exists, but maybe it has not been built yet when the linker runs?

kcudnik commented 9 months ago

Parallel build is not supported.

dgsudharsan commented 9 months ago

@kcudnik If that's the case, can you please raise a fix to ensure sonic-sairedis explicitly overrides the parallel build through its build options? Currently it picks up the setting from the sonic build system: DEB_BUILD_OPTIONS='nocheck parallel=30'
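As a stopgap for this kind of request, the parallelism hint could be stripped from DEB_BUILD_OPTIONS before the package build starts. The following is a minimal sketch, not the project's actual fix: the helper name `strip_parallel` is hypothetical, and where such a wrapper would be hooked into the SONiC build is an assumption. It relies only on the Debian convention that `parallel=N` in DEB_BUILD_OPTIONS sets the number of make jobs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: drop every parallel=N token from a
# DEB_BUILD_OPTIONS-style string and keep the remaining options.
strip_parallel() {
    local opt result=""
    for opt in $1; do            # unquoted on purpose: split on whitespace
        case "$opt" in
            parallel=*) ;;       # discard the parallelism hint
            *) result="${result:+$result }$opt" ;;
        esac
    done
    printf '%s\n' "$result"
}

# Value quoted in the issue; the build system normally sets this.
DEB_BUILD_OPTIONS='nocheck parallel=30'

# Force a serial build by replacing the hint with parallel=1.
serial_opts="$(strip_parallel "$DEB_BUILD_OPTIONS")"
DEB_BUILD_OPTIONS="${serial_opts:+$serial_opts }parallel=1"
export DEB_BUILD_OPTIONS

echo "$DEB_BUILD_OPTIONS"
```

Exporting the variable only affects build steps that honor DEB_BUILD_OPTIONS; a direct `make -jN` call elsewhere in the build would be unaffected.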

kcudnik commented 9 months ago

Parallel check was never tested; I will need to investigate how to fix that. In the meantime, can we disable the parallel build?

dgsudharsan commented 8 months ago

@saiarcot895 @kcudnik I added debug output in https://github.com/dgsudharsan/sonic-sairedis/pull/8 and with it I was able to hit the sairedis failure again. The log is attached as sairedis_buildfail_with_debug.txt. Could you please check whether it contains any information that helps to debug this further?

kcudnik commented 8 months ago

Can you build this without parallel on your side?

[2023-10-31T00:52:43.279Z] In file included from ../pysairedis.cpp:6:
[2023-10-31T00:52:43.280Z] ../../meta/sai_serialize.h:5:10: fatal error: saimetadata.h: No such file or directory
[2023-10-31T00:52:43.280Z]     5 | #include "saimetadata.h"
[2023-10-31T00:52:43.280Z]       |          ^~~~~~~~~~~~~~~
[2023-10-31T00:52:43.280Z] compilation terminated.

It seems the Python targets are built before sairedis/meta. I currently don't know how to declare dependencies in a parallel build so that specific targets are built before others; I don't know make well enough for this. Our pipelines build single-threaded, and everything works fine.

Is that a big slowdown on your side?

And how are you starting your sairedis build? What commands/switches are you passing to make/configure?

ridahanif96 commented 7 months ago

@dgsudharsan did you find a way to solve this? I am facing a similar issue with the 202305 build. Can you please help?

kcudnik commented 7 months ago

If the Python build runs before SAI, then the metadata does not exist yet. I'm not familiar enough with Makefile dependencies to add them, but almost all of the projects here depend on SAI. @dgsudharsan how did you enable the parallel build? So far we haven't had parallelism enabled and everything was fine. Is it some automatic change in the az pipeline? I have limited access until the end of the year; I'm on vacation.

kcudnik commented 7 months ago

We can use an explicit dependency on each target so that they always execute in a specific order, like here: https://stackoverflow.com/questions/9159960/order-of-processing-components-in-makefile. But that defeats the entire purpose of parallelism, so I'm not sure it makes sense at all to run in a multithreaded environment. We could try to build the SAI metadata first, then check whether a multi-core build passes, and add new restrictions as needed. Python extensions and tests should be compiled at the end.
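The explicit-dependency idea can be tried in isolation. The sketch below is a toy under stated assumptions, not the sonic-sairedis build: the `*.stamp` targets are hypothetical stand-ins for the metadata, library, and Python extension steps, and the real project uses automake rather than a hand-written Makefile. It writes a throwaway Makefile, runs GNU make with `-j4`, and records the order in which the steps ran.

```shell
#!/usr/bin/env bash
set -e

workdir="$(mktemp -d)"

# Hypothetical targets named after the components discussed in the thread.
# Recipes use the "target: prereqs ; command" form so no tab characters
# are needed inside the heredoc.
cat > "$workdir/Makefile" <<'EOF'
.PHONY: all
all: pyext.stamp

# Generated metadata has no prerequisites, so it is built first.
meta.stamp: ; @echo "build meta" >> order.log; touch $@

# lib and vslib need only the metadata, so make may run them in parallel.
lib.stamp: meta.stamp ; @echo "build lib" >> order.log; touch $@
vslib.stamp: meta.stamp ; @echo "build vslib" >> order.log; touch $@

# The Python extension waits for everything else.
pyext.stamp: lib.stamp vslib.stamp ; @echo "build pyext" >> order.log; touch $@
EOF

# Run a parallel build; prerequisites still force meta first, pyext last.
(cd "$workdir" && make -j4 all)

first_step="$(head -n 1 "$workdir/order.log")"
last_step="$(tail -n 1 "$workdir/order.log")"
echo "first: $first_step"
echo "last:  $last_step"
```

Only the metadata step and the final Python step are pinned in this sketch; the two middle targets may still overlap, so declaring prerequisites constrains ordering without forcing the whole build to be serial.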