TriBITSPub / TriBITS

TriBITS: Tribal Build, Integrate, and Test System,
http://tribits.org

Duplicate library dependencies #341

Open rppawlo opened 3 years ago

rppawlo commented 3 years ago

On CUDA static RDC builds, I'm getting a link failure due to duplicate symbols. The main issue is that a library is being included twice on the link line. In the panzer/adapters-stk Dependencies file I have:

SET(LIB_REQUIRED_DEP_PACKAGES SEACASExodus SEACASIoss STKUtil STKTopology STKMesh STKIO Zoltan Ifpack2 MueLu Teko Stratimikos Piro NOX Rythmos PanzerCore PanzerDiscFE)
SET(LIB_OPTIONAL_DEP_PACKAGES Percept)
SET(TEST_REQUIRED_DEP_PACKAGES )
SET(TEST_OPTIONAL_DEP_PACKAGES Pamgen)
SET(LIB_REQUIRED_DEP_TPLS)
SET(LIB_OPTIONAL_DEP_TPLS)
SET(TEST_REQUIRED_DEP_TPLS)
SET(TEST_OPTIONAL_DEP_TPLS)

Panzer declares pamgen as a direct TEST dependency in TEST_OPTIONAL_DEP_PACKAGES. Panzer also depends directly on SEACASIoss, which has an optional LIB dependency on pamgen. It seems that when duplicate libraries are removed from the link line, the TEST dependencies and the LIB dependencies are handled separately, so pamgen ends up on the link line twice: once from the TEST list and once from the LIB list. The error is below. Note that this failure only happens with cuda+rdc+static with tests enabled.
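The suspected behavior can be modelled with a small Python sketch. This is only an illustration of the hypothesis above, not TriBITS's actual implementation: the function `dedup_keep_last` and the shortened lists `test_deps` and `lib_deps` are hypothetical names invented for the example. It shows that de-duplicating the TEST and LIB lists separately leaves pamgen on the line twice, while de-duplicating the combined list does not.

```python
def dedup_keep_last(libs):
    """Remove duplicates, keeping the LAST occurrence of each entry.

    Keeping the last occurrence preserves the usual static-link ordering,
    where a library must appear after everything that depends on it.
    """
    seen = set()
    result = []
    for lib in reversed(libs):
        if lib not in seen:
            seen.add(lib)
            result.append(lib)
    result.reverse()
    return result


# Hypothetical, heavily shortened lists modelled on the report.
test_deps = ["libpamgen_extras.a", "libpamgen.a"]
lib_deps = [
    "libpanzer-disc-fe.a",
    "libIoss.a",
    "libpamgen_extras.a",
    "libpamgen.a",
    "libkokkoscore.a",
]

# De-duplicating each list on its own cannot see duplicates across lists.
separately = dedup_keep_last(test_deps) + dedup_keep_last(lib_deps)

# De-duplicating the concatenated list removes the cross-list duplicates.
combined = dedup_keep_last(test_deps + lib_deps)

print(separately.count("libpamgen.a"))  # pamgen appears twice
print(combined.count("libpamgen.a"))    # pamgen appears once
```

Keeping the last occurrence rather than the first is a design choice in this sketch: with static archives, pamgen has to stay after libIoss.a, which depends on it.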

FAILED: packages/panzer/adapters-stk/test/stk_interface_test/PanzerAdaptersSTK_tSTKInterface.exe 
: && /projects/sems/install/rhel7-x86_64/sems/compiler/gcc/7.2.0/openmpi/1.10.1/bin/mpicxx  -g -fdiagnostics-color=auto -fPIC -fPIC --relocatable-device-code=true  -expt-extended-lambda -lineinfo -arch=sm_70  -Xcudafe --diag_suppress=esa_on_defaulted_function_ignored -g -O0   packages/panzer/adapters-stk/test/stk_interface_test/CMakeFiles/PanzerAdaptersSTK_tSTKInterface.dir/tSTKInterface.cpp.o packages/panzer/adapters-stk/test/stk_interface_test/CMakeFiles/PanzerAdaptersSTK_tSTKInterface.dir/__/__/__/__/phalanx/test/Utilities/Phalanx_UnitTestMain.cpp.o  -o packages/panzer/adapters-stk/test/stk_interface_test/PanzerAdaptersSTK_tSTKInterface.exe  packages/panzer/adapters-stk/src/libpanzer-stk.a packages/pamgen/src/libpamgen_extras.a packages/pamgen/src/libpamgen.a packages/panzer/disc-fe/src/libpanzer-disc-fe.a packages/panzer/dof-mgr/src/libpanzer-dof-mgr.a packages/phalanx/src/libphalanx.a packages/panzer/core/src/libpanzer-core.a packages/piro/src/libpiro.a packages/tempus/src/libtempus.a packages/rythmos/src/librythmos.a packages/muelu/adapters/libmuelu-adapters.a packages/muelu/src/Interface/libmuelu-interface.a packages/muelu/src/libmuelu.a packages/intrepid2/src/libintrepid2.a packages/nox/src-loca/src-tpetra/liblocatpetra.a packages/nox/src-loca/src-thyra/liblocathyra.a packages/nox/src-loca/src-epetra/liblocaepetra.a packages/nox/src-loca/src-lapack/liblocalapack.a packages/nox/src-loca/src/libloca.a packages/nox/src-epetra/libnoxepetra.a packages/nox/src-lapack/libnoxlapack.a packages/nox/src/libnox.a packages/percept/libpercept.a packages/stk/stk_expreval/stk_expreval/libstk_expreval.a packages/stk/stk_transfer/stk_transfer/libstk_transfer_impl.a packages/stk/stk_search/stk_search/libstk_search.a packages/stk/stk_math/stk_math/libstk_math.a packages/intrepid/src/libintrepid.a packages/sacado/src/libsacado.a packages/stk/stk_io/stk_io/util/libstk_io_util.a packages/stk/stk_io/stk_io/libstk_io.a packages/stk/stk_mesh/stk_mesh/base/libstk_mesh_base.a 
packages/shards/src/libshards.a packages/stk/stk_topology/stk_topology/libstk_topology.a packages/stk/stk_util/stk_util/diag/libstk_util_diag.a packages/stk/stk_util/stk_util/command_line/libstk_util_command_line.a packages/stk/stk_util/stk_util/registry/libstk_util_registry.a packages/stk/stk_util/stk_util/environment/libstk_util_env.a packages/stk/stk_util/stk_util/parallel/libstk_util_parallel.a packages/stk/stk_util/stk_util/util/libstk_util_util.a packages/seacas/libraries/aprepro_lib/libaprepro_lib.a packages/teko/src/libteko.a packages/anasazi/tpetra/src/libanasazitpetra.a packages/anasazi/epetra/util/ModeLaplace/libModeLaplace.a packages/anasazi/epetra/src/libanasaziepetra.a packages/anasazi/src/libanasazi.a packages/stratimikos/src/libstratimikos.a packages/stratimikos/adapters/belos/src/libstratimikosbelos.a packages/stratimikos/adapters/amesos2/src/libstratimikosamesos2.a packages/stratimikos/adapters/aztecoo/src/libstratimikosaztecoo.a packages/stratimikos/adapters/amesos/src/libstratimikosamesos.a packages/stratimikos/adapters/ml/src/libstratimikosml.a packages/stratimikos/adapters/ifpack/src/libstratimikosifpack.a packages/ml/src/libml.a packages/galeri/src-xpetra/libgaleri-xpetra.a packages/galeri/src-epetra/libgaleri-epetra.a packages/ifpack/src/libifpack.a packages/amesos/src/libamesos.a packages/ifpack2/adapters/libifpack2-adapters.a packages/ifpack2/src/libifpack2.a packages/amesos2/src/libamesos2.a packages/common/auxiliarySoftware/SuiteSparse/src/libtrilinosss.a packages/belos/xpetra/src/libbelosxpetra.a packages/belos/tpetra/src/libbelostpetra.a packages/belos/epetra/src/libbelosepetra.a packages/belos/src/libbelos.a packages/aztecoo/src/libaztecoo.a packages/zoltan2/core/src/libzoltan2.a packages/xpetra/sup/libxpetra-sup.a packages/xpetra/src/libxpetra.a packages/thyra/adapters/tpetra/src/libthyratpetra.a packages/thyra/adapters/epetraext/src/libthyraepetraext.a packages/epetraext/src/libepetraext.a packages/triutils/src/libtriutils.a 
packages/thyra/adapters/epetra/src/libthyraepetra.a packages/thyra/core/src/libthyracore.a packages/rtop/src/librtop.a packages/tpetra/core/ext/libtpetraext.a packages/tpetra/core/inout/libtpetrainout.a packages/tpetra/core/src/libtpetra.a packages/epetra/src/libepetra.a packages/tpetra/tsqr/src/libkokkostsqr.a /projects/sems/install/rhel7-x86_64/sems/compiler/cuda/9.2/base/lib64/libcusolver.so packages/tpetra/classic/LinAlg/libtpetraclassiclinalg.a packages/tpetra/classic/NodeAPI/libtpetraclassicnodeapi.a packages/tpetra/classic/src/libtpetraclassic.a packages/teuchos/kokkoscomm/src/libteuchoskokkoscomm.a packages/teuchos/kokkoscompat/src/libteuchoskokkoscompat.a packages/teuchos/remainder/src/libteuchosremainder.a packages/teuchos/numerics/src/libteuchosnumerics.a packages/teuchos/comm/src/libteuchoscomm.a packages/teuchos/parameterlist/src/libteuchosparameterlist.a packages/teuchos/parser/src/libteuchosparser.a packages/teuchos/core/src/libteuchoscore.a packages/kokkos-kernels/src/libkokkoskernels.a /usr/lib64/liblapack.so.3 /usr/lib64/libblas.so.3 packages/seacas/libraries/ioss/src/main/libio_info_lib.a packages/seacas/libraries/ioss/src/init/libIonit.a packages/seacas/libraries/ioss/src/transform/libIotr.a packages/seacas/libraries/ioss/src/heartbeat/libIohb.a packages/seacas/libraries/ioss/src/gen_struc/libIogs.a packages/seacas/libraries/ioss/src/generated/libIogn.a packages/seacas/libraries/ioss/src/visualization/libIovs.a packages/seacas/libraries/ioss/src/pamgen/libIopg.a packages/seacas/libraries/ioss/src/exodus/libIoex.a packages/seacas/libraries/ioss/src/libIoss.a packages/pamgen/src/libpamgen_extras.a packages/pamgen/src/libpamgen.a packages/kokkos/algorithms/src/libkokkosalgorithms.a packages/kokkos/containers/src/libkokkoscontainers.a packages/kokkos/core/src/libkokkoscore.a /projects/sems/install/rhel7-x86_64/sems/compiler/cuda/9.2/base/lib64/libcudart.so /projects/sems/install/rhel7-x86_64/sems/compiler/cuda/9.2/base/lib64/libcublas.so 
/projects/sems/install/rhel7-x86_64/sems/compiler/cuda/9.2/base/lib64/libcufft.so packages/seacas/libraries/exodus/libexodus.a /projects/sems/install/rhel7-x86_64/sems/tpl/netcdf/4.7.3/gcc/7.2.0/openmpi/1.10.1/parallel/lib/libnetcdf.a /projects/sems/install/rhel7-x86_64/sems/tpl/netcdf/4.7.3/gcc/7.2.0/openmpi/1.10.1/parallel/lib/libpnetcdf.a -lcurl /projects/sems/install/rhel7-x86_64/sems/tpl/hdf5/1.10.6/gcc/7.2.0/openmpi/1.10.1/parallel/lib/libhdf5_hl.a /projects/sems/install/rhel7-x86_64/sems/tpl/hdf5/1.10.6/gcc/7.2.0/openmpi/1.10.1/parallel/lib/libhdf5.a /projects/sems/install/rhel7-x86_64/sems/tpl/zlib/1.2.8/gcc/7.2.0/base/lib/libz.a -ldl packages/zoltan/src/libzoltan.a -lm && :
/projects/sems/install/rhel7-x86_64/sems/compiler/cuda/9.2/base/bin/crt/link.stub:92:118: error: redefinition of ‘const unsigned char def_module_id_str_48_tmpxft_00021927_00000000_6_pamgen_extras_cpp1_ii_68d892e7 []’
 static const unsigned char __DEFSTRNAME(id) [] __attribute__((aligned(1))) __attribute__((section ("__nv_module_id"))) = "def " __TO_STRING(id); \
                                                                                                                      ^
/scratch/rppawlo/tmp/tmpxft_0001dc20_00000000-2_PanzerAdaptersSTK_tSTKInterface_dlink.reg.c:2367:1: note: in expansion of macro ‘DEFINE_REGISTER_FUNC’
 DEFINE_REGISTER_FUNC(_48_tmpxft_00021927_00000000_6_pamgen_extras_cpp1_ii_68d892e7)
 ^~~~~~~~~~~~~~~~~~~~

Any recommendations? In the short term, I can work around by disabling panzer tests if SEACASIoss_ENABLE_Pamgen is off.

bartlettroscoe commented 3 years ago

@rppawlo, does nvcc_wrapper not strip out duplicate library symbols?

I could spend some time debugging this now, but all of this code is going to be rewritten when I get deep into working on #63 and #299 next month (assuming I can get off the ATDM Trilinos builds for real).

So can you try the workaround for now?

rppawlo commented 3 years ago

I'm not sure about nvcc, but I verified that manually removing the second include of the library fixed the issue. Workaround is fine for now.
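For anyone wanting to check a failing link line for repeated libraries before editing it by hand, a minimal Python sketch follows. The helper `duplicate_libraries` is a hypothetical name, and the example line is a shortened version of the one in the report, with `test.exe` standing in for the real output name. It only counts tokens that look like static or shared library paths.

```python
from collections import Counter
import os
import shlex


def duplicate_libraries(link_line):
    """Return {library_path: count} for libraries listed more than once."""
    tokens = shlex.split(link_line)
    libs = [
        t for t in tokens
        # Match libfoo.a, libfoo.so, and versioned names such as libfoo.so.3.
        if t.endswith((".a", ".so")) or ".so." in os.path.basename(t)
    ]
    counts = Counter(libs)
    return {lib: n for lib, n in counts.items() if n > 1}


# Shortened version of the link line from the report.
link_line = (
    "mpicxx -o test.exe "
    "packages/panzer/adapters-stk/src/libpanzer-stk.a "
    "packages/pamgen/src/libpamgen_extras.a packages/pamgen/src/libpamgen.a "
    "packages/seacas/libraries/ioss/src/libIoss.a "
    "packages/pamgen/src/libpamgen_extras.a packages/pamgen/src/libpamgen.a "
    "/usr/lib64/liblapack.so.3 -lm"
)

dups = duplicate_libraries(link_line)
for lib, n in dups.items():
    print(f"{lib} appears {n} times")
```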

rppawlo commented 3 years ago

@bartlettroscoe - yesterday you pointed out that if RDC is enabled, TriBITS should add a flag to nvcc_wrapper to remove duplicates. I went back and looked at why this flag was not getting added. From what I can tell, if the user specifies CMAKE_CXX_FLAGS during configure, then the remove-duplicates flag gets dropped inside TriBITS. Is that the intention or is this a bug? It blocks me from adding CUDA compiler flags for warnings during debugging, because I then have to manually add the remove-duplicates flag as well.

bartlettroscoe commented 3 years ago

yesterday you pointed out that if RDC is enabled, tribits should add a flag to nvcc wrapper to remove duplicates.

@rppawlo, actually, this flag is added by the ATDM Trilinos configuration system, not by TriBITS, as documented at:

and implemented at:

Can you provide reproducibility instructions?

rppawlo commented 3 years ago

just sent you something offline

bartlettroscoe commented 3 years ago

As described in:

the problem was due to some confusion/missing documentation on how to tweak CMAKE_CXX_FLAGS.

Still, TriBITS should not produce link lines with duplicate libraries. Hopefully the refactorings for #63 and #299 will resolve that by handing this over to CMake to deal with.

bartlettroscoe commented 3 years ago

Related to: