lattice / quda

QUDA is a library for performing calculations in lattice QCD on GPUs.
https://lattice.github.io/quda

Optionally use QMP for multi-GPU staggered #3

Closed bjoo closed 13 years ago

bjoo commented 13 years ago

When building staggered with the --with-qmp= ... flag in multi-GPU mode, I get unresolved symbols (listed at the end of this message).

The issue is not present when compiling with pure MPI (--with-mpi=... but no --with-qmp). The code also builds fine when multi-GPU is disabled (no --enable-multi-gpu).

This is probably just some comms feature that never made it to a QMP version.

Unresolved symbols from the failing case are below:

/home/bjoo/Toolchain/install/openmpi-1.5/bin/mpicxx -fPIC -L/usr/local/cuda/lib64 -lcudart -L/home/bjoo/Devel/QCD/install/qmp/qmp2-1-6/openmpi/lib -lqmp su3_test.o test_util.o wilson_dslash_reference.o ../lib/libquda.a -o su3_test -fPIC -L/usr/local/cuda/lib64 -lcudart -L/home/bjoo/Devel/QCD/install/qmp/qmp2-1-6/openmpi/lib -lqmp

../lib/libquda.a(dslash_quda.o): In function `void staggeredDslashNoReconCuda<2, short2, short2, short2>(short2*, float*, short2 const*, short2 const*, short2 const*, short2 const*, QudaReconstructType_s, short2 const*, float const*, int, int, short2 const*, float const*, double const&, int, int, int, int, int, cudaColorSpinorField*, dim3)':

tmpxft_000053e3_00000000-1_dslash_quda.cudafe1.cpp:(.text._Z26staggeredDslashNoReconCudaILi2E6short2S0_S0_EvPT0_PfPKT1_S6_PKT2_S9_21QudaReconstructType_sPKS1_PKfiiSC_SE_RKdiiiiiP20cudaColorSpinorField4dim3[void staggeredDslashNoReconCuda<2, short2, short2, short2>(short2*, float*, short2 const*, short2 const*, short2 const*, short2 const*, QudaReconstructType_s, short2 const*, float const*, int, int, short2 const*, float const*, double const&, int, int, int, int, int, cudaColorSpinorField*, dim3)]+0x19f): undefined reference to `exchange_gpu_spinor_start'

tmpxft_000053e3_00000000-1_dslash_quda.cudafe1.cpp:(.text._Z26staggeredDslashNoReconCudaILi2E6short2S0_S0_EvPT0_PfPKT1_S6_PKT2_S9_21QudaReconstructType_sPKS1_PKfiiSC_SE_RKdiiiiiP20cudaColorSpinorField4dim3[void staggeredDslashNoReconCuda<2, short2, short2, short2>(short2*, float*, short2 const*, short2 const*, short2 const*, short2 const*, QudaReconstructType_s, short2 const*, float const*, int, int, short2 const*, float const*, double const&, int, int, int, int, int, cudaColorSpinorField*, dim3)]+0x1ac): undefined reference to `exchange_gpu_spinor_wait'

tmpxft_000053e3_00000000-1_dslash_quda.cudafe1.cpp:(.text._Z26staggeredDslashNoReconCudaILi2E6short2S0_S0_EvPT0_PfPKT1_S6_PKT2_S9_21QudaReconstructType_sPKS1_PKfiiSC_SE_RKdiiiiiP20cudaColorSpinorField4dim3[void staggeredDslashNoReconCuda<2, short2, short2, short2>(short2*, float*, short2 const*, short2 const*, short2 const*, short2 const*, QudaReconstructType_s, short2 const*, float const*, int, int, short2 const*, float const*, double const&, int, int, int, int, int, cudaColorSpinorField*, dim3)]+0x2b0): undefined reference to `exchange_gpu_spinor_start'

tmpxft_000053e3_00000000-1_dslash_quda.cudafe1.cpp:(.text._Z26staggeredDslashNoReconCudaILi2E6short2S0_S0_EvPT0_PfPKT1_S6_PKT2_S9_21QudaReconstructType_sPKS1_PKfiiSC_SE_RKdiiiiiP20cudaColorSpinorField4dim3[void staggeredDslashNoReconCuda<2, short2, short2, short2>(short2*, float*, short2 const*, short2 const*, short2 const*, short2 const*, QudaReconstructType_s, short2 const*, float const*, int, int, short2 const*, float const*, double const&, int, int, int, int, int, cudaColorSpinorField*, dim3)]+0x2bd): undefined reference to `exchange_gpu_spinor_wait'

tmpxft_000053e3_00000000-1_dslash_quda.cudafe1.cpp:(.te

gshi commented 13 years ago

Balint, staggered code does not support QMP. Multi-gpu staggered must go with pure MPI for now.

gshi commented 13 years ago

The support may be added in the future. I will close it for now.

rbabich commented 13 years ago

Just re-opening this as a feature request...

maddyscientist commented 13 years ago

An update here. The library should at least build now with QMP and staggered both enabled. The test programs will still fail to link, though. Unifying the communications interface is my next task, so hopefully this should be fixed soon.
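As a rough illustration of what a unified communications interface could look like, here is a minimal C++ sketch. It is not QUDA's actual code: `CommsBackend`, `LoopbackBackend`, `make_backend`, `exchange_halo` and the `EXAMPLE_USE_QMP` / `EXAMPLE_USE_MPI` macros are all hypothetical names invented for this example. The idea is that the dslash code calls one start/wait pair, and the build selects an MPI or QMP implementation behind it, so a symbol such as `exchange_gpu_spinor_start` cannot be missing for only one of the two backends.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical interface, not QUDA's real API: one start/wait pair that
// every communications backend has to implement.
class CommsBackend {
public:
    virtual ~CommsBackend() = default;
    // Begin an exchange of `bytes` bytes from `send` into `recv`.
    virtual void start(const void* send, void* recv, std::size_t bytes) = 0;
    // Block until the exchange begun by start() has completed.
    virtual void wait() = 0;
};

// Single-process stand-in so the sketch runs without MPI or QMP:
// the "neighbour" is the process itself, so wait() just copies the buffer.
class LoopbackBackend : public CommsBackend {
public:
    void start(const void* send, void* recv, std::size_t bytes) override {
        send_ = send;
        recv_ = recv;
        bytes_ = bytes;
    }
    void wait() override {
        if (bytes_ > 0) {
            assert(send_ != nullptr && recv_ != nullptr);
            std::memcpy(recv_, send_, bytes_);
        }
        bytes_ = 0;
    }
private:
    const void* send_ = nullptr;
    void* recv_ = nullptr;
    std::size_t bytes_ = 0;
};

// Backend selection at build time. The macro names are assumptions made
// for this sketch; a real build would return an MPI- or QMP-based subclass.
std::unique_ptr<CommsBackend> make_backend() {
#if defined(EXAMPLE_USE_QMP)
#error "QMP backend is not implemented in this sketch"
#elif defined(EXAMPLE_USE_MPI)
#error "MPI backend is not implemented in this sketch"
#else
    return std::make_unique<LoopbackBackend>();
#endif
}

// Caller-side code is identical for every backend.
std::vector<float> exchange_halo(CommsBackend& comms,
                                 const std::vector<float>& send) {
    std::vector<float> recv(send.size(), 0.0f);
    comms.start(send.data(), recv.data(), send.size() * sizeof(float));
    // Interior computation could overlap with communication here.
    comms.wait();
    return recv;
}
```

With the loopback backend, `exchange_halo(*make_backend(), data)` returns a copy of `data`. With a real backend it would return the neighbour's boundary data, and the caller would not need to change.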

bjoo commented 13 years ago

Well, the original issue was about linking the test programs really.

maddyscientist commented 13 years ago

This issue has now been fixed. My latest additions add support for staggered with QMP, and Wilson with MPI.