JeffersonLab / qphix

QCD for Intel Xeon Phi and Xeon processors
http://jeffersonlab.github.io/qphix/
Other
13 stars 11 forks source link

Make the CommsUtils functions inline #21

Closed martin-ueding closed 7 years ago

martin-ueding commented 7 years ago

I tried to compile the current QPhiX devel branch (b1dbed4) together with the current Chroma devel branch (62cf4d341f5e4811637d37e54a9345598faea848). This has failed due to multiple definition errors like this:

../../lib/libchroma.a(syssolver_mdagm_aggregate.o): In function `QPhiX::CommsUtils::sumDouble(double*)':
syssolver_mdagm_aggregate.cc:(.text+0x180): multiple definition of `QPhiX::CommsUtils::sumDouble(double*)'
../../lib/libchroma.a(syssolver_linop_aggregate.o):syssolver_linop_aggregate.cc:(.text+0x1d0): first defined here

Drilling down into QPhiX I noticed that the file include/comm.h is the only source of a definition for this function. There are three preprocessor branches that are in this file. I have added a sentinel in the previous commit such that one cannot mess it up.

The culprit seems to be that the function definitions are not inline. Having a member function defined within the class automatically makes it inline. Those are no problem in the header. The functions in question however are only free functions in a namespace. Their definition in the header does not automatically make them inline, yielding to a multiple definition error as seen.

I opted for adding inline as those functions just delegate to QDP++ if they do anything at all. The other way would be to split off the definition form the declaration into a .cc file.

bjoo commented 7 years ago

Hi Martin, You need the following:

i) QDP++ devel branch ii) CHroma devel branch iii) QPhiX devel branch.

The master branch of QPhiX in particular is very antiquated.

However, please note that QPhiX can only function as a ‘native’ dslash (instead of say cpp_wilson_dslash) if QDP-JIT/LLVM is used underneath chroma. This is because QPhIX and QDP++/Chroma use different data layouts by default. QDP-JIT/LLVM allows the QDP index order to be changed to match QPhiX, but the regular Chroma does not allow this. So if you are trying to build chroma over regular QDP++ you should not enable the —enable-qphix-dslash options to Chroma. If you use QPhiX with Chroma over regular QDP++ you should still be able to use the QPhiX solvers, although for Twisted Mass you may need a solver wrapper to add to Chroma (the same way Clover is currently wrapped).

To answer your other previous question, yes, QPhiX originally started off from cpp_wilson_dslash long long ago, as that was a code I had confidence in, but really it is quite different now.

With very best wishes, Balint

On Dec 1, 2016, at 6:17 AM, Martin Ueding notifications@github.com wrote:

I tried to compile the current QPhiX devel branch (b1dbed4) together with the current Chroma devel branch (62cf4d341f5e4811637d37e54a9345598faea848). This has failed due to multiple definition errors like this:

../../lib/libchroma.a(syssolver_mdagm_aggregate.o): In function QPhiX::CommsUtils::sumDouble(double)': syssolver_mdagm_aggregate.cc:(.text+0x180): multiple definition ofQPhiX::CommsUtils::sumDouble(double)' ../../lib/libchroma.a(syssolver_linop_aggregate.o):syssolver_linop_aggregate.cc:(.text+0x1d0): first defined here

Drilling down into QPhiX I noticed that the file include/comm.h is the only source of a definition for this function. There are three preprocessor branches that are in this file. I have added a sentinel in the previous commit such that one cannot mess it up.

The culprit seems to be that the function definitions are not inline. Having a member function defined within the class automatically makes it inline. Those are no problem in the header. The functions in question however are only free functions in a namespace. Their definition in the header does not automatically make them inline, yielding to a multiple definition error as seen.

I opted for adding inline as those functions just delegate to QDP++ if they do anything at all. The other way would be to split off the definition form the declaration into a .cc file.

You can view, comment on, or merge this pull request online at:

https://github.com/JeffersonLab/qphix/pull/21

Commit Summary

• Add a sanity check for QPHIX_FAKE_COMMS and QPHIX_QMP_COMMS • Make the CommsUtils functions inline File Changes

• M include/qphix/comm.h (25) Patch Links:

https://github.com/JeffersonLab/qphix/pull/21.patchhttps://github.com/JeffersonLab/qphix/pull/21.diff — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or mute the thread.


Dr Balint Joo High Performance Computational Scientist Jefferson Lab 12000 Jefferson Ave, Suite 3, MS 12B2, Room F217, Newport News, VA 23606, USA Tel: +1-757-269-5339, Fax: +1-757-269-5427 email: bjoo@jlab.org

martin-ueding commented 7 years ago

Would it make sense to rename the devel branch to master and drop the old master branch in Chroma, QDP++ and QPhiX, then? Or perhaps the devel should be merged into master whenever it is stable enough to compile?

I can currently not test whether switching QDP++ from master to devel will remove this issue here. I mean the code in question is QPhiX devel, so I think the patch is still neded. When the Intel Xeon cluster is back online, I can test it and provide an update.

My mental picture of all this software is still a bit blurry. Do I understand it correctly that QPhiX provides Dslash and solvers and that I can choose the Dslash from QPhiX or cpp_wilson_dslash (barring a complication with QDP++ and QDP-JIT)? If I choose neither of them, does Chroma (or QDP++) has a “default”/“fallback” solver and Dslash implemented? When using Chroma, QDP++, and QPhiX, do I have to adjust the input XML to benefit from QPhiX or will hmc resolve CG_INVERTER to the best implementation that has been compiled into hmc?

martin-ueding commented 7 years ago

The issue still persists with the devel branches. I tried to bootstrap a Chroma installation with QPhiX and QDP++ on Intel Xeon Haswell today and on Intel i5 Sandy Bridge yesterday. The functions are still duplicated somewhere in libchroma.a.

Can you get this build correctly without inlining the functions?

martin-ueding commented 7 years ago

I currently try to get Chroma running on the Marconi (Intel KNL) cluster in Italy. There I have the exact same linker error in libchroma.a when using the devel versions of Chroma, QDP++, and QPhiX. This patch is needed for me to get it to compile.

I truly do not understand how it compiles on your machines. Since those functions are non-member non-template functions implemented in the header, there must be linker errors. This also happens with GCC and Intel C++ on three different machines now.

Could this please be included in the canonical devel branch?