qsimulate-open / bagel

Brilliantly Advanced General Electronic-structure Library
GNU General Public License v3.0

Failure to build v1.2.2 with ICC 19.0.4, GCC-8.3, Boost 1.71.0: no instance of constructor std::pair #182

Closed drhpc closed 5 years ago

drhpc commented 5 years ago

I have trouble building Bagel 1.2.2 on a CentOS 7.6 cluster. The error looks like a basic syntax issue, but I am not sure who is to blame and searching for the template error message does not yield anything useful.

So even if this is no issue in bagel itself, I'm grateful for any help getting it to build for my users.

The core failure is this:

/sw/env/gcc8intel-19.0.4_impi/boost/1.71.0/include/boost/serialization/nvp.hpp(48): error: no instance of constructor "std::pair<_T1, _T2>::pair [with _T1=const char *, _T2=int (*)[3]]" matches the argument list
            argument types are: (const char *, int (*)[3])
          std::pair<const char *, T *>(name_, boost::addressof(t))
                                      ^
/sw/compiler/gcc-8.3.0/include/c++/8.3.0/bits/stl_pair.h(436): note: this candidate was rejected because mismatch in count of arguments
          pair(tuple<_Args1...>&, tuple<_Args2...>&,

… lots of candidates follow, until:

/sw/compiler/gcc-8.3.0/include/c++/8.3.0/bits/stl_pair.h(242): note: this candidate was rejected because mismatch in count of arguments
        explicit constexpr pair()
                           ^
/sw/compiler/gcc-8.3.0/include/c++/8.3.0/bits/stl_pair.h(229): note: this candidate was rejected because mismatch in count of arguments
        _GLIBCXX_CONSTEXPR pair()
                           ^
          detected during:
            instantiation of "boost::serialization::nvp<T>::nvp(const char *, T &) [with T=int [3]]" at line 82
            instantiation of "const boost::serialization::nvp<T> boost::serialization::make_nvp(const char *, T &) [with T=int [3]]" at line 41 of "/sw/env/gcc8intel-19.0.4_impi/boost/1.71.0/include/boost/serialization/array.hpp"
            instantiation of "void boost::serialization::serialize(Archive &, std::array<T, N> &, unsigned int) [with Archive=boost::archive::binary_iarchive, T=int, N=3UL]" at line 126 of "/sw/env/gcc8intel-19.0.4_impi/boost/1.71.0/include/boost/serialization/serialization.hpp"
            instantiation of "void boost::serialization::serialize_adl(Archive &, T &, unsigned int) [with Archive=boost::archive::binary_iarchive, T=std::array<int, 3UL>]" at line 191 of "/sw/env/gcc8intel-19.0.4_impi/boost/1.71.0/include/boost/archive/detail/iserializer.hpp"
            instantiation of "void boost::archive::detail::iserializer<Archive, T>::load_object_data(boost::archive::detail::basic_iarchive &, void *, unsigned int) const [with Archive=boost::archive::binary_iarchive, T=std::array<int, 3UL>]" at line 134 of "/sw/env/gcc8intel-19.0.4_impi/boost/1.71.0/include/boost/archive/detail/iserializer.hpp"
            [ 246 instantiation contexts not shown ]
            instantiation of "const T &boost::serialization::singleton<T>::get_const_instance() [with T=boost::archive::detail::pointer_iserializer<boost::archive::binary_iarchive, bagel::Reference>]" at line 62 of "/sw/env/gcc8intel-19.0.4_impi/boost/1.71.0/include/boost/serialization/export.hpp"
            instantiation of "const boost::archive::detail::basic_pointer_iserializer &boost::archive::detail::export_impl<Archive, Serializable>::enable_load(boost::mpl::true_) [with Archive=boost::archive::binary_iarchive, Serializable=bagel::Reference]" at line 105 of "/sw/env/gcc8intel-19.0.4_impi/boost/1.71.0/include/boost/serialization/export.hpp"
            instantiation of "void boost::archive::detail::ptr_serialization_support<Archive, Serializable>::instantiate() [with Archive=boost::archive::binary_iarchive, Serializable=bagel::Reference]" at line 121 of "/sw/env/gcc8intel-19.0.4_impi/boost/1.71.0/include/boost/serialization/export.hpp"
            instantiation of "void boost::archive::detail::extra_detail::guid_initializer<T>::export_guid(boost::mpl::false_) const [with T=bagel::Reference]" at line 131 of "/sw/env/gcc8intel-19.0.4_impi/boost/1.71.0/include/boost/serialization/export.hpp"
            instantiation of "const boost::archive::detail::extra_detail::guid_initializer<T> &boost::archive::detail::extra_detail::guid_initializer<T>::export_guid() const [with T=bagel::Reference]" at line 32 of "reference.cc"

compilation aborted for reference.cc (code 2)
make[3]: *** [reference.lo] Error 1
make[3]: Leaving directory `/scratch/sw/work/gcc8intel-19.0.4_impi/bagel/1.2.2/bagel-1.2.2/src/wfn'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/scratch/sw/work/gcc8intel-19.0.4_impi/bagel/1.2.2/bagel-1.2.2/src'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/scratch/sw/work/gcc8intel-19.0.4_impi/bagel/1.2.2/bagel-1.2.2'
make: *** [all] Error 2

See the full build process in the attached log.

bagel-1.2.2.build.log
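
For orientation: the expression rejected at nvp.hpp line 48 builds a std::pair<const char *, T *> from a name and the address of t, with T = int [3]. The following is a minimal sketch of that construct without Boost. The names named_ptr and third_element_via_named_ptr are hypothetical stand-ins (named_ptr loosely mirrors boost::serialization::nvp, and std::addressof replaces boost::addressof). A conforming compiler is expected to accept it, so it may help to check a compiler and libstdc++ pairing in isolation. Whether ICC 19.0.4 rejects this reduced form has not been verified here.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for boost::serialization::nvp<T>:
// a (name, pointer-to-value) pair built the same way as in nvp.hpp.
template <class T>
struct named_ptr : std::pair<const char*, T*> {
    named_ptr(const char* name, T& t)
        : std::pair<const char*, T*>(name, std::addressof(t)) {}
};

// Instantiates named_ptr with T = int[3], the type from the error message,
// so the base class is std::pair<const char*, int (*)[3]>.
inline int third_element_via_named_ptr() {
    int data[3] = {1, 2, 3};
    named_ptr<int[3]> p("data", data);
    return (*p.second)[2];
}
```

Calling third_element_via_named_ptr() from a small main() and compiling with the same icpc and GCC headers would show whether the pair constructor alone triggers the diagnostic.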

shiozaki commented 5 years ago

Hi, I haven't been able to try out Boost 1.71.0 (the latest I tried is 1.70.0). Could you try an earlier boost and see if you have the same issue? The problem doesn't really seem to be in BAGEL, but I cannot assert that either.

drhpc commented 5 years ago

Thanks for the quick note. OK, I'll build Boost 1.70.0 and try again. Will be back on Monday, probably.

shiozaki commented 5 years ago

Sure - the other possibility is ICC doing something wrong. If that's the case, Intel MPI + MKL with GCC would do. Let me know how it goes.

drhpc commented 5 years ago

Same issue with Boost 1.70.0. Can you parse the error message? It looks like quite basic API breakage to me, but I am reminded that my tolerance for C++ template error poetry is low.

I wonder if this possibly could be triggered by some call to a Boost template from Bagel, where said call could be phrased differently to pick up the correct template. Some casting fun? I guess the more meaningful lines are those like

/sw/compiler/gcc-8.3.0/include/c++/8.3.0/bits/stl_pair.h(260): note: this candidate was rejected because at least one template argument could not be deduced
        constexpr pair(const _T1& __a, const _T2& __b)

I wonder what that means, exactly. I'd thought that basic breakage on the Intel Compiler side using GCC's STL headers would have shown itself earlier when building Boost.

I must admit that I don't fancy trying to coerce GCC into linking with MKL and Intel MPI. It's a strange combination. I might try an earlier GCC that does not make Boost complain about C++11 not being supported by default.
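
A possible reading of the "template argument could not be deduced" note: in GCC 8's stl_pair.h, the pair(const _T1&, const _T2&) constructor appears to be a constructor template whose extra parameters have defaults and an enable_if condition. If the compiler evaluates that condition as false, the candidate is dropped, and some compilers report this as a deduction failure. The sketch below illustrates the mechanism with a hypothetical my_pair type. The condition used here (copy constructibility) is a simplification, not the exact libstdc++ constraint.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical, simplified pair with a constrained two-argument constructor.
template <class T1, class T2>
struct my_pair {
    T1 first;
    T2 second;

    // The defaulted template parameters exist only to make the enable_if
    // condition dependent, so a false condition removes this candidate.
    template <class U1 = T1, class U2 = T2,
              typename std::enable_if<
                  std::is_copy_constructible<U1>::value &&
                  std::is_copy_constructible<U2>::value,
                  bool>::type = true>
    constexpr my_pair(const T1& a, const T2& b) : first(a), second(b) {}
};

// A type whose copy constructor is deleted, to make the constraint fail.
struct NoCopy {
    NoCopy() = default;
    NoCopy(const NoCopy&) = delete;
};

// Pointer-to-array is copy constructible, so the constructor is viable.
constexpr bool accepts_pointer_to_array() {
    return std::is_constructible<my_pair<const char*, int (*)[3]>,
                                 const char*, int (*)[3]>::value;
}

// With NoCopy the constraint is false and no two-argument candidate remains.
constexpr bool accepts_non_copyable() {
    return std::is_constructible<my_pair<const char*, NoCopy>,
                                 const char*, const NoCopy&>::value;
}
```

Under this reading, a pointer-to-array second member should satisfy the constraint, so a rejection for int (*)[3] would point at how the compiler evaluates the condition rather than at the calling code.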

shiozaki commented 5 years ago

o Further inspection of the log indicates that your compiler and boost are not compatible. You could try compiling the small test program at the end of this comment with appropriate link options.

o GCC + Intel MPI/MKL is a very standard practice (because it's the best free option), and many of the BAGEL users have been using it.

o Generally, we do not provide user support for this open-source version, unless there is a bug in the code. A commercial version from QSimulate will be released relatively soon with many upgrades and support. If you are interested, please write to info@qsimulate.com

#include <array>
#include <fstream>
#include <boost/serialization/serialization.hpp>
#include <boost/serialization/array.hpp>

class A { 
  std::array<int,3> a;

  friend class boost::serialization::access;
  template<class Archive> void serialize(Archive& ar, const unsigned int) { ar & a; }
};

#include <boost/archive/binary_oarchive.hpp>

int main() {
  A a;
  std::ofstream of("tmp");
  boost::archive::binary_oarchive oa(of);
  oa << a;
}

drhpc commented 5 years ago

Regarding free options, I'd rather eye GCC+OpenBLAS+OpenMPI. I honestly was not really aware of MKL being free to use. I've always seen it as part of the licensed Intel compiler suite at the sites I worked at. But that is off-topic here.

The code fragment you posted, is that supposed to be a test for my broken compiler setup? I can build that just fine, with the compiler/library setup apparent from

$ ldd boosttest
        linux-vdso.so.1 =>  (0x00007fffe0a64000)
        libboost_serialization.so.1.70.0 => /sw/env/gcc8intel-19.0.4_impi/boost/1.70.0/lib/libboost_serialization.so.1.70.0 (0x000014cb197b0000)
        libstdc++.so.6 => /sw/compiler/gcc-8.3.0/lib64/libstdc++.so.6 (0x000014cb19623000)
        libm.so.6 => /lib64/libm.so.6 (0x000014cb192f0000)
        libgcc_s.so.1 => /sw/compiler/gcc-8.3.0/lib64/libgcc_s.so.1 (0x000014cb192d6000)
        libc.so.6 => /lib64/libc.so.6 (0x000014cb18f09000)
        libdl.so.2 => /lib64/libdl.so.2 (0x000014cb18d05000)
        librt.so.1 => /lib64/librt.so.1 (0x000014cb18afd000)
        libimf.so => /sw/compiler/intel-19.0.4/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin/libimf.so (0x000014cb1855d000)
        libsvml.so => /sw/compiler/intel-19.0.4/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin/libsvml.so (0x000014cb16bb9000)
        libirng.so => /sw/compiler/intel-19.0.4/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin/libirng.so (0x000014cb16847000)
        libintlc.so.5 => /sw/compiler/intel-19.0.4/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x000014cb165d5000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x000014cb163b9000)
        /lib64/ld-linux-x86-64.so.2 (0x000014cb195fc000)

I also looked at the output of icpc -E and it does use the complaining header /sw/compiler/gcc-8.3.0/include/c++/8.3.0/bits/stl_pair.h, which is silent for this case.

Did you intend to post a different code example that is problematic?

About support: My intention is to get the software built at a user's request at our HPC site. If you don't see the possibility of a fault in Bagel code or the build system, I understand that you cannot help. Right now, I don't have a strong indication of the culprit. A commercial version is likely to be paired with the Intel compiler suite, which uses some modern GCC for the libstdc++ part (unless you decide to fully go for binary releases).

drhpc commented 5 years ago

To wrap this up: I saw this hint in the EasyBuild recipe (which I don't use):

# Note: A compiler bug(?) in template deduction prevents newer versions of icpc to compile this software.
toolchain = {'name': 'intel', 'version': '2016b'}

Too bad that nobody seems to follow up on this. People don't seem to fancy getting Intel to fix compiler bugs.

I now simply switched CXX, I_MPI_CXX etc. to GCC and built bagel in the same environment I used before. It picked up the correct Intel libraries and finished the build using the GNU compiler.

I need to convince it to burn in the correct RPATH to the non-MPI libs, but this is an issue with mpicxx that kills LD_RUN_PATH.

You may close this issue. At your discretion, you might want to get in contact with Intel to get this bug investigated that apparently persists up to version 19.0.4.

shiozaki commented 5 years ago

Thanks a lot for figuring this out. Intel products are generally reliable, but there are bugs... And they have never responded to my bug reports in the past. The most notable one is Intel MPI's RMA (MPI_Put/MPI_Get/MPI_Accumulate), which leaks memory significantly (you can see it with three lines of code and valgrind). In any event, I will close this issue.