plumed / plumed2

Development version of plumed 2
https://www.plumed.org
GNU Lesser General Public License v3.0

Unable to compile Libtorch with Plumed #1077

Open Esenmira opened 1 month ago

Esenmira commented 1 month ago

Hello, I am trying to build Plumed 2.9.0 with Pytorch/Libtorch (CPU version), with the goal of using it with CPMD to run enhanced-sampling MD with machine-learned collective variables (generated by mlcolvar using Pytorch). When I run ./configure, I get different issues depending on the choice of C++ compiler. I have tried both the 2.0.0 and 2.3.0 versions of Pytorch/Libtorch, with and without the C++11 ABI, to no avail. The machine on which I am trying to install this runs Rocky Linux release 9.2 (Blue Onyx).

I also tried installing Plumed, Libtorch and Pytorch using conda (which works), but as I understand it, Plumed installed this way is precompiled and cannot be set up with optional modules. Is compiling from source the only way to get optional modules?

I have set the environment variables as indicated in the installation guide. Here is the ./configure command that I run (based on recommendations in earlier issues):

./configure --prefix=/home/ac276447/mimic_sources/plumed-2.9.0/plumed-install/ --enable-libtorch LDFLAGS="-L/home/ac276447/mimic_sources/other-libtorch/libtorch-2.3.0-cxx11/lib"  CPPFLAGS='-I/home/ac276447/mimic_sources/other-libtorch/libtorch-2.3.0-cxx11/include -I/home/ac276447/mimic_sources/other-libtorch/libtorch-2.3.0-cxx11/include/torch/csrc/api/include/'
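For reference, the same command can be written with the Libtorch root factored out, which makes it easier to switch between the 2.0.0 and 2.3.0 trees. This is only a sketch: the variable names (LIBTORCH, PREFIX, TORCH_CPPFLAGS, TORCH_LDFLAGS) and the configure_plumed helper are made up for illustration, and the paths are the ones from the command above.

```shell
# Root of the unpacked Libtorch tree and the Plumed install prefix
# (both copied from the ./configure command above).
LIBTORCH=/home/ac276447/mimic_sources/other-libtorch/libtorch-2.3.0-cxx11
PREFIX=/home/ac276447/mimic_sources/plumed-2.9.0/plumed-install

# Two include directories: the top-level one and the C++ API one.
TORCH_CPPFLAGS="-I${LIBTORCH}/include -I${LIBTORCH}/include/torch/csrc/api/include"
TORCH_LDFLAGS="-L${LIBTORCH}/lib"

# Hypothetical helper: runs configure with the flags assembled above.
# It is only defined here, not called, so the flags can be inspected first.
configure_plumed() {
    ./configure --prefix="${PREFIX}" --enable-libtorch \
        LDFLAGS="${TORCH_LDFLAGS}" CPPFLAGS="${TORCH_CPPFLAGS}"
}

# Show what would be passed to configure.
printf 'CPPFLAGS=%s\nLDFLAGS=%s\n' "${TORCH_CPPFLAGS}" "${TORCH_LDFLAGS}"
```

Calling configure_plumed from the Plumed source directory then runs the same configuration as the one-line command.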

Adding the --enable-modules=pytorch option does not change anything.

The following commands have all been run with Libtorch 2.3.0, C++11 ABI, CPU version.

With default (mpiicpc) and mpiicc:

mpiicpc is set as the CXX environment variable, because it is what I found to work with other programs. Setting it explicitly in ./configure does not change anything. The following lines appear when running ./configure, and the checks complete almost instantly:

checking whether mpiicpc accepts -std=c++14... yes
checking libtorch without extra libs... no
checking libtorch with  -ltorch_cpu -lc10... no
configure: WARNING: cannot enable __PLUMED_HAS_LIBTORCH

plus a bunch of errors looking like this:

In file included from /usr/local/install/gcc-13.1.0/include/c++/13.1.0/cwchar(44),
                 from /usr/local/install/gcc-13.1.0/include/c++/13.1.0/bits/postypes.h(40),
                 from /usr/local/install/gcc-13.1.0/include/c++/13.1.0/iosfwd(42),
                 from /usr/local/install/gcc-13.1.0/include/c++/13.1.0/ios(40),
                 from /usr/local/install/gcc-13.1.0/include/c++/13.1.0/ostream(40),
                 from /usr/local/install/gcc-13.1.0/include/c++/13.1.0/iostream(41),
                 from conftest.cpp(1):
/usr/include/wchar.h(397): error: identifier "_Float32" is undefined
  extern _Float32 wcstof32 (const wchar_t *__restrict __nptr,

I am guessing that Plumed tries to use gcc at some point (?).
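To see which configure test these errors belong to, the Libtorch part of config.log can be isolated. The sketch below assumes the usual autoconf log layout, where each test starts with a "checking ..." line and ends with a "result: ..." line; the function name libtorch_check_errors is made up for illustration.

```shell
# Print, for each "checking libtorch" test in config.log, the header line,
# every line mentioning an error, and the final result line.
libtorch_check_errors() {
    # $1: path to config.log
    awk '
        /checking libtorch/   { in_check = 1; print; next }  # start of a libtorch test
        in_check && /result:/ { print; in_check = 0; next }  # end of that test
        in_check && /error/   { print }                      # compiler/linker errors in between
    ' "$1"
}
```

Usage would be `libtorch_check_errors config.log | head -n 40`, run in the directory where ./configure was executed.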

There is also this error when running it with Libtorch 2.3.0:

/home/ac276447/mimic_sources/other-libtorch/libtorch-2.3.0-cxx11/include/torch/csrc/api/include/torch/all.h:4:2: error: #error C++17 or later compatible compiler is required to use PyTorch. 
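Read together with the "accepts -std=c++14" line in the configure output, this error points to a language-standard mismatch rather than a path problem. A small sketch that makes the comparison explicit is given below; the function names accepted_std and std_is_sufficient are made up, and the required value 17 is taken from the error message itself. Whether Plumed 2.9.0 can be built with a newer standard is not something this check establishes.

```shell
# Highest C++ standard that configure reported as accepted.
# Reads configure output on stdin, prints the number after "-std=c++".
accepted_std() {
    sed -n 's/.*accepts -std=c++\([0-9][0-9]*\)\.\.\. yes.*/\1/p' | sort -n | tail -n 1
}

# Succeeds if the accepted standard ($1) is at least the required one ($2).
std_is_sufficient() {
    [ "$1" -ge "$2" ]
}

# Example with the line from the configure output quoted earlier.
std=$(printf '%s\n' 'checking whether mpiicpc accepts -std=c++14... yes' | accepted_std)
if std_is_sufficient "$std" 17; then
    echo "-std=c++${std} satisfies the C++17 requirement"
else
    echo "-std=c++${std} is older than the C++17 that the Libtorch 2.3.0 headers ask for"
fi
```

On a real run, the function can be fed the saved configure output, for example `./configure ... 2>&1 | tee configure.out` followed by `accepted_std < configure.out`.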

Invoking the compiler with -v gives:

mpiicc for the Intel(R) MPI Library 2021.10 for Linux*
Copyright Intel Corporation.
icc: remark #10441: The Intel(R) C++ Compiler Classic (ICC) is deprecated and will be removed from product release in the second half of 2023. The Intel(R) oneAPI DPC++/C++ Compiler (ICX) is the recommended compiler moving forward. Please transition to use this compiler. Use '-diag-disable=10441' to disable this message.
icc version 2021.10.0 (gcc version 13.1.0 compatibility)

With gcc

When setting CXX=gcc, the "checking libtorch" lines give the same output but take much longer to complete (around 10 s each), the gcc-related errors disappear, and the config.log file is much bigger (32 MB compared to 650 kB). As expected, Plumed reports that it will not be configured with MPI.

gcc -v gives:

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/local/install/gcc-13.1.0/libexec/gcc/x86_64-pc-linux-gnu/13.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --prefix=/usr/local/install/gcc-13.1.0 --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.1.0 (GCC) 

With mpic++, mpiCC, mpicxx, mpigxx, mpicc, mpigcc

Same as with gcc, except that the message about MPI not being configured does not appear.

Invoking them with -v gives (mpic++, mpiCC):

Using built-in specs.
COLLECT_GCC=/usr/local/install/gcc-13.1.0/bin/g++
COLLECT_LTO_WRAPPER=/usr/local/install/gcc-13.1.0/libexec/gcc/x86_64-pc-linux-gnu/13.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --prefix=/usr/local/install/gcc-13.1.0 --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.1.0 (GCC) 

Or this (mpicxx, mpigxx):

mpigxx for the Intel(R) MPI Library 2021.10 for Linux*
Copyright Intel Corporation.
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/local/install/gcc-13.1.0/libexec/gcc/x86_64-pc-linux-gnu/13.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --prefix=/usr/local/install/gcc-13.1.0 --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.1.0 (GCC)

Or this (mpicc, mpigcc):

mpigcc for the Intel(R) MPI Library 2021.10 for Linux*
Copyright Intel Corporation.
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/local/install/gcc-13.1.0/libexec/gcc/x86_64-pc-linux-gnu/13.1.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../configure --prefix=/usr/local/install/gcc-13.1.0 --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 13.1.0 (GCC)
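The -v outputs above can be condensed into one line or two per wrapper, which makes it easier to see which underlying driver each wrapper calls. This is a sketch only: describe_wrapper is a made-up helper, and it relies on nothing more than the COLLECT_GCC and "version" lines that appear in the outputs quoted here.

```shell
# Print the driver and version lines from "<wrapper> -v" for one wrapper.
describe_wrapper() {
    # $1: name of a compiler or MPI wrapper
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "$1: not found"
        return 0
    fi
    # -v writes to stderr, so merge it into stdout before filtering.
    "$1" -v 2>&1 | grep -E '^COLLECT_GCC=|version' | sed "s|^|$1: |"
}

# Wrappers mentioned in this report.
for w in mpic++ mpiCC mpicxx mpigxx mpicc mpigcc mpiicx mpiicpx; do
    describe_wrapper "$w"
done
```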

With mpiicx and mpiicpx

Same output as above, but the "checking libtorch" lines complete much more quickly and the config.log file is much smaller (only 225 kB).

I believe these compilers do not work at all for me as invoking them with -v gives:

Intel(R) oneAPI DPC++/C++ Compiler 2023.2.0 (2023.2.0.20230622)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /usr/local/install/intel/intel-hpc-2023.2.0/compiler/2023.2.0/linux/bin-llvm
Configuration file: /usr/local/install/intel/intel-hpc-2023.2.0/compiler/2023.2.0/linux/bin-llvm/../bin/icpx.cfg
Found candidate GCC installation: /usr/local/install/gcc-13.1.0/lib/gcc/x86_64-pc-linux-gnu/13.1.0
Selected GCC installation: /usr/local/install/gcc-13.1.0/lib/gcc/x86_64-pc-linux-gnu/13.1.0
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "/usr/bin/ld" --hash-style=gnu --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 -o a.out /lib/../lib64/crt1.o /lib/../lib64/crti.o /usr/local/install/gcc-13.1.0/lib/gcc/x86_64-pc-linux-gnu/13.1.0/crtbegin.o -L/usr/local/install/intel/intel-hpc-2023.2.0/mpi/2021.10.0/lib/release -L/usr/local/install/intel/intel-hpc-2023.2.0/mpi/2021.10.0/lib -L/usr/local/install/intel/intel-hpc-2023.2.0/compiler/2023.2.0/linux/compiler/lib/intel64_lin -L/usr/local/install/intel/intel-hpc-2023.2.0/compiler/2023.2.0/linux/bin-llvm/../lib -L/usr/local/install/intel/intel-hpc-2023.2.0/compiler/2023.2.0/linux/compiler/lib/intel64_lin -L/usr/local/install/gcc-13.1.0/lib/gcc/x86_64-pc-linux-gnu/13.1.0 -L/usr/local/install/gcc-13.1.0/lib/gcc/x86_64-pc-linux-gnu/13.1.0/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/local/install/gcc-13.1.0/lib/gcc/x86_64-pc-linux-gnu/13.1.0/../../.. -L/usr/local/install/intel/intel-hpc-2023.2.0/compiler/2023.2.0/linux/bin-llvm/../lib -L/lib -L/usr/lib -L/usr/local/install/intel/intel-hpc-2023.2.0/tbb/2021.10.0/env/../lib/intel64/gcc4.8 -L/usr/local/install/intel/intel-hpc-2023.2.0/mpi/2021.10.0//libfabric/lib -L/usr/local/install/intel/intel-hpc-2023.2.0/mpi/2021.10.0//lib/release -L/usr/local/install/intel/intel-hpc-2023.2.0/mpi/2021.10.0//lib -L/usr/local/install/intel/intel-hpc-2023.2.0/mkl/2023.2.0/lib/intel64 -L/usr/local/install/intel/intel-hpc-2023.2.0/ippcp/2021.8.0/lib/intel64 -L/usr/local/install/intel/intel-hpc-2023.2.0/ipp/2021.9.0/lib/intel64 -L/usr/local/install/intel/intel-hpc-2023.2.0/dnnl/2023.2.0/cpu_dpcpp_gpu_dpcpp/lib -L/usr/local/install/intel/intel-hpc-2023.2.0/dal/2023.2.0/lib/intel64 -L/usr/local/install/intel/intel-hpc-2023.2.0/compiler/2023.2.0/linux/compiler/lib/intel64_lin -L/usr/local/install/intel/intel-hpc-2023.2.0/compiler/2023.2.0/linux/lib -L/usr/local/install/intel/intel-hpc-2023.2.0/ccl/2021.10.0/lib/cpu_gpu_dpcpp 
-L/home/ac276447/mimic_sources/other-libtorch/libtorch-2.0.0-precxx11/lib -L/home/ac276447/.local/lib -L. --enable-new-dtags -rpath /usr/local/install/intel/intel-hpc-2023.2.0/mpi/2021.10.0/lib/release -rpath /usr/local/install/intel/intel-hpc-2023.2.0/mpi/2021.10.0/lib -lmpicxx -lmpifort -lmpi -ldl -lrt -lpthread -Bstatic -lsvml -Bdynamic -Bstatic -lirng -Bdynamic -lstdc++ -Bstatic -limf -Bdynamic -lm -lgcc_s -lgcc -Bstatic -lirc -Bdynamic -ldl -lgcc_s -lgcc -lc -lgcc_s -lgcc -Bstatic -lirc_s -Bdynamic /usr/local/install/gcc-13.1.0/lib/gcc/x86_64-pc-linux-gnu/13.1.0/crtend.o /lib/../lib64/crtn.o
/usr/bin/ld: /lib/../lib64/crt1.o: in function `_start':
(.text+0x1b): undefined reference to `main'
icpx: error: linker command failed with exit code 1 (use -v to see invocation)
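Note that the linker line above contains libraries but no source or object file, so the "undefined reference to `main'" message may simply come from calling the wrapper with -v alone, and may not say much about whether the compiler works. A more direct check is to compile and run a tiny program. The sketch below assumes nothing beyond a POSIX shell; smoke_test_cxx is a made-up helper name.

```shell
# Compile, link and run a minimal C++ program with the given compiler.
smoke_test_cxx() {
    # $1: C++ compiler or MPI wrapper; remaining arguments are passed as flags
    cxx=$1; shift
    if ! command -v "$cxx" >/dev/null 2>&1; then
        echo "$cxx: not found"
        return 1
    fi
    dir=$(mktemp -d)
    cat > "$dir/hello.cpp" <<'EOF'
#include <iostream>
int main() { std::cout << "ok" << std::endl; return 0; }
EOF
    # Build the program, then run it; report the first step that fails.
    if "$cxx" "$@" "$dir/hello.cpp" -o "$dir/hello" && "$dir/hello"; then
        echo "$cxx: compiles, links and runs"
    else
        echo "$cxx: failed"
    fi
    rm -rf "$dir"
}
```

For example, `smoke_test_cxx mpiicpx` or `smoke_test_cxx mpiicpc -std=c++17`; the second form also shows whether the compiler copes with the standard that Libtorch 2.3.0 asks for.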

Other potentially concerning warnings that I consistently see in the ./configure output (just before the "checking libtorch" lines) are:

checking fftw3.h usability... no
checking fftw3.h presence... no
checking for fftw3.h... no
configure: WARNING: cannot enable __PLUMED_HAS_FFTW
checking for python... python
configure: Python executable is python
checking support for required python modules (python3, setuptools, cython)... no
configure: WARNING: cannot enable python interface
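The Python warning can be narrowed down by testing the pieces that the message names. The sketch below only mirrors that message (python3, setuptools, cython) and is not Plumed's actual configure test; check_python_modules is a made-up helper name.

```shell
# Report the interpreter's major version and whether setuptools and Cython import.
check_python_modules() {
    # $1: Python executable to test (defaults to "python", the one configure picked)
    py=${1:-python}
    if ! command -v "$py" >/dev/null 2>&1; then
        echo "$py: not found"
        return 1
    fi
    # The message asks for python3, so print the major version.
    "$py" -c 'import sys; print("major version: %d" % sys.version_info[0])'
    # Import names: setuptools, and Cython with a capital C.
    for mod in setuptools Cython; do
        if "$py" -c "import $mod" >/dev/null 2>&1; then
            echo "$mod: ok"
        else
            echo "$mod: missing"
        fi
    done
}
```

Running `check_python_modules python` and `check_python_modules python3` side by side shows whether the plain "python" executable found by configure is the limiting factor.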

Does anyone have ideas about how to solve this? Is it due to a missing option or a wrong version of the compiler? Please let me know what further tests I could run or what details I could provide to help solve this.

Attached: the config.log files obtained with mpic++, mpiicx, and mpiicc with Libtorch 2.3.0 CPU version. plumed_conf_issues.zip