GambitBSM / gambit_2.0

GAMBIT: The Global and Modular Beyond-the-Standard-Model Inference Tool
https://gambit.hepforge.org

make gambit error on a workstation (ws2) #3

Open aseshkdatta opened 3 years ago

aseshkdatta commented 3 years ago

Hi,

I am getting an error, perhaps related to gcc/boost, on a CentOS Linux-based workstation.

The tail of the error-message is attached herewith as a screenshot.

Would you kindly look into the matter?

The OS details and build-related commands I used are as follows.

CentOS Linux release 7.6.1810 (Core)
Linux ws2 3.10.0-957.12.1.el7.x86_64

The modules I was asked to load by the admin are

module load cmake-3
module load gcc-8.2

and ran 'cmake' in the following way.

CC=/opt/gcc-8.2/bin/gcc CXX=/opt/gcc-8.2/bin/g++ cmake .. -DGSL_CONFIG_EXECUTABLE=/ws2scratch/pkg/gsl_2.6/bin/gsl-config -DWITH_MPI=ON -DWITH_AXEL=ON

Thanks and best regards. Asesh

anderkve commented 3 years ago

Hi Asesh,

Looks like your screenshot didn't come through. Could you try posting it again? Also, it will probably be useful for us to see the cmake output and the content of the file CMakeCache.txt (from the build directory), so perhaps you could attach that as well.

Best, Anders

aseshkdatta commented 3 years ago

Hi Anders,

Thanks for your response.

I will try to post that screenshot on the GitHub page shortly.

In fact, I realized that there had been a problem with attaching the screenshot on the github page. I wrote a personal mail to Tomas informing him about that to which I attached the said screenshot.

Cheers. Asesh

============================= AseshKrishna Datta Professor 'H' Theoretical High Energy Physics Group Harish-Chandra Research Institute (HRI) (Department of Atomic Energy, Govt. of India) Allahabad (Prayagraj) UP INDIA 211019

aseshkdatta commented 3 years ago

Hi Anders,

Trying to attach the screenshot and it seems it has worked. Please confirm and let me know if it is readable. Thanks.

Asesh

screenshot-ws2-1

aseshkdatta commented 3 years ago

Dear Anders,

I just got access to the computer on which the other two files are there. I am now attaching them herewith.

Cheers. Asesh

CMakeOutput.log CMakeCache.txt

anderkve commented 3 years ago

Thanks for sending the extra output, Asesh. I haven't seen this error before, but my best guess is that you've encountered a known bug with older versions of Boost: https://stackoverflow.com/a/18900875

According to the stackoverflow post, upgrading to version 1.48 or later should hopefully solve the problem.

The GAMBIT cmake system currently accepts Boost versions all the way back to 1.41 (the version you are using), but based on this problem it looks like we need to increase that requirement to 1.48 or later.
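A quick way to see which Boost version an installation provides, without running cmake, is to read the BOOST_VERSION macro from its boost/version.hpp header. The following is a minimal sketch, not part of GAMBIT: the function names are hypothetical, and the header path in the usage comment is only an example based on the include path mentioned later in this thread.

```shell
# Print the Boost version (major.minor.patch) encoded in a boost/version.hpp file.
# Usage (example path): boost_version_from_header /opt/apps/boost/include/boost/version.hpp
boost_version_from_header() {
    # BOOST_VERSION is a single integer: major*100000 + minor*100 + patch.
    v=$(sed -n 's/^#define[[:space:]]\{1,\}BOOST_VERSION[[:space:]]\{1,\}\([0-9]\{1,\}\).*/\1/p' "$1" | head -n 1)
    [ -n "$v" ] || return 1
    echo "$((v / 100000)).$((v / 100 % 1000)).$((v % 100))"
}

# Succeed if the version in the header is at least MAJOR.MINOR.
# Usage: boost_at_least /path/to/boost/version.hpp 1 48
boost_at_least() {
    v=$(boost_version_from_header "$1") || return 1
    major=${v%%.*}
    rest=${v#*.}
    minor=${rest%%.*}
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}
```

With the suggested minimum of 1.48, `boost_at_least <header> 1 48` would fail for a 1.41 installation and succeed for 1.76.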

aseshkdatta commented 3 years ago

Thanks a lot, Anders, for your time and the observation. Let me try this out tomorrow. I'll be in touch with you with the outcome.

Cheers. Asesh

aseshkdatta commented 3 years ago

Hi,

The gambit build was rather smooth with boost version 1.76 (latest).

Now I am having issues in building gum.

The environment is the same in which I built gambit. Here it is. The folder under which I am working is

/ws2scratch/asesh/Packages/gambit_2.0-release_2.0/gum/build


[asesh@ws2 build]$module load cmake-3

[asesh@ws2 build]$module load gcc-8.2

[asesh@ws2 build]$source /opt/apps/anaconda3/bin/activate

(base) [asesh@ws2 build]$CC=/opt/gcc-8.2/bin/gcc CXX=/opt/gcc-8.2/bin/g++ cmake .. -DWITH_AXEL=ON -DGSL_DIR=/ws2scratch/pkg/gsl_2.6/ -DGSL_CONFIG_EXECUTABLE=/ws2scratch/pkg/gsl_2.6/bin/gsl-config -DBOOST_ROOT=/opt/apps/boost

The above cmake attempt gives the following error message.

CMake Warning at /opt/cmake-3.13.3-Linux-x86_64/share/cmake-3.13/Modules/FindBoost.cmake:880 (message):
  New Boost version may have incorrect or missing dependencies and imported targets
Call Stack (most recent call first):
  /opt/cmake-3.13.3-Linux-x86_64/share/cmake-3.13/Modules/FindBoost.cmake:1002 (_Boost_COMPONENT_DEPENDENCIES)
  /opt/cmake-3.13.3-Linux-x86_64/share/cmake-3.13/Modules/FindBoost.cmake:1670 (_Boost_MISSING_DEPENDENCIES)
  CMakeLists.txt:201 (find_package)

CMake Error at /opt/cmake-3.13.3-Linux-x86_64/share/cmake-3.13/Modules/FindBoost.cmake:2100 (message):
  Unable to find the requested Boost libraries.

  Boost version: 1.76.0

  Boost include path: /opt/apps/boost/include

  Could not find the following Boost libraries:

      boost_python37

  Some (but not all) of the required Boost libraries were found. You may need to install these additional Boost libraries. Alternatively, set BOOST_LIBRARYDIR to the directory containing Boost libraries or BOOST_ROOT to the location of Boost.
Call Stack (most recent call first):
  CMakeLists.txt:201 (find_package)

Setting GCC flags
-- Configuring incomplete, errors occurred!
See also "/ws2scratch/asesh/Packages/gambit_2.0-release_2.0/gum/build/CMakeFiles/CMakeOutput.log".
See also "/ws2scratch/asesh/Packages/gambit_2.0-release_2.0/gum/build/CMakeFiles/CMakeError.log".
(base) [asesh@ws2 build]

CMakeLists.txt:201 (find_package) points to the line with the entry

find_package(Boost 1.41.0 REQUIRED COMPONENTS ${python_component} filesystem system)

Also the boost installation indeed does not have something like "boost_python37.so" in its "lib" folder.

Attaching herewith some of the relevant files.

Please let me know how to proceed from here.

Thanks. Asesh

CMakeLists.txt CMakeCache.txt CMakeOutput.log CMakeError.log

aseshkdatta commented 3 years ago

Hi,

I could solve the boost-related issue by moving to its latest version (v1.76). Thanks for the hint. However, I had to then build boost using python3.7 such that "libboost_python37.so" is created. After these, building gum now works (without MPI).
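To confirm that a Boost installation actually provides the Python component that cmake asks for (boost_python37 in the error above), one can look for the matching shared library in its lib folder. A minimal sketch; the function name is hypothetical and the directory in the usage comment is only an example.

```shell
# Succeed if a Boost.Python shared library for the given Python version exists.
# Usage (example path): has_boost_python /opt/apps/boost/lib 3.7
has_boost_python() {
    libdir=$1
    # "3.7" -> "37", matching the boost_python37 component name.
    tag=$(printf '%s' "$2" | tr -d '.')
    for f in "$libdir"/libboost_python"$tag".so*; do
        # An unmatched glob stays literal, so test that the file really exists.
        [ -e "$f" ] && return 0
    done
    return 1
}
```

If the check fails, rebuilding Boost against the intended interpreter is one route, for instance with Boost's `./bootstrap.sh --with-python=/path/to/python3.7` followed by `./b2 install` (the path is a placeholder).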

However, I am having a new problem in enabling MPI. CMake for gambit apparently finds MPI (although not sure if it indeed finds the intended module) but `make -jn scanners' starts giving trouble.

I am attaching herewith the 'cmake' and 'make scanners' terminal outputs showing the issues, along with some of the CMake output files. Could you please look into them and let me know how I could circumvent these issues?

It may be noted that I could build gambit (and gum) successfully when I didn't opt for MPI enabling.

The environmental setup for me includes the following.


module load cmake-3
module load gcc-8.2
module load mpi/openmpi-x86_64
source /opt/apps/anaconda3/bin/activate

The "cmake" options I am using are the following.

(base) [asesh@ws2 build]$CC=/opt/gcc-8.2/bin/gcc CXX=/opt/gcc-8.2/bin/g++ cmake .. -DWITH_MPI=ON -DWITH_AXEL=ON -DGSL_DIR=/ws2scratch/pkg/gsl_2.6/ -DGSL_CONFIG_EXECUTABLE=/ws2scratch/pkg/gsl_2.6/bin/gsl-config -DBOOST_ROOT=/home/asesh/.local/boost_1_76_0/install

I am also not sure why Axel failed during 'make scanners'. I didn't come across this earlier.

Thanks in advance. Asesh

cmake-terminal-output.txt make-scanners-terminal-output.txt CMakeOutput.log CMakeError.log CMakeCache.txt

anderkve commented 3 years ago

Hi Asesh,

Good to hear that the Boost and Python issue was resolved. About the MPI issue:

1) I'm also not sure that cmake is picking up the MPI library you intend. While you load an 'openmpi' module, cmake seems to be picking up an Intel MPI library, based on the paths in this part of the cmake terminal output:

-- Found MPI_C: /opt/ICS_2013/impi/4.1.3.048/intel64/lib/libmpigf.so (found version "2.2") 
-- Found MPI_CXX: /opt/ICS_2013/impi/4.1.3.048/intel64/lib/libmpigc4.so (found version "2.2") 
-- Found MPI_Fortran: /opt/ICS_2013/impi/4.1.3.048/intel64/lib/libmpigf.so (found version "2.2") 

Could it be that you have loaded some Intel modules as well? Running module list might perhaps reveal some inconsistency / potential conflict in the set of modules you have loaded.

2) The compilation error you get when running `make -j16 scanners` seems to be coming from the PolyChord scanner. I'm not sure if you intend to run GAMBIT using PolyChord, but if not you can simply leave out this scanner and instead directly build the scanner you're interested in using, e.g. `make diver`, `make multinest`, etc.
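Regarding point 1) above, one way to see which MPI installation cmake has recorded is to list the MPI entries in CMakeCache.txt from the build directory. A minimal sketch; the function names are hypothetical, and the "/impi/" test is only a heuristic based on the Intel MPI paths shown in the cmake output above.

```shell
# List the MPI-related entries recorded in a CMakeCache.txt file.
# Usage: mpi_cache_entries build/CMakeCache.txt
mpi_cache_entries() {
    # Cache lines have the form NAME:TYPE=VALUE.
    grep -E '^MPI_[A-Za-z0-9_]+:[A-Z]+=' "$1"
}

# Succeed if any cached MPI entry points into an Intel MPI tree.
# Heuristic (an assumption): Intel MPI paths contain "/impi/".
cache_uses_intel_mpi() {
    mpi_cache_entries "$1" | grep -q '/impi/'
}
```

If the entries point somewhere unexpected, starting again from a fresh build directory (or deleting CMakeCache.txt) before re-running cmake avoids stale cached values.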

Tagging @williamjameshandley (main developer of PolyChord): Will, have you seen the below compilation error before? (Note that the error message suggests a fix -- simply reordering some includes.)

In file included from /opt/ICS_2013/impi/4.1.3.048/intel64/include/mpi.h:1279,
                 from interfaces.hpp:5,
                 from c_interface.cpp:1:
/opt/ICS_2013/impi/4.1.3.048/intel64/include/mpicxx.h:95:2: error: #error "SEEK_SET is #defined but must not be for the C++ binding of MPI. Include mpi.h before stdio.h"
 #error "SEEK_SET is #defined but must not be for the C++ binding of MPI. Include mpi.h before stdio.h"
  ^~~~~
/opt/ICS_2013/impi/4.1.3.048/intel64/include/mpicxx.h:99:2: error: #error "SEEK_CUR is #defined but must not be for the C++ binding of MPI. Include mpi.h before stdio.h"
 #error "SEEK_CUR is #defined but must not be for the C++ binding of MPI. Include mpi.h before stdio.h"
  ^~~~~
/opt/ICS_2013/impi/4.1.3.048/intel64/include/mpicxx.h:104:2: error: #error "SEEK_END is #defined but must not be for the C++ binding of MPI. Include mpi.h before stdio.h"
 #error "SEEK_END is #defined but must not be for the C++ binding of MPI. Include mpi.h before stdio.h"
  ^~~~~
make[5]: *** [c_interface.o] Error 1
make[5]: *** Waiting for unfinished jobs....
make[4]: *** [/ws2scratch/asesh/Packages/gambit_2.0/ScannerBit/installed/polychord/1.17.1/lib/libchord.so] Error 2
make[3]: *** [polychord_1.17.1-prefix/src/polychord_1.17.1-stamp/polychord_1.17.1-build] Error 2
make[2]: *** [CMakeFiles/polychord_1.17.1.dir/all] Error 2

aseshkdatta commented 3 years ago

Thanks a lot, Anders, for your observations. I am writing below in reference to the points you noted.

1) I'm also not sure that cmake is picking up the MPI library you intend. While you load an 'openmpi' module, cmake seems to be picking up an Intel MPI library, based on the paths in this part of the cmake terminal output:

-- Found MPI_C: /opt/ICS_2013/impi/4.1.3.048/intel64/lib/libmpigf.so (found version "2.2")
-- Found MPI_CXX: /opt/ICS_2013/impi/4.1.3.048/intel64/lib/libmpigc4.so (found version "2.2")
-- Found MPI_Fortran: /opt/ICS_2013/impi/4.1.3.048/intel64/lib/libmpigf.so (found version "2.2")

Could it be that you have loaded some Intel modules as well? Running module list might perhaps reveal some inconsistency / potential conflict in the set of modules you have loaded.

==> Running module list shows the following

   Currently Loaded Modulefiles:
      1) cmake-3              2) gcc-8.2              3) mpi/openmpi-x86_64

i.e., exactly the modules that I loaded explicitly by hand.

2) The compilation error you get when running `make -j16 scanners` seems to be coming from the PolyChord scanner. I'm not sure if you intend to run GAMBIT using PolyChord, but if not you can simply leave out this scanner and instead directly build the scanner you're interested in using, e.g. `make diver`, `make multinest`, etc.

==> Do you mean this is something to do with PolyChord with MPI? Earlier, when I built scanners and gambit without MPI, I faced no issue!

Let's also see what Will observes.

Checking a few other things in light of your observations. Thank you so much.

Asesh

anderkve commented 3 years ago

Hi Asesh,

==> Do you mean this is something to do with PolyChord with MPI? Earlier, when I built scanners and gambit without MPI, I faced no issue!

Exactly. When building with MPI, the macro USE_MPI will be defined, which means that the following lines at the beginning of ScannerBit/installed/polychord/1.17.1/src/polychord/interfaces.hpp

#include <string>
#include <vector>
#ifdef USE_MPI
#include "mpi.h"
#endif

will attempt to include "mpi.h", and that apparently causes the problem (according to the error message).

But again, unless you intend on using the PolyChord scanner in your GAMBIT run, you can just ignore this problem and directly compile the scanner you want to use, i.e. no need to build all the scanners with make scanners.

aseshkdatta commented 3 years ago

Thanks a lot, Anders, for your observation.

It seems that even when I skip "make -jn scanners" and just run "make diver", building gambit runs into a similar issue.

I am attaching a part of the screen output to show that. Is the unwarranted Intel MPI playing the spoilsport?

Kindly observe.

Cheers. Asesh

gambit-with-only-diver-scanner.txt

anderkve commented 3 years ago

Hi, thanks for the screen output. Indeed, this is clearly a general problem and PolyChord was just the first place it appeared.

Is the unwarranted Intel MPI playing the spoilsport?

Yes, I think that's correct. I've never encountered this problem before, but based on what info I can find elsewhere it seems to be an Intel-related issue. And I guess the reason you see it is probably because your current setup ends up mixing gcc compilers with Intel MPI. So the best approach is probably to help cmake identify the correct MPI library. As discussed in more detail in the cmake documentation, https://cmake.org/cmake/help/v3.3/module/FindMPI.html, there are two ways to do this:

Option 1) Use the cmake flags -DMPI_C_COMPILER, -DMPI_CXX_COMPILER and -DMPI_Fortran_COMPILER to point cmake to the OpenMPI compiler wrappers. Then cmake should try to figure out any other MPI variables automatically.

Option 2) Manually set all the flags -DMPI_<lang>_LIBRARIES and -DMPI_<lang>_INCLUDE_PATH to point to the correct paths (where <lang> is C, CXX and Fortran).

Option 1 is probably the easiest. So that would mean adding some cmake flags along the lines of -DMPI_C_COMPILER=/your/path/to/the/openmpi/wrapper/compiler/mpicc, and similar for the CXX (mpicxx) and Fortran (mpifort or mpif90) wrapper compilers. Actually, after you've loaded the openmpi module, it might be enough to simply add -DMPI_C_COMPILER=mpicc -DMPI_CXX_COMPILER=mpicxx -DMPI_Fortran_COMPILER=mpifort (or mpif90), without the full paths.
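Option 1 can be sketched as a small helper that assembles the three flags from the directory holding the MPI wrapper compilers. This is only an illustration: the function name is hypothetical, and it assumes the wrappers are called mpicc, mpicxx and mpifort or mpif90, as mentioned above.

```shell
# Build the FindMPI hint flags from a directory holding the MPI compiler wrappers.
# Usage: mpi_cmake_flags /path/to/openmpi/bin
mpi_cmake_flags() {
    bindir=$1
    # Prefer mpifort and fall back to mpif90 if it is not present.
    if [ -x "$bindir/mpifort" ]; then
        fc="$bindir/mpifort"
    else
        fc="$bindir/mpif90"
    fi
    printf '%s %s %s\n' \
        "-DMPI_C_COMPILER=$bindir/mpicc" \
        "-DMPI_CXX_COMPILER=$bindir/mpicxx" \
        "-DMPI_Fortran_COMPILER=$fc"
}
```

After loading the intended openmpi module, the flags could then be passed along the lines of `cmake .. -DWITH_MPI=ON $(mpi_cmake_flags "$(dirname "$(command -v mpicc)")")`, ideally from a fresh build directory. The unquoted substitution relies on word splitting, so it assumes the wrapper path contains no spaces.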

aseshkdatta commented 3 years ago

Dear Anders,

So many thanks for taking the trouble in going into the details of the issue and for coming up with the possible ways to circumvent the same.

I'll let you know by tomorrow how my attempts at following your suggestion go.

Cheers. Asesh

aseshkdatta commented 3 years ago

Hi,

The options

"-DMPI_C_COMPILER=mpicc -DMPI_CXX_COMPILER=mpicxx -DMPI_Fortran_COMPILER=mpif90"

worked with one particular openmpi module out of 3 of them that are available on the system. So Gambit build was successful and I relent for the time being:) Thank you so much.

However, plotting with "pippi" ("pippi MDMSM.pip") is showing the following error which I am not yet able to fix. Could you please observe.

......................................................
Writing scripts for 2D plots of quantities [4, 200]
Writing scripts for 2D plots of quantities [4, 201]

Running plotting scripts for 1D plots of quantity  1

/ws2scratch/asesh/Packages/ruby-install/bin/ruby: symbol lookup error: /ws2scratch/asesh/Packages/ruby-install/lib/ruby/gems/3.0.0/gems/tioga-1.19.1/lib/Dobjects/Dvector.so: undefined symbol: rb_safe_level

Running pippi failed in plot operation. Command 'cd results/MDMSM/scripts/; ./MDMSM_1_like1D.bsh' returned non-zero exit status 127

Asesh