OPM / LBPM

Pore scale modelling
https://lbpm-sim.org/
GNU General Public License v3.0

LBPM_Installation Error with cmake of LBPM BUILD #79

Open EnochMayor opened 1 year ago

EnochMayor commented 1 year ago

Hello Professor McClure,

I am interested in the lattice Boltzmann method as a numerical modelling tool for my PhD research on underground carbon storage. While carrying out my research, I came across your paper "The LBPM software package for simulating multiphase flow on digital images of porous rocks". I found it very interesting, particularly how you performed two-phase simulation directly on digital rocks, which is something I want to do.

I am currently having trouble installing LBPM on my HPC cluster because I do not have the privileges to install some of the packages that require sudo.

This was the error I got:

    -- The CXX compiler identification is unknown
    CMake Error at CMakeLists.txt:20 (PROJECT):
      The CMAKE_CXX_COMPILER:

        /opt/openmpi/3.1.2/bin/mpicxx

      is not a full path to an existing compiler tool.

      Tell CMake where to find the compiler by setting either the environment
      variable "CXX" or the CMake cache entry CMAKE_CXX_COMPILER to the full path
      to the compiler, or to the compiler name if it is in the PATH.

    -- Configuring incomplete, errors occurred!

I have also attached the CMakeCache.txt, CMakeError.log and CMakeOutput.log files for reference.

I followed the steps outlined in the LBPM_Installion.sh file. I did not run it as a batch file but followed it step by step until I reached the point requiring a sudo install, for which I do not have privileges. I also did not install SILO since you specified it as optional. However, I have not been successful in installing LBPM and would like some help.

I connected to the Pawsey group since I guess this is an issue with different configurations for different HPC platforms (I am using the Cedar cluster under the Digital Research Alliance of Canada).

I guess my question is how to resolve this.

I would be glad to get feedback from you. [CMakeCache.txt](https://github.com/OPM/LBPM/files/12146255/CMakeCache.txt)

Warm regards, Enoch

JamesEMcClure commented 1 year ago

This issue occurs because the MPI compiler cannot be found on your system (at least not at the specific path provided):


    /opt/openmpi/3.1.2/bin/mpicxx

  is not a full path to an existing compiler tool.

To see whether MPI is already installed on the system where you are working, run the command `which mpicxx`, which will return the path that you should use. Usually it is necessary to modify the path to match what is available on your system.

If MPI is not already available on the system where you are working you will need to install it. Let me know if this is the case and I can provide more information.

EnochMayor commented 1 year ago

Thanks for the feedback Professor McClure,

This was the output I got for the command _which mpicxx_:

/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/openmpi/3.1.2/bin/mpicxx

Does that imply that instead of using

export MPI_DIR=/opt/openmpi/3.1.2

I would be using

export MPI_DIR=/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/gcc7.3/openmpi/3.1.2/bin/mpicxx
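On the `MPI_DIR` question: the original setting, `/opt/openmpi/3.1.2`, is the directory that contains `bin/mpicxx` rather than the compiler itself, so the equivalent value on this cluster would presumably be the `which mpicxx` path with the trailing `/bin/mpicxx` removed. A minimal Python sketch of that derivation follows. The helper name `mpi_prefix_from_compiler` is hypothetical, and the code assumes the usual `<prefix>/bin/<compiler>` layout.

```python
import shutil
from pathlib import Path


def mpi_prefix_from_compiler(compiler_path: str) -> str:
    """Return the install prefix for a compiler located at <prefix>/bin/<name>.

    Hypothetical helper: it assumes the conventional layout in which the
    MPI compiler wrapper sits in a "bin" directory under the install prefix.
    """
    path = Path(compiler_path)
    if path.parent.name != "bin":
        raise ValueError(f"{compiler_path} is not inside a 'bin' directory")
    return str(path.parent.parent)


if __name__ == "__main__":
    # shutil.which performs the same PATH lookup as the shell's `which`.
    found = shutil.which("mpicxx")
    if found is None:
        print("mpicxx was not found on PATH; an MPI module may need to be loaded")
    else:
        print(f"export MPI_DIR={mpi_prefix_from_compiler(found)}")
```

Printing the `export` line rather than setting the variable keeps the sketch side-effect free; the output can be pasted into the build script after checking that it looks right.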

EnochMayor commented 1 year ago

Dear Professor McClure, thanks for the feedback. I have implemented those corrections.

However, I have another issue: CMake 3.9 or higher is required. You are running version 3.8.2

From the CMakeLists.txt file:

# Set some CMake properties
CMAKE_MINIMUM_REQUIRED( VERSION 3.9 )
if( ${CMAKE_VERSION} VERSION_GREATER_EQUAL "3.20.0")
    CMAKE_POLICY( SET CMP0115 OLD )
endif()

Do I adjust the CMake properties in the file so that it compiles with my version? Would changing the version result in a compilation error?

Warm Regards Enoch

mcclurej commented 1 year ago

Are you building for CPU or for GPU?

You may get away with version 3.8 for a CPU-build by making the change below. For a GPU build you may need to upgrade to a newer version.

CMAKE_MINIMUM_REQUIRED( VERSION 3.8 )
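Before editing `CMakeLists.txt`, it can help to confirm which CMake the build actually picks up, since clusters often provide several versions through modules. The sketch below parses the first line of `cmake --version` output and compares it with the 3.9 minimum quoted from `CMakeLists.txt`. The function names are hypothetical and only the standard library is used.

```python
import re
import subprocess

# Minimum version taken from the CMAKE_MINIMUM_REQUIRED line quoted in this thread.
REQUIRED = (3, 9)


def parse_cmake_version(version_output: str) -> tuple:
    """Extract (major, minor, patch) from text such as 'cmake version 3.8.2'."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", version_output)
    if match is None:
        raise ValueError("could not find a version number in the output")
    # A missing patch component is treated as 0.
    return tuple(int(part) for part in match.groups("0"))


def meets_minimum(version, minimum=REQUIRED) -> bool:
    """Compare only as many components as the minimum specifies."""
    return tuple(version[: len(minimum)]) >= tuple(minimum)


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["cmake", "--version"], capture_output=True, text=True, check=True
        )
    except (FileNotFoundError, subprocess.CalledProcessError):
        print("cmake could not be run from the current PATH")
    else:
        version = parse_cmake_version(result.stdout)
        status = "meets" if meets_minimum(version) else "is below"
        print(f"CMake {'.'.join(map(str, version))} {status} the 3.9 minimum")
```

If the reported version is below the minimum, loading a newer CMake module, where the cluster offers one, avoids changing the project file at all.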



EnochMayor commented 1 year ago

I was building for CPU and was able to compile after loading a more recent version of the cmake module. However, I am having trouble running the simulation.

Running lbpm_color_simulator for the DiscPack example gives the error:

Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
Unhandled signal (15) caught:

Bytes used = 10392384
Stack Trace:
 [1] 0x000000443c9a:  lbpm_color_simulator                                    _start  start.S:122
   [1] 0x00000044278b:  lbpm_color_simulator                                      main
     [1] 0x0000004ee13f:  lbpm_color_simulator            ScaLBL_ColorModel::ReadInput()
       [1] 0x0000005ab090:  lbpm_color_simulator        Domain::Decomp(std::string const&)
         [1] 0x0000004a441c:  lbpm_color_simulator  void Utilities::MPI::recv<char>(char*, int&, int, bool, int) const
           [1] 0x7ff32458d6a5:          libmpi.so.40                                 PMPI_Recv
             [1] 0x7ff3246849ba:          libmpi.so.40                          mca_pml_ob1_recv
               [1] 0x7ff322775a85:     libopen-pal.so.40                         ompi_sync_wait_mt
                 [1] 0x7ff32276f34c:     libopen-pal.so.40                             opal_progress
                   [1] 0x7ff3245c700e:          libmpi.so.40                 ompi_coll_libnbc_progress
                     [1] 0x000000541f74:  lbpm_color_simulator  StackTrace::terminateFunctionSignal(int)
                       [1] 0x0000005406fd:  lbpm_color_simulator                   StackTrace::backtrace()
 [2] 0x7ff32324216f:             libc.so.6                                     clone
   [2] 0x7ff323ad41f4:       libpthread.so.0
     [1] 0x7ff323242753:             libc.so.6                                epoll_wait
     | [1] 0x000000501de1:  lbpm_color_simulator                                            <artificial>
     [1] 0x7ff32455b637:          libmpi.so.40                            ompi_mpi_abort
       [1] 0x7ff322b8c946:     libopen-rte.so.40                    orte_errmgr_base_abort
         [1] 0x7ff3227ea047:     libopen-pal.so.40                              pmix2x_abort
           [1] 0x7ff32283e2b5:     libopen-pal.so.40                OPAL_MCA_PMIX2X_PMIx_Abort
             [1] 0x7ff323ad9eff:       libpthread.so.0                         pthread_cond_wait
               [1] 0x000000501de1:  lbpm_color_simulator                                            <artificial>
Unhandled signal (15) caught:

Bytes used = 10392128
Stack Trace:
 [1] 0x000000443c9a:  lbpm_color_simulator                                    _start  start.S:122
   [1] 0x00000044278b:  lbpm_color_simulator                                      main
     [1] 0x0000004ee13f:  lbpm_color_simulator            ScaLBL_ColorModel::ReadInput()
       [1] 0x0000005ab090:  lbpm_color_simulator        Domain::Decomp(std::string const&)
         [1] 0x0000004a441c:  lbpm_color_simulator  void Utilities::MPI::recv<char>(char*, int&, int, bool, int) const
           [1] 0x7f654b1256a5:          libmpi.so.40                                 PMPI_Recv
             [1] 0x7f654b21c9ba:          libmpi.so.40                          mca_pml_ob1_recv
               [1] 0x7f654930da85:     libopen-pal.so.40                         ompi_sync_wait_mt
                 [1] 0x7f654930734c:     libopen-pal.so.40                             opal_progress
                   [1] 0x000000541f74:  lbpm_color_simulator  StackTrace::terminateFunctionSignal(int)
                     [1] 0x0000005406fd:  lbpm_color_simulator                   StackTrace::backtrace()
 [2] 0x7f6549dda16f:             libc.so.6                                     clone
   [2] 0x7f654a66c1f4:       libpthread.so.0
     [1] 0x7f6549dda753:             libc.so.6                                epoll_wait
     | [1] 0x000000501de1:  lbpm_color_simulator                                            <artificial>
     [1] 0x7f654b0f3637:          libmpi.so.40                            ompi_mpi_abort
       [1] 0x7f6549724946:     libopen-rte.so.40                    orte_errmgr_base_abort
         [1] 0x7f6549382047:     libopen-pal.so.40                              pmix2x_abort
           [1] 0x7f65493d62b5:     libopen-pal.so.40                OPAL_MCA_PMIX2X_PMIx_Abort
             [1] 0x7f654a671eff:       libpthread.so.0                         pthread_cond_wait
               [1] 0x000000501de1:  lbpm_color_simulator                                            <artificial>
Warning: getGlobalCallStacks called without call to globalCallStackInitialize
Warning: getGlobalCallStacks called without call to globalCallStackInitialize
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node cedar5 exited on signal 6 (Aborted).
EnochMayor commented 1 year ago

Is this an issue with the visualization part of the code? (I did not install SILO since it was optional.)

Regards Enoch

JamesEMcClure commented 1 year ago

Hi Enoch,

It appears that this job failed because signal 15 was sent by some process (i.e. Unhandled signal (15) caught). I am confident that this is not to do with SILO. However, I'm not sure why it has happened. Can you share how you are launching LBPM?

EnochMayor commented 1 year ago

Thanks Prof. McClure, I was able to resolve this. I think it had something to do with my batch script and the number of processes I was running.

Thanks for the help. I am currently trying to replicate the simulation discussed in the wiki on digital rock images. After domain decomposition, which outputs 64 case files, I find it laborious to automate reading each one as input in my batch script.

I am also not clear what you mean by this "The parameters for the model are specified by adding a section to th input database, labeled as MRT" (https://github.com/OPM/LBPM/wiki/Simulating-Flow-in-Digital-Rock-Images)

It seems to me that I should write a script to read each of the case files sequentially. I have made some attempts, which have not been successful.

JamesEMcClure commented 1 year ago

"The parameters for the model are specified by adding a section to th input database, labeled as MRT"

What this comment means is that you can run multiple physical models on the same input image using the same input file. This MRT section refers to the single-phase multi-relaxation time (MRT) model used for permeability measurement:

https://lbpm-sim.org/userGuide/models/mrt/mrt.html

EnochMayor commented 1 year ago

Thank you Professor, I understand. That leads me to this question:

In the example https://lbpm-sim.org/userGuide/models/mrt/mrt.html

MRT {
   tau = 1.0
   F = 0.0, 0.0, 1.0e-5
   timestepMax = 2000
   tolerance = 0.01
}
Domain {
   Filename = "Bentheimer_LB_sim_intermediate_oil_wet_Sw_0p37.raw"
   ReadType = "8bit"              // data type
   N = 900, 900, 1600             // size of original image
   nproc = 2, 2, 2                // process grid
   n = 200, 200, 200              // sub-domain size
   offset = 300, 300, 300         // offset to read sub-domain
   voxel_length = 1.66            // voxel length (in microns)
   ReadValues = 0, 1, 2           // labels within the original image
   WriteValues = 0, 1, 2          // associated labels to be used by LBPM
   InletLayers = 0, 0, 10         // specify 10 layers along the z-inlet
   BC = 0                         // boundary condition type (0 for periodic)
}
Visualization {
}

Here, the MRT model was used to process one file.

Domain decomposition outputs several subdomains (ID.00000 .....). Should I take each of these as input into the MRT model?

I tried creating multiple batch scripts to read each file, but that does not seem to help. Secondly, is domain decomposition an essential part of the workflow for working with XRCT images?

JamesEMcClure commented 1 year ago

The permeability simulator can take the same image as input. The MRT simulator will simply ignore the fluid labels, since there is only one fluid in the porespace.

LBPM no longer requires the ID.xxxxx files (although it can still read them). All of the simulators will now internally perform the domain decomposition if a single image is specified, e.g.

Filename = "Bentheimer_LB_sim_intermediate_oil_wet_Sw_0p37.raw"
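
For a single-image input like this, two things are worth checking before submitting a job: the MPI rank count passed to the launcher, and whether the requested sub-domains fit inside the image. The sketch below reads `N`, `nproc`, `n`, and `offset` from a `Domain` section like the one quoted earlier in the thread. It assumes that the rank count should equal the product of the `nproc` entries (the thread describes `nproc` as the process grid) and that the region read spans `offset + nproc * n` voxels along each axis. Both are assumptions made for illustration, and the regex parser is a stand-in, not LBPM's own input reader.

```python
import math
import re


def read_int_triplet(text: str, key: str) -> tuple:
    """Read a line such as 'nproc = 2, 2, 2  // comment' and return the integers."""
    # Capture everything after '=' up to a '//' comment or the end of the line.
    pattern = rf"^\s*{re.escape(key)}\s*=\s*([^/\n]+)"
    match = re.search(pattern, text, flags=re.MULTILINE)
    if match is None:
        raise KeyError(key)
    return tuple(int(value) for value in match.group(1).split(","))


def check_domain(text: str):
    """Return (expected MPI ranks, whether the sub-domains fit in the image)."""
    image = read_int_triplet(text, "N")
    nproc = read_int_triplet(text, "nproc")
    sub = read_int_triplet(text, "n")
    offset = read_int_triplet(text, "offset")
    ranks = math.prod(nproc)
    fits = all(o + p * s <= total for o, p, s, total in zip(offset, nproc, sub, image))
    return ranks, fits


# Values copied from the Domain section quoted in this thread.
EXAMPLE = """
Domain {
   N = 900, 900, 1600       // size of original image
   nproc = 2, 2, 2          // process grid
   n = 200, 200, 200        // sub-domain size
   offset = 300, 300, 300   // offset to read sub-domain
}
"""

if __name__ == "__main__":
    ranks, fits = check_domain(EXAMPLE)
    print(f"launch with {ranks} MPI ranks; sub-domains fit in image: {fits}")
```

Under these assumptions the example input calls for 8 ranks, so the task count requested in the batch script would need to match that number.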
EnochMayor commented 1 year ago

Thanks Prof this simplifies things ...

I am trying to add my run script, Slurm file, and input.db file here; in the meantime I have attached the files to an email I sent.

Primary job  terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 50 with PID 243879 on node nc20303 exited on signal 9 (Killed).
--------------------------------------------------------------------------
slurmstepd: error: Detected 12 oom-kill event(s) in StepId=19905503.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.
EnochMayor commented 1 year ago

This is the script file I am adapting to run the MRT model. The slurm file is the output. I will try to post it on the github repository

This is the output from the slurm file error I received:

Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 50 with PID 243879 on node nc20303 exited on signal 9 (Killed).
--------------------------------------------------------------------------
slurmstepd: error: Detected 12 oom-kill event(s) in StepId=19905503.batch. Some of your processes may have been killed by the cgroup out-of-memory handler.
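
The `oom-kill` message means the job exceeded the memory that Slurm allotted to it, so a rough per-rank estimate can help when choosing a memory request. The sketch below is back-of-envelope arithmetic only. It assumes 19 double-precision distribution values per lattice site (a D3Q19 lattice), a one-voxel halo around each sub-domain, and a single copy of the distributions. These are assumptions for illustration rather than LBPM's documented memory layout, and the real footprint will differ because other fields, communication buffers, and the image itself also take memory.

```python
def estimate_bytes_per_rank(n, q=19, bytes_per_value=8, halo=1):
    """Rough size of the distribution storage for one sub-domain.

    n               -- sub-domain size (nx, ny, nz), as in the Domain section
    q               -- distribution values per site (assumed: 19, i.e. D3Q19)
    bytes_per_value -- 8 for double precision (assumed)
    halo            -- ghost-layer thickness on each face (assumed: 1 voxel)
    """
    nx, ny, nz = (side + 2 * halo for side in n)
    return nx * ny * nz * q * bytes_per_value


if __name__ == "__main__":
    # Sub-domain size taken from the example input quoted in this thread.
    per_rank = estimate_bytes_per_rank((200, 200, 200))
    print(f"about {per_rank / 2**30:.2f} GiB per rank for the distributions alone")
```

Multiplying the per-rank figure by the number of ranks placed on one node gives a number to compare against the memory requested for that node in the batch script.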
