Leeds-MONC / monc

MONC (Leeds fork)
BSD 3-Clause "New" or "Revised" License

Getting MONC working on ARC4 #11

Closed leifdenby closed 3 years ago

leifdenby commented 4 years ago

@MarkUoLeeds and Craig Poku have been working on getting MONC running on ARC4 🌟

Notes copied from email from Mark, issues identified:

  1. Source code error - compiler complaint

    a. Rogue "MPI_Comm" on the USE MPI line; remove it in 2 Fortran source files:
       monc/components/conditional_diagnostics_whole/src/conditional_diagnostics_whole.F90
       monc/components/pdf_analysis/src/pdf_analysis.F90
       This is specific to compilers other than GCC.

    b. MPI_ANY incorrect in MPI_SYNC in line 268 of io/src/mpicommunication.F90

    ! MRI change this MPI_ANY_SOURCE to status(MPI_SOURCE)
        call mpi_recv(data_buffer, message_size, MPI_BYTE, status(MPI_SOURCE), inter_io_communications(i)%message_tag, &
             io_communicator, MPI_STATUS_IGNORE, ierr)
  2. CONFIGURATION errors

    a) Several test case MCF files are erroneous. For example, the radiative convective equilibrium case RCE_MCS.mcf is missing the diagnostic_files= "path/filename" entry even though the diagnostics are turned on. straka

    b) The convection files have the wrong naming convention (diag_files instead of diagnostic_files), for example transition/constrain_res1000m.mcf

    c) The job submission script on ARC and Archer is not efficient or "sensible"; I can offer an improvement.

    d) ARC4 settings have yet to be finalised and added to the site list. The naming convention has been broken; the configs should be monc-arc4-gnu, monc-arc4-intel-openmpi, monc-arc4-intel-impi and monc-arc4-gnu-mvapich2 (although MVAPICH2 is not multithreading).

    e) The final ARC4 config, and where to build (e.g. the Travis thing), is not yet resolved.

    f) MeteoVM on ARC4 might be a solution - John Hodrien
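Items 1a and 2b above can be checked mechanically. The following is a rough sketch, not something from Mark's email: the function names are made up for illustration, and the grep patterns only assume what the notes state (an "MPI_Comm" on a USE MPI line, and a diag_files= key where diagnostic_files= is expected).

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not part of MONC.

# List .mcf files under directory $1 that set diag_files= rather than
# diagnostic_files= (item 2b).
find_misnamed_diag_key() {
    find "$1" -name '*.mcf' -exec \
        grep -lE '^[[:space:]]*diag_files[[:space:]]*=' {} +
}

# Print, with line numbers, any "use mpi" lines in the given Fortran
# files that also mention MPI_Comm (item 1a). Case-insensitive because
# Fortran is.
find_rogue_mpi_comm() {
    grep -niE 'use[[:space:]]+mpi.*mpi_comm' "$@"
}
```

For example, `find_misnamed_diag_key .` run from the root of a checkout lists the affected configuration files, and `find_rogue_mpi_comm` can be given the two source files named in 1a.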

cemac-ccs commented 4 years ago

In addition, errors exist which prevent building with newer gnu compiler versions (gcc >= 7), relating to pointer allocation. Some of these issues are solved on the trunk, with changes to model_core/src/components/registry.F90 and model_core/src/components/monc_component.F90, so should be ported over.

These issues do not appear to occur with the gnu/native on arc4, and so this problem is likely of low priority until portability of the code becomes a consideration.

leifdenby commented 4 years ago

> In addition, errors exist which prevent building with newer gnu compiler versions (gcc >= 7), relating to pointer allocation. Some of these issues are solved on the trunk, with changes to model_core/src/components/registry.F90 and model_core/src/components/monc_component.F90, so should be ported over.

Thanks @cemac-ccs! Are you currently compiling using fcm or using the makefile in the project root? I'm asking because I think we should write up some instructions on how to compile on ARC4 (as we're planning to do with ARCHER).

cemac-ccs commented 4 years ago

I'm using fcm, with slightly modified fcm-make config files. I think that is a good idea, and I have been keeping a few notes as I go that I can expand into a wiki entry along with adding new env-arc4.cfg etc files in a pull request a bit down the line once I've properly tested it.

leifdenby commented 4 years ago

> I'm using fcm, with slightly modified fcm-make config files. I think that is a good idea, and I have been keeping a few notes as I go that I can expand into a wiki entry along with adding new env-arc4.cfg etc files in a pull request a bit down the line once I've properly tested it.

Great! Could you do a draft pull-request already? I just made one with the changes I worked out on ARC4. I haven't tried running MONC yet though. Do you mind having a look at my pull-request and letting me know what you think? https://github.com/Leeds-MONC/monc/pull/19 I think we should use the changes you've made though, but just wanted to show you what I've done 😄

leifdenby commented 4 years ago

@cemac-ccs on https://github.com/Leeds-MONC/monc/issues/20 we're discussing how to get compilation (and the instructions for it) into a better state. It turns out that it should be possible to compile MONC using the makefile in the project root. Have you tried that? I am also wondering about having two different ways of compiling MONC; it would be great to have your thoughts on the issue.

leifdenby commented 4 years ago

@cemac-ccs did you have any luck with MONC on ARC4? It'd be great to get these compilation and run instructions added to the sourcecode :) And what are your thoughts on my question above?

leifdenby commented 4 years ago

@craigpoku would be great to have your thoughts on this :)

cemac-ccs commented 4 years ago

Apologies Leif. I had a project with a close deadline that needed a lot of attention. Is there a particular expected benefit of using the makefile over using fcm? I have tried running the makefile and found getting it to run initially requires no less modification and personalisation to the host machine than compilation using fcm, requiring the run sequence

module purge
module load user
module switch intel gnu
module switch openmpi mvapich2
module load fftw netcdf hdf5
NETCDF_DIR=$NETCDF_HOME
FFTW_DIR=$FFTW_HOME
HDF5_DIR=$HDF5_HOME
make

while the fcm method requires

module purge
module load user
module switch intel gnu
module switch openmpi mvapich2
module load fftw netcdf hdf5 fcm
fcm make -j4 -f fcm-make/monc-arc2-gnu.cfg
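One detail of the makefile sequence is worth checking. If the makefile picks up NETCDF_DIR, FFTW_DIR and HDF5_DIR from the environment (an assumption; the thread does not confirm how the makefile reads them), then a plain `VAR=value` line is visible only to the current shell and is not passed to `make` unless it is exported. A minimal sketch of the difference, using a made-up variable DEMO_DIR:

```shell
#!/bin/sh
# DEMO_DIR is a hypothetical name used only for this illustration.

# Print what a child process (such as make) sees for DEMO_DIR.
child_view() {
    sh -c 'echo "${DEMO_DIR-unset}"'
}

unset DEMO_DIR
DEMO_DIR=/opt/demo      # plain assignment: known to this shell only
before=$(child_view)

export DEMO_DIR         # exported: inherited by child processes
after=$(child_view)

echo "before export: $before"
echo "after export:  $after"
```

If that assumption holds, writing `export NETCDF_DIR=$NETCDF_HOME` (and likewise for the other two), or passing them on the `make` command line, would be the safer form.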

The makefile also seems to be flawed in a few ways -

As a result of this third problem, I abandoned compilation with the makefile so there may be other issues in compiling the test cases or the bootstrapper that I have not yet come across

fcm compilation has none of these issues and instead seems to be set up in such a way that it can compile successfully, and have modular changes made to the compilation environment, although these changes have to be called manually by using the correct config file.

With that in mind, I would suggest that the best way to compile on both Archer and on ARC is using fcm unless, as I say above, there is some benefit to using a makefile over using fcm of which I am unaware.

leifdenby commented 3 years ago

I've merged this in, so this is now resolved. Thanks @cemac-ccs :)