ACCESS-NRI / ACCESS-OM2

ACCESS-OM2: ACCESS Ocean-Sea Ice Model
Apache License 2.0

Develop Spack build infrastructure #6

Closed aidanheerdegen closed 1 year ago

aidanheerdegen commented 1 year ago

Develop spack based build infrastructure.

aidanheerdegen commented 1 year ago

Edited: 16-09-22

~The goal is to reproduce the ACCESS-OM2 current build answers, so this will require https://github.com/ACCESS-NRI/reproducibility/issues/3 to achieve~

Scratch above. Should use the existing ACCESS-OM2 reproducibility test, which is run weekly on Jenkins

https://accessdev.nci.org.au/jenkins/job/ACCESS-OM2/job/reproducibility/

It clones https://github.com/COSIMA/access-om2.git and runs:

module use /g/data/hh5/public/modules && module load conda/analysis3-unstable && python -m pytest -s test/test_bit_reproducibility.py

The test is in the repository

https://github.com/COSIMA/access-om2/blob/master/test/test_bit_reproducibility.py

Specifically, this is the part that opens the existing log file, pulls out the checksums, and compares them to the checksums just produced:

https://github.com/COSIMA/access-om2/blob/43568e56f4a043075f5f07efaeefbca9a444406f/test/test_bit_reproducibility.py#L89-L91

This is the truth output

https://github.com/COSIMA/access-om2/blob/master/test/checksums/1deg_jra55_iaf-access-om2.out
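For readers without the repository handy, the comparison amounts to something like the sketch below. This is not the actual test code: the `[chksum]` line format and the helper names are assumptions based on the description above; the real parsing lives in `test/test_bit_reproducibility.py`.

```python
import re

# Hypothetical sketch of the checksum comparison. Assumes checksum lines of
# the form "[chksum] <field name> <integer>" in the model's stdout log.
CHKSUM_RE = re.compile(r"^\[chksum\]\s+(.+?)\s+(-?\d+)\s*$", re.MULTILINE)

def extract_checksums(log_text):
    """Pull {field: checksum} pairs out of a model run log."""
    return {name.strip(): int(val) for name, val in CHKSUM_RE.findall(log_text)}

def checksums_match(truth_log, new_log):
    """Compare the stored 'truth' checksums against a fresh run's output."""
    return extract_checksums(truth_log) == extract_checksums(new_log)
```

The stored truth file (`1deg_jra55_iaf-access-om2.out` above) would play the role of `truth_log`.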

harshula commented 1 year ago

Reminder: https://cmake.org/cmake/help/latest/module/CPack.html

harshula commented 1 year ago

Reminder: https://github.com/fortran-lang/fpm

harshula commented 1 year ago

Discussion: if we do NOT use shared libraries, what's the practical difference between using Spack vs CMake's FetchContent?

aidanheerdegen commented 1 year ago

We need to support methods to easily locate and use ACCESS-NRI developed software, including models.

https://github.com/ACCESS-NRI/model_builder/issues/3

Spack has sophisticated support for modulefiles, and has its own environment system.

harshula commented 1 year ago

Notes

https://spack.readthedocs.io/en/latest/repositories.html

Spack comes with thousands of built-in package recipes in var/spack/repos/builtin/. This is a package repository – a directory that Spack searches when it needs to find a package by name. You may need to maintain packages for restricted, proprietary or experimental software separately from the built-in repository. Spack allows you to configure local repositories using either the repos.yaml or the spack repo command.
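A minimal `repos.yaml` sketch along the lines of what the quoted docs describe; the custom repository path here is illustrative, not an actual ACCESS-NRI location:

```yaml
# ~/.spack/repos.yaml (the custom repo path is illustrative)
repos:
- /g/data/some_project/spack_packages   # searched before the builtin repo
- $spack/var/spack/repos/builtin
```

Equivalently, `spack repo add /g/data/some_project/spack_packages` registers the repository from the command line.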

harshula commented 1 year ago

Notes

https://spack.readthedocs.io/en/latest/packaging_guide.html

While the built-in build systems should meet your needs for the vast majority of packages, some packages provide custom build scripts. This guide is intended for the following use cases:

  • Packaging software with its own custom build system
  • Adding support for new build systems
harshula commented 1 year ago

Notes

https://spack.readthedocs.io/en/latest/build_settings.html#package-requirements

You can provide a more-relaxed constraint and allow the concretizer to choose between a set of options using any_of or one_of:

  • any_of is a list of specs. One of those specs must be satisfied and it is also allowed for the concretized spec to match more than one. In the above example, that means you could build openmpi+cuda%gcc, openmpi~cuda%clang or openmpi~cuda%gcc (in the last case, note that both specs in the any_of for openmpi are satisfied).
  • one_of is also a list of specs, and the final concretized spec must match exactly one of them. In the above example, that means you could build mpich+cuda or mpich+rocm but not mpich+cuda+rocm (note the current package definition for mpich already includes a conflict, so this is redundant but still demonstrates the concept).
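The "above example" the quote refers to is not reproduced here; a packages.yaml fragment consistent with the behaviour described (reconstructed from the quoted text, so treat the exact specs as illustrative) would look roughly like:

```yaml
packages:
  openmpi:
    require:
    - any_of: ["~cuda", "%gcc"]
  mpich:
    require:
    - one_of: ["+cuda", "+rocm"]
```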
harshula commented 1 year ago

Notes

https://spack.readthedocs.io/en/latest/build_settings.html#package-preferences

packages:
  opencv:
    compiler: [gcc@4.9]
    variants: +debug
  gperftools:
    version: [2.2, 2.4, 2.3]
  all:
    compiler: [gcc@4.4.7, 'gcc@4.6:', intel, clang, pgi]
    target: [sandybridge]
    providers:
      mpi: [mvapich2, mpich, openmpi]
harshula commented 1 year ago

Notes

https://discourse.cmake.org/t/how-to-generate-pc-pkg-config-file-supporting-prefix-of-the-cmake-install/4109

The aforementioned forum thread might be misleading.

I followed the two-step process (first build, then install):

configure_file(libaccessom2.pc.in lib/pkgconfig/libaccessom2.pc @ONLY)
install(FILES ${CMAKE_BINARY_DIR}/lib/pkgconfig/libaccessom2.pc DESTINATION lib/pkgconfig)
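For context, the libaccessom2.pc.in template consumed by configure_file() would look roughly like the fragment below. This is an illustrative sketch, not the actual file: the Description and library name are assumptions, and the @-delimited variables are substituted by CMake because of @ONLY.

```
prefix=@CMAKE_INSTALL_PREFIX@
libdir=${prefix}/lib
includedir=${prefix}/include

Name: libaccessom2
Description: ACCESS-OM2 coupler infrastructure library (illustrative)
Version: @PROJECT_VERSION@
Libs: -L${libdir} -laccessom2
Cflags: -I${includedir}
```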
harshula commented 1 year ago

Notes

https://cmake.org/cmake/help/latest/command/add_library.html#object-libraries

harshula commented 1 year ago

Notes

If we want to statically link Parallelio, then Spack's Parallelio package file needs: define("BUILD_SHARED_LIBS", False)

harshula commented 1 year ago

Notes

How to create a variant:

diff --git a/var/spack/repos/builtin/packages/parallelio/package.py b/var/spack/repos/builtin/packages/parallelio/package.py
index 91c2e340d5..b01005b39d 100644
--- a/var/spack/repos/builtin/packages/parallelio/package.py
+++ b/var/spack/repos/builtin/packages/parallelio/package.py
@@ -29,6 +29,7 @@ class Parallelio(CMakePackage):
     variant(
         "fortran", default=True, description="enable fortran interface (requires netcdf fortran)"
     )
+    variant("shared", default=True, description="enable shared library")

     depends_on("mpi")
     depends_on("netcdf-c +mpi", type="link")

@@ -51,7 +52,6 @@ def cmake_args(self):
             define("NetCDF_C_PATH", spec["netcdf-c"].prefix),
             define("USER_CMAKE_MODULE_PATH", join_path(src, "cmake")),
             define("GENF90_PATH", join_path(src, "genf90")),
-            define("BUILD_SHARED_LIBS", True),
             define("PIO_ENABLE_EXAMPLES", False),
         ]
         if spec.satisfies("+pnetcdf"):
@@ -72,6 +72,7 @@ def cmake_args(self):
                 define_from_variant("PIO_ENABLE_TIMING", "timing"),
                 define_from_variant("PIO_ENABLE_LOGGING", "logging"),
                 define_from_variant("PIO_ENABLE_FORTRAN", "fortran"),
+                define_from_variant("BUILD_SHARED_LIBS", "shared"),
             ]
         )
         return args

How to depend on a variant: depends_on("parallelio~pnetcdf~timing~shared")

harshula commented 1 year ago

Notes

lib/spack/spack/spec.py:

def _libs_default_handler(descriptor, spec, cls):
    """Default handler when looking for the 'libs' attribute.

    Tries to search for ``lib{spec.name}`` recursively starting from
    ``spec.package.home``. If ``spec.name`` starts with ``lib``, searches for
    ``{spec.name}`` instead.
...
    # Variable 'name' is passed to function 'find_libraries', which supports
    # glob characters. For example, we have a package with a name 'abc-abc'.
    # Now, we don't know if the original name of the package is 'abc_abc'
    # (and it generates a library 'libabc_abc.so') or 'abc-abc' (and it
    # generates a library 'libabc-abc.so'). So, we tell the function
    # 'find_libraries' to give us anything that matches 'libabc?abc' and it
    # gives us either 'libabc-abc.so' or 'libabc_abc.so' (or an error)
    # depending on which one exists (there is a possibility, of course, to
    # get something like 'libabcXabc.so, but for now we consider this
    # unlikely).    
aidanheerdegen commented 1 year ago

If we want to statically link Parallelio

I realised I was incorrect when I stated (IRL) that libraries were statically linked in ACCESS-OM2. The libraries that were compiled were statically linked, but other dependencies like netcdf and OpenMPI are dynamically linked (see below for details).

Now we're building most dependencies with spack, so it isn't clear if they should be statically or dynamically linked.

We could do it on a case-by-case basis, e.g. it is desirable to have OpenMPI dynamically linked, as it allows for, say, a container built on a GitHub runner with the correct base OS to also run on gadi and swap in the gadi OpenMPI library by altering the appropriate environment variables for the container.

Probably better is to have the same policy as before, to statically link those libraries for which we have source code responsibility, e.g. mom5, cice5, yatm and libaccessom2, and dynamically link the rest. This represents a change for PIO I believe, but I could be wrong. This is fine I think, to make it consistent with how the underlying libraries (netcdf-c, netcdf-fortran) are also linked.

Edit: inserted blank line to make code snippet render correctly.

```
$ ldd /g/data/ik11/inputs/access-om2/bin/fms_ACCESS-OM_730f0bf_libaccessom2_d750b4b.x
linux-vdso.so.1 (0x00007ffe68f26000)
libnetcdf.so.18 => /apps/netcdf/4.7.4/lib/libnetcdf.so.18 (0x00001553349b4000)
libnetcdff.so.7 => /apps/netcdf/4.7.4/lib/Intel/libnetcdff.so.7 (0x00001553343ec000)
libmpi_usempif08_Intel.so.40 => /apps/openmpi/4.0.2/lib/libmpi_usempif08_Intel.so.40 (0x0000155334166000)
libmpi_usempi_ignore_tkr_Intel.so.40 => /apps/openmpi/4.0.2/lib/libmpi_usempi_ignore_tkr_Intel.so.40 (0x0000155333f4d000)
libmpi_mpifh_Intel.so.40 => /apps/openmpi/4.0.2/lib/libmpi_mpifh_Intel.so.40 (0x0000155333cb5000)
libmpi.so.40 => /apps/openmpi/4.0.2/lib/libmpi.so.40 (0x0000155333981000)
libm.so.6 => /lib64/libm.so.6 (0x00001553335ff000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00001553333df000)
libc.so.6 => /lib64/libc.so.6 (0x000015533301a000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000155332e02000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000155332bfe000)
libmfhdf.so.0 => /apps/hdf4/4.2.14/lib/libmfhdf.so.0 (0x00001553329d3000)
libdf.so.0 => /apps/hdf4/4.2.14/lib/libdf.so.0 (0x0000155332721000)
libsz.so.2 => /apps/szip/2.1.1/lib/libsz.so.2 (0x000015533250d000)
libtirpc.so.3 => /lib64/libtirpc.so.3 (0x00001553322da000)
libjpeg.so.62 => /lib64/libjpeg.so.62 (0x0000155332071000)
libhdf5_hl.so.100 => /apps/hdf5/1.10.5/lib/libhdf5_hl.so.100 (0x0000155331e4e000)
libhdf5.so.103 => /apps/hdf5/1.10.5/lib/libhdf5.so.103 (0x0000155331880000)
libz.so.1 => /lib64/libz.so.1 (0x0000155331668000)
libcurl.so.4 => /lib64/libcurl.so.4 (0x00001553313da000)
libifport.so.5 => /apps/intel-ct/2019.3.199/compiler/lib/libifport.so.5 (0x00001553311ac000)
libifcoremt.so.5 => /apps/intel-ct/2019.3.199/compiler/lib/libifcoremt.so.5 (0x0000155330e17000)
libimf.so => /apps/intel-ct/2019.3.199/compiler/lib/libimf.so (0x0000155330877000)
libsvml.so => /apps/intel-ct/2019.3.199/compiler/lib/libsvml.so (0x000015532eed3000)
libintlc.so.5 => /apps/intel-ct/2019.3.199/compiler/lib/libintlc.so.5 (0x000015532ec61000)
libopen-rte.so.40 => /apps/openmpi-mofed5.5-pbs2021.1/4.0.2/lib/libopen-rte.so.40 (0x000015532e9a2000)
libopen-pal.so.40 => /apps/openmpi-mofed5.5-pbs2021.1/4.0.2/lib/libopen-pal.so.40 (0x000015532e6cd000)
librt.so.1 => /lib64/librt.so.1 (0x000015532e4c5000)
libutil.so.1 => /lib64/libutil.so.1 (0x000015532e2c1000)
libhwloc.so.15 => /lib64/libhwloc.so.15 (0x000015532e071000)
/lib64/ld-linux-x86-64.so.2 (0x0000155334d37000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x000015532de1c000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x000015532db32000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x000015532d91b000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x000015532d717000)
libnghttp2.so.14 => /lib64/libnghttp2.so.14 (0x000015532d4f0000)
libidn2.so.0 => /lib64/libidn2.so.0 (0x000015532d2d2000)
libssh.so.4 => /lib64/libssh.so.4 (0x000015532d063000)
libpsl.so.5 => /lib64/libpsl.so.5 (0x000015532ce52000)
libssl.so.1.1 => /lib64/libssl.so.1.1 (0x000015532cbbe000)
libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x000015532c6d5000)
libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x000015532c486000)
liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x000015532c276000)
libbrotlidec.so.1 => /lib64/libbrotlidec.so.1 (0x000015532c069000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x000015532be58000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x000015532bc54000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x000015532ba3d000)
libunistring.so.2 => /lib64/libunistring.so.2 (0x000015532b6bc000)
libsasl2.so.3 => /lib64/libsasl2.so.3 (0x000015532b49e000)
libbrotlicommon.so.1 => /lib64/libbrotlicommon.so.1 (0x000015532b27d000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x000015532b053000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x000015532ae2a000)
libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x000015532aba6000)
```

```
$ ldd /g/data/ik11/inputs/access-om2/bin/cice_auscom_360x300_24p_edcfa6f_libaccessom2_d750b4b.exe
linux-vdso.so.1 (0x00007ffd68bab000)
libmpi.so.40 => /apps/openmpi/4.0.2/lib/libmpi.so.40 (0x0000146f99c3f000)
libnetcdf_ompi3.so.18 => /apps/netcdf/4.7.4p/lib/libnetcdf_ompi3.so.18 (0x0000146f998af000)
libnetcdff_ompi3_Intel.so.7 => /apps/netcdf/4.7.4p/lib/libnetcdff_ompi3_Intel.so.7 (0x0000146f992dc000)
libmpi_usempif08_Intel.so.40 => /apps/openmpi/4.0.2/lib/libmpi_usempif08_Intel.so.40 (0x0000146f99056000)
libmpi_usempi_ignore_tkr_Intel.so.40 => /apps/openmpi/4.0.2/lib/libmpi_usempi_ignore_tkr_Intel.so.40 (0x0000146f98e3d000)
libmpi_mpifh_Intel.so.40 => /apps/openmpi/4.0.2/lib/libmpi_mpifh_Intel.so.40 (0x0000146f98ba5000)
libm.so.6 => /lib64/libm.so.6 (0x0000146f98823000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000146f98603000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000146f983ff000)
libc.so.6 => /lib64/libc.so.6 (0x0000146f9803a000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000146f97e22000)
libopen-rte.so.40 => /apps/openmpi-mofed5.5-pbs2021.1/4.0.2/lib/libopen-rte.so.40 (0x0000146f97b63000)
libopen-pal.so.40 => /apps/openmpi-mofed5.5-pbs2021.1/4.0.2/lib/libopen-pal.so.40 (0x0000146f9788e000)
librt.so.1 => /lib64/librt.so.1 (0x0000146f97686000)
libutil.so.1 => /lib64/libutil.so.1 (0x0000146f97482000)
libz.so.1 => /lib64/libz.so.1 (0x0000146f9726a000)
libhwloc.so.15 => /lib64/libhwloc.so.15 (0x0000146f9701a000)
libmfhdf.so.0 => /apps/hdf4/4.2.14/lib/libmfhdf.so.0 (0x0000146f96def000)
libdf.so.0 => /apps/hdf4/4.2.14/lib/libdf.so.0 (0x0000146f96b3d000)
libtirpc.so.3 => /lib64/libtirpc.so.3 (0x0000146f9690a000)
libjpeg.so.62 => /lib64/libjpeg.so.62 (0x0000146f966a1000)
libhdf5_hl_ompi3.so.100 => /apps/hdf5/1.10.5p/lib/libhdf5_hl_ompi3.so.100 (0x0000146f96479000)
libhdf5_ompi3.so.103 => /apps/hdf5/1.10.5p/lib/libhdf5_ompi3.so.103 (0x0000146f95e67000)
libsz.so.2 => /apps/szip/2.1.1/lib/libsz.so.2 (0x0000146f95c53000)
libcurl.so.4 => /lib64/libcurl.so.4 (0x0000146f959c5000)
libifport.so.5 => /apps/intel-ct/2019.3.199/compiler/lib/libifport.so.5 (0x0000146f95797000)
libifcoremt.so.5 => /apps/intel-ct/2019.3.199/compiler/lib/libifcoremt.so.5 (0x0000146f95402000)
libimf.so => /apps/intel-ct/2019.3.199/compiler/lib/libimf.so (0x0000146f94e62000)
libsvml.so => /apps/intel-ct/2019.3.199/compiler/lib/libsvml.so (0x0000146f934be000)
libintlc.so.5 => /apps/intel-ct/2019.3.199/compiler/lib/libintlc.so.5 (0x0000146f9324c000)
/lib64/ld-linux-x86-64.so.2 (0x0000146f99f73000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x0000146f92ff7000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x0000146f92d0d000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x0000146f92af6000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x0000146f928f2000)
libnghttp2.so.14 => /lib64/libnghttp2.so.14 (0x0000146f926cb000)
libidn2.so.0 => /lib64/libidn2.so.0 (0x0000146f924ad000)
libssh.so.4 => /lib64/libssh.so.4 (0x0000146f9223e000)
libpsl.so.5 => /lib64/libpsl.so.5 (0x0000146f9202d000)
libssl.so.1.1 => /lib64/libssl.so.1.1 (0x0000146f91d99000)
libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x0000146f918b0000)
libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x0000146f91661000)
liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x0000146f91451000)
libbrotlidec.so.1 => /lib64/libbrotlidec.so.1 (0x0000146f91244000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x0000146f91033000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x0000146f90e2f000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x0000146f90c18000)
libunistring.so.2 => /lib64/libunistring.so.2 (0x0000146f90897000)
libsasl2.so.3 => /lib64/libsasl2.so.3 (0x0000146f90679000)
libbrotlicommon.so.1 => /lib64/libbrotlicommon.so.1 (0x0000146f90458000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x0000146f9022e000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x0000146f90005000)
libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x0000146f8fd81000)
```

```
$ ldd /g/data/ik11/inputs/access-om2/bin/yatm_d750b4b.exe
linux-vdso.so.1 (0x00007ffce930e000)
libnetcdff.so.7 => /apps/netcdf/4.7.4/lib/Intel/libnetcdff.so.7 (0x000014e642f86000)
libmpi_usempif08_Intel.so.40 => /apps/openmpi/4.0.2/lib/libmpi_usempif08_Intel.so.40 (0x000014e642d00000)
libmpi_usempi_ignore_tkr_Intel.so.40 => /apps/openmpi/4.0.2/lib/libmpi_usempi_ignore_tkr_Intel.so.40 (0x000014e642ae7000)
libmpi_mpifh_Intel.so.40 => /apps/openmpi/4.0.2/lib/libmpi_mpifh_Intel.so.40 (0x000014e64284f000)
libmpi.so.40 => /apps/openmpi/4.0.2/lib/libmpi.so.40 (0x000014e64251b000)
libm.so.6 => /lib64/libm.so.6 (0x000014e642199000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x000014e641f79000)
libdl.so.2 => /lib64/libdl.so.2 (0x000014e641d75000)
libc.so.6 => /lib64/libc.so.6 (0x000014e6419b0000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x000014e641798000)
libmfhdf.so.0 => /apps/hdf4/4.2.14/lib/libmfhdf.so.0 (0x000014e64156d000)
libdf.so.0 => /apps/hdf4/4.2.14/lib/libdf.so.0 (0x000014e6412bb000)
libsz.so.2 => /apps/szip/2.1.1/lib/libsz.so.2 (0x000014e6410a7000)
libtirpc.so.3 => /lib64/libtirpc.so.3 (0x000014e640e74000)
libjpeg.so.62 => /lib64/libjpeg.so.62 (0x000014e640c0b000)
libhdf5_hl.so.100 => /apps/hdf5/1.10.5/lib/libhdf5_hl.so.100 (0x000014e6409e8000)
libhdf5.so.103 => /apps/hdf5/1.10.5/lib/libhdf5.so.103 (0x000014e64041a000)
libz.so.1 => /lib64/libz.so.1 (0x000014e640202000)
libcurl.so.4 => /lib64/libcurl.so.4 (0x000014e63ff74000)
libnetcdf.so.18 => /apps/netcdf/4.7.4/lib/libnetcdf.so.18 (0x000014e63fbf1000)
libifport.so.5 => /apps/intel-ct/2019.3.199/compiler/lib/libifport.so.5 (0x000014e63f9c3000)
libifcoremt.so.5 => /apps/intel-ct/2019.3.199/compiler/lib/libifcoremt.so.5 (0x000014e63f62e000)
libimf.so => /apps/intel-ct/2019.3.199/compiler/lib/libimf.so (0x000014e63f08e000)
libsvml.so => /apps/intel-ct/2019.3.199/compiler/lib/libsvml.so (0x000014e63d6ea000)
libintlc.so.5 => /apps/intel-ct/2019.3.199/compiler/lib/libintlc.so.5 (0x000014e63d478000)
libopen-rte.so.40 => /apps/openmpi-mofed5.5-pbs2021.1/4.0.2/lib/libopen-rte.so.40 (0x000014e63d1b9000)
libopen-pal.so.40 => /apps/openmpi-mofed5.5-pbs2021.1/4.0.2/lib/libopen-pal.so.40 (0x000014e63cee4000)
librt.so.1 => /lib64/librt.so.1 (0x000014e63ccdc000)
libutil.so.1 => /lib64/libutil.so.1 (0x000014e63cad8000)
libhwloc.so.15 => /lib64/libhwloc.so.15 (0x000014e63c888000)
/lib64/ld-linux-x86-64.so.2 (0x000014e64354e000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x000014e63c633000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x000014e63c349000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x000014e63c132000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x000014e63bf2e000)
libnghttp2.so.14 => /lib64/libnghttp2.so.14 (0x000014e63bd07000)
libidn2.so.0 => /lib64/libidn2.so.0 (0x000014e63bae9000)
libssh.so.4 => /lib64/libssh.so.4 (0x000014e63b87a000)
libpsl.so.5 => /lib64/libpsl.so.5 (0x000014e63b669000)
libssl.so.1.1 => /lib64/libssl.so.1.1 (0x000014e63b3d5000)
libcrypto.so.1.1 => /lib64/libcrypto.so.1.1 (0x000014e63aeec000)
libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x000014e63ac9d000)
liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x000014e63aa8d000)
libbrotlidec.so.1 => /lib64/libbrotlidec.so.1 (0x000014e63a880000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x000014e63a66f000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x000014e63a46b000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x000014e63a254000)
libunistring.so.2 => /lib64/libunistring.so.2 (0x000014e639ed3000)
libsasl2.so.3 => /lib64/libsasl2.so.3 (0x000014e639cb5000)
libbrotlicommon.so.1 => /lib64/libbrotlicommon.so.1 (0x000014e639a94000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x000014e63986a000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x000014e639641000)
libpcre2-8.so.0 => /lib64/libpcre2-8.so.0 (0x000014e6393bd000)
```
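The interesting detail in the listings above is which dependencies resolve outside the system directories (the /apps/... trees on gadi versus /lib64). A small helper along these lines (not from the thread, just an illustrative sketch) can summarise an ldd dump:

```python
import re

def nonsystem_libs(ldd_output, system_prefixes=("/lib64/",)):
    """Return {soname: path} for dynamically resolved libraries whose
    paths lie outside the given system prefixes (e.g. /apps/... on gadi).

    Lines without '=>' (linux-vdso, the dynamic loader) are skipped."""
    libs = {}
    for match in re.finditer(r"(\S+)\s+=>\s+(\S+)", ldd_output):
        soname, path = match.groups()
        if not path.startswith(system_prefixes):
            libs[soname] = path
    return libs
```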
harshula commented 1 year ago

Probably better is to have the same policy as before, to statically link those libraries for which we have source code responsibility, e.g. mom5, cice5, yatm and libaccessom2, and dynamically link the rest. This represents a change for PIO I believe, but I could be wrong. This is fine I think, to make it consistent with how the underlying libraries (netcdf-c, netcdf-fortran) are also linked.

Now that I know how to switch between the two types of dependencies in Spack, we can do either option at build time. However, the implications at runtime may need some attention.

harshula commented 1 year ago

Notes

When calling find_libraries(), don't include the ".a" or ".so" of the filename:

lib/spack/llnl/util/filesystem.py:

def find_libraries(libraries, root, shared=True, recursive=False, runtime=True):
...
    # List of libraries we are searching with suffixes
    libraries = ["{0}.{1}".format(lib, suffix) for lib in libraries for suffix in suffixes]

e.g.

    # https://spack-tutorial.readthedocs.io/en/ecp21/tutorial_advanced_packaging.html
    @property
    def libs(self):
        libraries = ["libmct", "libmpeu", "libpsmile.MPI1", "libscrip"]
        shared = False
        return find_libraries(
            libraries, root=self.prefix, shared=shared, recursive=True
        )
harshula commented 1 year ago

Notes

Software model for "compiler as dependencies" #31357

aidanheerdegen commented 1 year ago

One method to package and deploy ACCESS-OM is to make a simple bundle spack package

https://spack.readthedocs.io/en/latest/build_systems/bundlepackage.html

Then this repo could contain a spack.yaml file defining an environment with a specific version of the access-om package.

Alternatively the spack.yaml could just contain all the individual packages that make up access-om2. The advantage of making it a bundle is that it could be installed independently for development purposes.
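A minimal BundlePackage sketch of that idea is below. This is illustrative only: the class body, version, and dependency names are assumptions, not the actual ACCESS-NRI recipe (which is linked later in this thread).

```python
# packages/access-om2/package.py -- illustrative sketch of a Spack bundle
from spack.package import *

class AccessOm2(BundlePackage):
    """Meta-package that pins the components of an ACCESS-OM2 release."""

    homepage = "https://github.com/ACCESS-NRI/ACCESS-OM2"

    version("2023.03.001")  # hypothetical calendar version

    # A bundle has no sources of its own; it only aggregates dependencies.
    depends_on("libaccessom2", type="run")
    depends_on("mom5", type="run")
    depends_on("cice5", type="run")
```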

harshula commented 1 year ago

Notes

working_dir(dirname, **kwargs)

This is a Python Context Manager that makes it easier to work with subdirectories in builds. You use this with the Python with statement to change into a working directory, and when the with block is done, you change back to the original directory. Think of it as a safe pushd / popd combination, where popd is guaranteed to be called at the end, even if exceptions are thrown.
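The behaviour described can be mimicked in plain Python; this is an illustrative re-implementation of the idea, not Spack's own code (which lives in lib/spack/llnl/util/filesystem.py):

```python
import os
from contextlib import contextmanager

@contextmanager
def working_dir(dirname, create=False):
    """Safe pushd/popd: chdir into dirname for the duration of the 'with'
    block, restoring the original directory even if an exception is raised."""
    if create:
        os.makedirs(dirname, exist_ok=True)
    orig = os.getcwd()
    os.chdir(dirname)
    try:
        yield
    finally:
        os.chdir(orig)
```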

harshula commented 1 year ago

Notes

How to install a package and define dependency and compiler versions:

# spack install oasis3-mct ^openmpi@4.0.2  ^netcdf-c@4.7.4 ^netcdf-fortran@4.5.2 %intel@2019.5.281
harshula commented 1 year ago

Installing Perl package in Spack with an old Intel compiler (2019.5.281)

  1. You may need:
    
    --- a/var/spack/repos/builtin/packages/perl/package.py
    +++ b/var/spack/repos/builtin/packages/perl/package.py
    @@ -179,6 +179,10 @@ def patch(self):
         os.chmod("lib/perlbug.t", 0o644)
         filter_file("!/$B/", "! (/(?:$B|PATH)/)", "lib/perlbug.t")
aidanheerdegen commented 1 year ago

How do we want to refer to ACCESS-OM versions? Clearly there is a "major version" (ACCESS-OM2, ACCESS-OM3), but below that we have no standard for versioning.

Options:

  1. Semantic versioning

This is designed with a particular model of software development in mind. The first requirement is

Software using Semantic Versioning MUST declare a public API. This API could be declared in the code itself or exist strictly in documentation. However it is done, it SHOULD be precise and comprehensive.

I'm unsure how well this fits a complex scientific model such as this. The API in this case would be the configuration which is used to run the model. Regardless this is largely the default method of versioning.

  2. Calendar versioning

Calendar versioning has a venerable history. I personally think it is probably a better fit for a project like ACCESS-OM2, which lacks the cohesive nature of many traditional software projects. It has the advantage of encoding useful information in the version about how old it is. It is supported in Spack.

  3. Hash-based versioning

Probably best avoided. It doesn't provide much semantic meaning, and calendar versioning can easily be mapped on to specific commit hashes with spack or via tags in the repo.

micaeljtoliveira commented 1 year ago

Calendar sounds good!

I would use YYYY.MM.MICRO, so it looks like this:

ACCESS-OM3-2023.03.002

This is the version of ACCESS-OM3 initially released in March 2023, with micro version 2. Note the use of three digits for the micro version, so that it's not mistaken for the day.

You can also introduce a release tag for the Spack built stuff. For example:

ACCESS-OM3-2023.03.002-1

This is the same version of ACCESS-OM3 as before, but built with Spack, release 1. Why the Spack release version? So that you can signal that the way in which the package is built has changed, but not the sources. So ACCESS-OM3-2023.03.002-2 would correspond to exactly the same sources, but with some changes in how it was compiled.
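The proposed scheme sorts naturally if the tags are parsed into numeric fields. A small sketch (the tag format is the one proposed above, with an optional -N release suffix; the parsing itself is mine, not from the thread):

```python
import re

def parse_version(tag):
    """Parse 'YYYY.MM.MICRO' or 'YYYY.MM.MICRO-REL' into a sortable tuple.

    A tag without a release suffix sorts before release 1 of the same
    source version, and any newer source version sorts above all releases
    of an older one."""
    m = re.fullmatch(r"(\d{4})\.(\d{2})\.(\d{3})(?:-(\d+))?", tag)
    if m is None:
        raise ValueError(f"unrecognised version tag: {tag}")
    year, month, micro, rel = m.groups()
    return (int(year), int(month), int(micro), int(rel or 0))
```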

aidanheerdegen commented 1 year ago

I think I am in agreement with @micaeljtoliveira, and thanks for providing the detailed justification.

I can see the logic of versioning the spack release, but I don't know how much utility it has in practice. If we changed the way spack built the model the spack hash would change, signalling it was a different build, and so would be reinstalled.

My thought was that a model configuration would target a specific build of the model to ensure reproducibility. It would not automatically pick up a new build (with a new hash) unless it was deliberately changed to do so.

micaeljtoliveira commented 1 year ago

You can think of the -1 as a tag. Just like one uses tags to map to commit hashes, you can use this release version as a mapping to the executable hash. The reason to use it is exactly the same as you mentioned above: the hash doesn't provide much semantic meaning.

micaeljtoliveira commented 1 year ago

To complement the above, it's standard practice to have this sort of "tag" or "release" number in packages (e.g. rpms, debs, etc). Among other things, it makes it easy to determine the newest version of a given package, even when there are no changes from upstream.

aidanheerdegen commented 1 year ago

To complement the above, it's standard practice to have this sort of "tag" or "release" number in packages (e.g. rpms, debs, etc). Among other things, it makes it easy to determine the newest version of a given package, even when there are no changes from upstream.

Yes, I'm familiar with them from conda, and you've convinced me I think, but I have two reservations:

  1. 2023.03.002-1 is a helluva version number
  2. Not sure about the order of operations. I think we'd need to trigger a build or concretize step to see that it resolved to a different hash, then bump the release number, which itself would result in resolving to a different hash.

Regarding 2., we could always trigger a new build by bumping the release number, regardless of whether there were any other changes to the underlying build infrastructure.

micaeljtoliveira commented 1 year ago

@aidanheerdegen I'm afraid I'm not sure how this would work in practice. Or, to put it another way, where exactly would this release number be introduced? In rpms or debs (not sure about conda, as I'm not very familiar with it), those numbers are introduced in the corresponding spec files, which live in a place separate from the actual build tool.

So maybe one should only introduce it if it becomes obvious at some point that it's useful and there's an obvious place to put it.

Actually, how are you planning to "release" the spack built binaries?

harshula commented 1 year ago

Hi @aidanheerdegen , @micaeljtoliveira , @echus ,

https://github.com/ACCESS-NRI/libaccessom2/commit/aac70d70f43e5f1ac809e3ba222a2aa2308c5175:

project(yatm VERSION 2.0.202212 LANGUAGES Fortran)

https://github.com/ACCESS-NRI/cice5/commit/01b1c202fa89c6da7dd9a1e6c48eab06b728e1d2:

set version='202301'

mom5 still has set new_hash=`git rev-parse HEAD` in exp/update_version.csh

harshula commented 1 year ago

Notes

https://spack-tutorial.readthedocs.io/en/latest/tutorial_spack_scripting.html

The spack python command gives you access to all of Spack’s internal APIs, allowing you to write more complex queries, for example.

aekiss commented 1 year ago

I think a consistent versioning system would be really valuable (for documentation in papers, amongst other things), and is long overdue. Up to this point I've made disorganised attempts via tags, and had a 2.0 in mind but never quite got around to tagging it.

I think I like the date approach. My only concern is that I've been using matching tags for the codebase https://github.com/COSIMA/access-om2/tags and all 6 supported configurations, e.g. https://github.com/COSIMA/01deg_jra55_iaf/tags, as these typically need to be kept in sync, e.g. if new namelist parameters or diagnostics are introduced, and using the same tag for all repos makes this obvious. How would that sort of synchronised tagging across repos work with dates, e.g. if they're updated on different days (e.g. a bugfix in one of the repos) but should be thought of as parts of the same release?

aekiss commented 1 year ago

If we increment all tags if any of the related repos is updated, then different tags would refer to the same commit in all the unchanged repos. I guess that's ok, but kinda confusing.

aekiss commented 1 year ago

Alternatively, maybe the config tags could be compound, and consist of the codebase tag with a suffix that applies just to the config? That would make the relationships clear and allow tags to increment more independently, but the tags (version numbers) could get quite large.

aidanheerdegen commented 1 year ago

Spack has support for sophisticated dependency specification. Specifically, for this case it is possible to use depends_on in an access-om package to require specific versions of dependencies for the version of access-om being built.

As an example, from the acts package file:

    depends_on("autodiff @0.5.11:0.5.99", when="@1.2:16 +autodiff")
    depends_on("autodiff @0.6:", when="@17: +autodiff")

which says for all versions of this package from 1.2 to 16.* use autodiff versions 0.5.11 to 0.5.99. For versions 17 and above use autodiff >= 0.6.

So it shouldn't be necessary to tag everything the same.

@micaeljtoliveira asked

Actually, how are you planning to "release" the spack built binaries?

Ay, and there's the rub. See above https://github.com/ACCESS-NRI/ACCESS-OM/issues/6#issuecomment-1272670089

The answer is: TBD.

harshula commented 1 year ago

ACCESS-OM2 bundle package: https://github.com/ACCESS-NRI/spack_packages/blob/development/packages/access-om2/package.py

harshula commented 1 year ago

Notes

Spack v0.20.1 built openmpi 4.0.2 appears to trip over https://github.com/spack/spack/issues/30906 when testing Spack v0.20.1 built access-om2. The workaround was to use a newer version of openmpi; e.g. 4.1.5 avoided the problem.

Using the system openmpi as an "external" from Spack appears to trip over the non-standard directory structure on gadi. e.g. /apps/openmpi/4.0.2/lib/Intel/libmpi_usempif08.so

aidanheerdegen commented 1 year ago

Using the system openmpi as an "external" from Spack appears to trip over the non-standard directory structure on gadi. e.g. /apps/openmpi/4.0.2/lib/Intel/libmpi_usempif08.so

What is the error message for this? It is valuable to have it explicitly documented in case there is a similar error again (likely).

harshula commented 1 year ago
     13    -- Check for working Fortran compiler: /apps/openmpi/4.0.2/bin/mpif90 - broken
  >> 14    CMake Error at $HOME/spack-upstream.git/opt/spack/linux-rocky8-x86_64/intel-2019.5.281/cmake-3.26.3-rsetek2vcrc7sm4kcoobqg25z75czm4u/share/cmake-3.26/Modules/CMakeTestFortranCompiler.cmake:59 (message):
     15      The Fortran compiler
     16    
     17        "/apps/openmpi/4.0.2/bin/mpif90"
     18    
     19      is not able to compile a simple test program.
     20    
     31        /apps/openmpi/4.0.2/bin/mpif90 CMakeFiles/cmTC_050e5.dir/testFortranCompiler.f.o -o cmTC_050e5
  >> 32        /bin/ld: cannot find -lmpi_usempif08
  >> 33        /bin/ld: cannot find -lmpi_usempi_ignore_tkr
  >> 34        /bin/ld: cannot find -lmpi_mpifh
  >> 35        gmake[1]: *** [CMakeFiles/cmTC_050e5.dir/build.make:99: cmTC_050e5] Error 1
harshula commented 1 year ago

Notes

$ /apps/openmpi/4.1.5/bin/ompi_info -a
...

Configure command line: '--prefix=/apps/openmpi-mofed5.8-pbs2021.1/4.1.5' '--disable-dependency-tracking' '--disable-heterogeneous' '--disable-ipv6' '--enable-orterun-prefix-by-default' '--enable-sparse-groups' '--enable-mpi-fortran' '--enable-mpi-cxx' '--enable-mpi1-compatibility' '--enable-shared' '--disable-static' '--disable-wrapper-rpath' '--disable-wrapper-runpath' '--disable-mpi-java' '--enable-mca-static' '--enable-hwloc-pci' '--enable-visibility' '--with-zlib' '--with-cuda=/apps/cuda/12.0.0' '--without-pmi' '--with-ucx=/apps/ucx/1.14.0' '--without-verbs' '--without-verbs-usnic' '--without-portals4' '--without-ugni' '--without-usnic' '--without-ofi' '--without-cray-xpmem' '--with-xpmem' '--with-knem=/opt/knem' '--with-cma' '--without-x' '--without-memkind' '--without-cray-pmi' '--without-alps' '--without-flux-pmi' '--without-udreg' '--without-lsf' '--without-slurm' '--with-tm=/opt/pbs/default' '--without-sge' '--without-moab' '--without-singularity' '--without-fca' '--with-hcoll=/apps/hcoll/4.8.3220' '--with-ucc=/apps/ucc/1.1.0' '--without-ime' '--without-pvfs2' '--with-lustre' '--with-io-romio-flags=--with-file-system=lustre+ufs' '--without-psm' '--without-psm2' '--without-mxm' '--disable-mem-debug' '--disable-mem-profile' '--disable-picky' '--disable-debug' '--disable-timing' '--disable-event-debug' '--disable-memchecker' '--disable-pmix-timing' '--with-mpi-param-check=runtime' '--with-oshmem-param-check=never' '--without-valgrind'
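The single-quoted "Configure command line" string that ompi_info prints can be picked apart programmatically, e.g. to extract the enabled-feature flags for comparison with a Spack build. A quick sketch (the function name is made up; shlex handles the quoting):

```python
import shlex

def enabled_with_flags(configure_line):
    """Return the --with-* flags (features enabled with a path or value)
    from an ompi_info 'Configure command line' string."""
    return [f for f in shlex.split(configure_line)
            if f.startswith("--with-")]
```

Run against the 4.1.5 line above, this reports flags such as --with-ucx=/apps/ucx/1.14.0, --with-lustre and --with-tm=/opt/pbs/default, among others.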

harshula commented 1 year ago

Notes

@dsroberts recommended trying these openmpi configure/build flags to improve Spack openmpi support on gadi:

--with-ucx=/apps/ucx/1.14.0
--with-ucc=/apps/ucc/1.1.0
--with-lustre
--with-tm=/opt/pbs/default
--with-hcoll=/apps/hcoll/4.8.3220 (Not sure if required)
harshula commented 1 year ago

[Updated: 10/08/2023]

Notes

Spack's openmpi package has relevant variants:

    variant(
        "fabrics",
        values=disjoint_sets(
            ("auto",),
            (
                "psm",
                "psm2",
                "verbs",
                "mxm",
                "ucx",
                "ofi",
                "fca",
                "hcoll",
                "xpmem",
                "cma",
                "knem",
            ),  # shared memory transports
        ).with_non_feature_values("auto", "none"),
        description="List of fabrics that are enabled; " "'auto' lets openmpi determine",
    )
...
    variant("lustre", default=False, description="Lustre filesystem library support")
...
    variant(
        "schedulers",
        values=disjoint_sets(
            ("auto",), ("alps", "lsf", "tm", "slurm", "sge", "loadleveler")
        ).with_non_feature_values("auto", "none"),
        description="List of schedulers for which support is enabled; "
        "'auto' lets openmpi determine",
    )

For example, the following compiles on gadi:

$ spack install openmpi@4.1.5 fabrics=ucx
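The disjoint_sets constraint above means a variant value is either a non-feature value on its own ("auto" or "none") or a subset of real fabric names, never a mix. A toy model of that validation (not Spack's actual code):

```python
# Fabric names from the openmpi package's "fabrics" variant quoted above.
FABRICS = {"psm", "psm2", "verbs", "mxm", "ucx", "ofi", "fca",
           "hcoll", "xpmem", "cma", "knem"}
NON_FEATURE = {"auto", "none"}

def valid_fabrics_value(value):
    """Validate a comma-separated fabrics= setting, e.g. 'ucx,knem'."""
    chosen = set(value.split(","))
    # A single non-feature value ("auto" or "none") is fine on its own...
    if chosen <= NON_FEATURE:
        return len(chosen) == 1
    # ...otherwise every entry must be a recognised fabric name.
    return chosen <= FABRICS
```

This is why fabrics=ucx is accepted while something like fabrics=auto,ucx would be rejected by Spack's concretizer.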
harshula commented 1 year ago

Using the system openmpi as an "external" from Spack appears to trip over the non-standard directory structure on gadi. e.g. /apps/openmpi/4.0.2/lib/Intel/libmpi_usempif08.so

     13    -- Check for working Fortran compiler: /apps/openmpi/4.0.2/bin/mpif90 - broken
  >> 14    CMake Error at $HOME/spack-upstream.git/opt/spack/linux-rocky8-x86_64/intel-2019.5.281/cmake-3.26.3-rsetek2vcrc7sm4kcoobqg25z75czm4u/share/cmake-3.26/Modules/CMakeTestFortranCompiler.cmake:59 (message):
     15      The Fortran compiler
     16    
     17        "/apps/openmpi/4.0.2/bin/mpif90"
     18    
     19      is not able to compile a simple test program.
     20    
     31        /apps/openmpi/4.0.2/bin/mpif90 CMakeFiles/cmTC_050e5.dir/testFortranCompiler.f.o -o cmTC_050e5
  >> 32        /bin/ld: cannot find -lmpi_usempif08
  >> 33        /bin/ld: cannot find -lmpi_usempi_ignore_tkr
  >> 34        /bin/ld: cannot find -lmpi_mpifh
  >> 35        gmake[1]: *** [CMakeFiles/cmTC_050e5.dir/build.make:99: cmTC_050e5] Error 1

Thanks @rxy900 for sending me your packages.yaml. I added the following lines to my packages.yaml to avoid the error:

+  gmake:
+    externals:
+    - spec: gmake@4.2.1
+      prefix: /usr
+  pkgconf:
+    externals:
+    - spec: pkgconf@1.4.2
+      prefix: /usr
+  diffutils:
+    externals:
+    - spec: diffutils@3.6
+      prefix: /usr
+    buildable: false
+  cmake:
+    externals:
+    - spec: cmake@3.24.2
+      prefix: /apps/cmake/3.24.2
+      modules:
+      - cmake/3.24.2
+    buildable: false
+  openmpi:
+    externals:
+    - spec: openmpi@4.0.2
+      prefix: /apps/openmpi/4.0.2
+      modules:
+      - openmpi/4.0.2
+    buildable: false
harshula commented 1 year ago

Using the system openmpi 4.1.5 as an "external" from Spack:

$ spack install libaccessom2 ^netcdf-c@4.7.4 ^netcdf-fortran@4.5.2 ^openmpi@4.1.5 %intel@2021.6.0

1 error found in build log:
     5     -- Detecting Fortran compiler ABI info - done
     6     -- Check for working Fortran compiler: $SPACKDIR/lib/spack/env/intel/ifort - skipped
     7     ---- PROJECT_VERSION: '2.0.202212'
     8     ---- FQDN: gadi-login-08.gadi.nci.org.au
     9     ---- NUMBER_OF_LOGICAL_CORES: 48
     10    -- Could NOT find MPI_Fortran (missing: MPI_Fortran_F77_HEADER_DIR MPI_Fortran_MODULE_DIR) (found version "3.1")
  >> 11    CMake Error at /apps/cmake/3.24.2/share/cmake-3.24/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
     12      Could NOT find MPI (missing: MPI_Fortran_FOUND) (found version "3.1")
     13    Call Stack (most recent call first):
     14      /apps/cmake/3.24.2/share/cmake-3.24/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
     15      /apps/cmake/3.24.2/share/cmake-3.24/Modules/FindMPI.cmake:1835 (find_package_handle_standard_args)
     16      CMakeLists.txt:34 (find_package)
     17    
aidanheerdegen commented 1 year ago

How have you defined the openmpi external? Does it include a module load?

I don't think the CMake is very robust here

https://github.com/COSIMA/libaccessom2/blob/a227a616fac7a7d4795d2ebcca750292b6004683/CMakeLists.txt#L19-L32

Worth sprinkling some pkg-config magic over it? Or is that just another rabbit hole?

harshula commented 1 year ago

I presume the above is due to the non-standard directory structure that is not understood by CMake. e.g. /apps/openmpi/4.1.5/lib/Intel/

Module loading is enabled in etc/spack/defaults/packages.yaml:

    - spec: openmpi@4.1.5
      prefix: /apps/openmpi/4.1.5
      modules:
      - openmpi/4.1.5

LD_LIBRARY_PATH=/apps/openmpi/4.1.5/lib:/apps/openmpi/4.1.5/lib/profilers:/apps/intel-ct/2021.6.0/compiler/linux/compiler/lib/intel64_lin:/apps/intel-ct/2021.6.0/compiler/linux/lib/x64:/apps/intel-ct/2021.6.0/compiler/linux/lib; export LD_LIBRARY_PATH

rxy900 commented 1 year ago

Maybe you could try to set these variables:

export CC=mpicc
export FC=mpif90
export CXX=mpicxx


harshula commented 1 year ago

Hi @rxy900 ,

Same error:

$ export CC=mpicc
$ export FC=mpif90
$ export CXX=mpicxx
$ spack install libaccessom2 ^netcdf-c@4.7.4 ^netcdf-fortran@4.5.2 ^openmpi@4.1.5 %intel@2021.6.0

1 error found in build log:
     5     -- Detecting Fortran compiler ABI info - done
     6     -- Check for working Fortran compiler: $SPACKDIR/lib/spack/env/intel/ifort - skipped
     7     ---- PROJECT_VERSION: '2.0.202212'
     8     ---- FQDN: gadi-login-08.gadi.nci.org.au
     9     ---- NUMBER_OF_LOGICAL_CORES: 48
     10    -- Could NOT find MPI_Fortran (missing: MPI_Fortran_F77_HEADER_DIR MPI_Fortran_MODULE_DIR) (found version "3.1")
  >> 11    CMake Error at /apps/cmake/3.24.2/share/cmake-3.24/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
     12      Could NOT find MPI (missing: MPI_Fortran_FOUND) (found version "3.1")
     13    Call Stack (most recent call first):
     14      /apps/cmake/3.24.2/share/cmake-3.24/Modules/FindPackageHandleStandardArgs.cmake:594 (_FPHSA_FAILURE_MESSAGE)
     15      /apps/cmake/3.24.2/share/cmake-3.24/Modules/FindMPI.cmake:1835 (find_package_handle_standard_args)
     16      CMakeLists.txt:34 (find_package)
     17    

The above environment variables are likely overridden by Spack:

spack-build-env-mods.txt:

export CC=$SPACKDIR/lib/spack/env/intel/icc;
export FC=$SPACKDIR/lib/spack/env/intel/ifort;
export CXX=$SPACKDIR/lib/spack/env/intel/icpc;

spack-build-env.txt:

CC=$SPACKDIR/lib/spack/env/intel/icc; export CC
FC=$SPACKDIR/lib/spack/env/intel/ifort; export FC
CXX=$SPACKDIR/lib/spack/env/intel/icpc; export CXX
harshula commented 1 year ago

@ScottWales mentioned this last Friday: https://git.nci.org.au/bom/ngm/spack-environments/-/blob/jopa/repos/bom-ngm/packages/nci-openmpi/package.py

harshula commented 1 year ago

[Updated: 28/07/2023]

Notes

Convert the RPATH setting into a RUNPATH setting:

$ chrpath -c <binary>

Spack built executables link against openmpi via RPATH:

$ readelf -d opt/spack/linux-rocky8-cascadelake/intel-2019.5.281/cice5-development-m7drkjh74tlgoviumzav3h5puwn6x3wn/bin/cice_auscom_360x300_24x1_24p.exe  | less
...
 0x000000000000000f (RPATH)              Library rpath: [[...]$GDATA/spack-microarchitectures.git/opt/spack/linux-rocky8-cascadelake/intel-2019.5.281/openmpi-4.1.5-ldghsy4dsty7ibxusz3jip6efd53pmfk/lib [...]]

That's the reason we need to use chrpath to convert the executable to use RUNPATH instead of RPATH when we build against a Spack-built openmpi but want to run against a different openmpi: the dynamic linker searches RPATH before LD_LIBRARY_PATH, whereas RUNPATH is searched after it, so only a RUNPATH entry can be overridden at run time.
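Checking which of the two tags a binary carries can be scripted. A rough sketch (stdlib only, parsing the readelf -d text shown above rather than the ELF file itself; the function name is made up):

```python
import re

def dynamic_search_paths(readelf_output):
    """Map 'RPATH'/'RUNPATH' to their library path lists, parsed from the
    text that `readelf -d <binary>` prints for the dynamic section."""
    paths = {}
    for tag in ("RPATH", "RUNPATH"):
        # readelf prints e.g.: (RPATH)  Library rpath: [/path/a:/path/b]
        m = re.search(r"\(%s\)\s+Library r(?:un)?path: \[(.*)\]" % tag,
                      readelf_output)
        if m:
            paths[tag] = m.group(1).split(":")
    return paths
```

After running chrpath -c, the same readelf output should show a (RUNPATH) entry where the (RPATH) entry used to be.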