UCL-ARC / hpc-spack

Solutions - HPC's Spack config
MIT License

Initial software: Myriad packages #49

Open heatherkellyucl opened 11 months ago

heatherkellyucl commented 11 months ago

From #44

Samtools, BCFTools, HTSlib: https://github.com/UCL-RITS/rcps-buildscripts/issues/532

Samtools, Bwa, Bedtools, Vcftools, picard, GATK (requested by Biosciences) https://ucldata.atlassian.net/browse/AHS-139

Alphafold (would be nice to not have to tell people to go and track down the working containers): https://github.com/UCL-RITS/rcps-buildscripts/issues/529 https://github.com/UCL-RITS/rcps-buildscripts/issues/463

HDF5, NetCDF, BEAST 2.7: https://github.com/UCL-RITS/rcps-buildscripts/issues/498

heatherkellyucl commented 11 months ago

On Myriad, continuing from existing site with compilers only installed:

source spacksites/myriad-utilities/init-spacksites-on-myriad.sh
eval $(spacksites/spacksites spack-setup-env hk-initial-stack)

Looking at adding these to the environment

htslib@1.16 %gcc@12.2.0
samtools@1.16.1 %gcc@12.2.0
bcftools@1.16 %gcc@12.2.0

(want them all to be 1.16 or we get two htslibs, since samtools 1.16.1 wants htslib 1.16).

heatherkellyucl commented 11 months ago

Fun times, ran into https://github.com/spack/spack/issues/35197 with gzip as an htslib dependency.

It works if you get it to use gzip 1.13 instead, added with spack checksum --add-to-package gzip. That should now be the version in use, as every earlier version has a vulnerability and needs to be deprecated (to be submitted as a pull request).

heatherkellyucl commented 11 months ago

gzip is updated in develop branch (https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/gzip/package.py). Need to work out what exactly we do to make sure we include that version here when creating a new environment. Add to local specs?

giordano commented 11 months ago

Since gzip 1.12 and earlier are deprecated, the resolver should basically only consider gzip@1.13 (unless you explicitly use --deprecated). If you want to be super explicit, in your config you could have

packages:
  gzip:
    require: "@1.13:"

heatherkellyucl commented 11 months ago

It is only deprecated in the develop branch though - so we won't see that in 0.20 unless we do something to make sure we get that version of the package.py.

heatherkellyucl commented 11 months ago

(We'll get https://github.com/spack/spack/blob/releases/v0.20/var/spack/repos/builtin/packages/gzip/package.py)

giordano commented 11 months ago

So you wouldn't have gzip@1.13 either :disappointed:

heatherkellyucl commented 11 months ago

Yeah. We will need to solve this in the more general case for anything else where we need to alter the package.py for whatever reason. Which is why I was thinking of a local package location too as part of this repo, where new packages and altered packages go.

heatherkellyucl commented 11 months ago

repos.yaml: https://spack.readthedocs.io/en/latest/repositories.html

heatherkellyucl commented 11 months ago

It'll look through the repos in order, so we put ours first and it'll get gzip from there.

On updating the version of Spack we are pinned to, we would need to check whether we should remove packages from our own repo to return to using Spack's ones.
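
A minimal sketch of the two points above: first-match lookup across an ordered list of repos, and a check for local packages that shadow Spack's own (the ones to review when the pinned Spack version changes). This is only an illustration, not Spack's implementation. It assumes the `packages/<name>/package.py` layout seen in this repo, and the function names `first_repo_providing` and `shadowed_packages` are hypothetical.

```python
from pathlib import Path


def first_repo_providing(package, repo_roots):
    """Return the first repo root containing packages/<package>/package.py, or None.

    repo_roots is searched in order, so an earlier repo wins over a later one.
    """
    for root in repo_roots:
        if (Path(root) / "packages" / package / "package.py").is_file():
            return Path(root)
    return None


def shadowed_packages(local_repo, builtin_repo):
    """List package names present in both repos, i.e. local overrides to review."""
    local = {p.parent.name for p in Path(local_repo).glob("packages/*/package.py")}
    builtin = {p.parent.name for p in Path(builtin_repo).glob("packages/*/package.py")}
    return sorted(local & builtin)
```

Calling `shadowed_packages` with the paths to repos/ucl and the builtin repo would list candidates such as gzip to compare against the newer upstream package.py before deciding whether to drop the local copy.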

heatherkellyucl commented 11 months ago

bcftools@1.16 depends on python and at least a few packages :sadface:

Didn't want to mess with python yet, might have to.

Actually, this might be fine, we just wouldn't have created our view and associated module yet.

heatherkellyucl commented 11 months ago

James started a local repo as repos/dev in the top level in here. We probably want a repos/ucl too for when things are non-dev.

heatherkellyucl commented 11 months ago

Got these after installing all three:

-- linux-rhel7-skylake_avx512 / gcc@12.2.0 ----------------------
bcftools@1.16                       libxml2@2.10.3        py-pip@23.0
berkeley-db@18.1.40                 nasm@2.15.05          py-pybind11@2.10.1
bzip2@1.0.8                         ncurses@6.4           py-pyparsing@3.0.9
ca-certificates-mozilla@2023-01-10  ninja@1.11.1          py-pyproject-hooks@1.0.0
cmake@3.26.3                        openblas@0.3.23       py-python-dateutil@2.8.2
curl@8.0.1                          openssl@1.1.1t        py-setuptools@63.4.3
diffutils@3.9                       perl@5.36.0           py-setuptools-scm@7.0.5
expat@2.5.0                         pigz@2.7              py-six@1.16.0
freetype@2.11.1                     pkgconf@1.9.5         py-tomli@2.0.1
gdbm@1.23                           py-build@0.10.0       py-typing-extensions@4.5.0
gettext@0.21.1                      py-certifi@2022.12.7  py-wheel@0.37.1
gmake@4.4.1                         py-contourpy@1.0.5    python@3.10.10
gzip@1.13                           py-cppy@1.2.1         qhull@2020.2
htslib@1.16                         py-cycler@0.11.0      re2c@2.2
libbsd@0.11.7                       py-cython@0.29.33     readline@8.2
libdeflate@1.10                     py-flit-core@3.7.1    samtools@1.16.1
libffi@3.4.4                        py-fonttools@4.37.3   sqlite@3.40.1
libiconv@1.17                       py-kiwisolver@1.4.4   tar@1.34
libjpeg-turbo@2.1.5                 py-matplotlib@3.7.1   util-linux-uuid@2.38.1
libmd@1.0.4                         py-numpy@1.24.3       xz@5.4.1
libpng@1.6.39                       py-packaging@23.0     zlib@1.2.13
libxcrypt@4.4.33                    py-pillow@9.5.0       zstd@1.5.5

heatherkellyucl commented 11 months ago

We now have these in this repo: https://github.com/UCL-ARC/hpc-spack/blob/0.20/repos/ucl/repo.yaml https://github.com/UCL-ARC/hpc-spack/blob/0.20/repos/ucl/packages/gzip/package.py

heatherkellyucl commented 11 months ago

Added to spacksites/settings/initial_site_repos.yaml

Right now, in my site created without that addition:

spack repo list
==> 2 package repositories.
ucl.arc.hpc.dev    /lustre/scratch/scratch/ccspapp/spack/0.20/hpc-spack/repos/dev
builtin            /lustre/shared/ucl/apps/spack/0.20/hk-initial-stack/spack/var/spack/repos/builtin

heatherkellyucl commented 11 months ago

bwa@0.7.17 %gcc@12.2.0 - no deps, goes fine.

vcftools: we already have all the dependencies but need to update the versions available; get vcftools@0.1.16 %gcc@12.2.0 from develop. Needs some checking, as the one from develop depends on zlib-api.

bedtools2@2.31.0 %gcc@12.2.0 - already got all the deps.

heatherkellyucl commented 11 months ago

picard@2.26.2 %gcc@12.2.0 - one dep, openjdk11

gatk@4.4.0.0 %gcc@12.2.0 - from develop, one dep, openjdk17

heatherkellyucl commented 11 months ago

Looked at vcftools@0.1.16 %gcc@12.2.0 again - for anything in develop that depends on zlib-api, we need that to depend on zlib instead for 0.20. Updated our repos/ucl/packages/vcftools/package.py to do that and added a comment as to why.

heatherkellyucl commented 11 months ago

HDF5: we currently have a serial version with fortran and cxx enabled and an mpi version with fortran enabled. (We have an old serial+threadsafe install, but I think we can ignore this for now unless something else brings it in as a dependency).

hdf5@1.14.1-2 +cxx +fortran -mpi - got all the deps

hdf5@1.14.1-2 +fortran +mpi - waiting on the MPI passthrough.

heatherkellyucl commented 11 months ago

NetCDF: we have netcdf with HDF5 (Spack calls it netcdf-c), netcdf-fortran, netcdf-c++. No mpi.

heatherkellyucl commented 11 months ago

Notes from end of yesterday: had some fun with hdf5 versions. (By default it will try to build with MPI on, even for a netcdf that specifies no mpi).

I had tried installing these two and ended up with slightly different concretisations and so two different hashes:

hdf5@1.14.1-2 +cxx +fortran -mpi %gcc@12.2.0
netcdf-c@4.9.2 -mpi %gcc@12.2.0 ^hdf5@1.14.1-2 +cxx +fortran -mpi %gcc@12.2.0

The ^hdf5@1.14.1-2 +cxx +fortran -mpi %gcc@12.2.0 part specifies the dependency specs.

The two hashes were:

==> hdf5: Successfully installed hdf5-1.14.1-2-523xubmhtypec6qbccenfcd6yrjkai56
==> Installing hdf5-1.14.1-2-n2ejgnpxlkqidimamznciimgs2s22z3c for netcdf

This lets you see the difference:

spack find -v hdf5
-- linux-rhel7-skylake_avx512 / gcc@12.2.0 ----------------------
hdf5@1.14.1-2+cxx+fortran~hl~ipo~java~map~mpi+shared~szip~threadsafe+tools api=default build_system=cmake build_type=Release generator=make
hdf5@1.14.1-2+cxx+fortran+hl~ipo~java~map~mpi+shared~szip~threadsafe+tools api=default build_system=cmake build_type=Release generator=make

It was ~hl and +hl.

  hl [off]                --                                on, off                 Enable the high-level library

I let it go with the variants that netcdf uses (+hl) and uninstalled the hash I didn't want:

spack uninstall /523xubmhtypec6qbccenfcd6yrjkai56
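
Spotting the one differing variant in two long `spack find -v` lines by eye is error-prone. The comparison can be sketched in a few lines of Python, assuming the output format shown above (boolean variants as `+name`/`~name` glued to the spec, then `key=value` tokens). The helper names `parse_variants` and `diff_variants` are hypothetical, and this is not a full spec parser.

```python
import re


def parse_variants(spec_line):
    """Return {variant: value} parsed from one line of `spack find -v` output."""
    head, *rest = spec_line.split()
    # Boolean variants are attached to the first token: +on or ~off.
    variants = {name: sign == "+" for sign, name in re.findall(r"([+~])([\w-]+)", head)}
    # Remaining tokens are key=value variants such as build_type=Release.
    for token in rest:
        key, sep, value = token.partition("=")
        if sep:
            variants[key] = value
    return variants


def diff_variants(line_a, line_b):
    """Return {variant: (value_in_a, value_in_b)} for variants that differ."""
    a, b = parse_variants(line_a), parse_variants(line_b)
    return {k: (a.get(k), b.get(k)) for k in sorted(a.keys() | b.keys()) if a.get(k) != b.get(k)}
```

Passing the two hdf5 lines above to `diff_variants` leaves only the `hl` entry, off in the first line and on in the second.
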
heatherkellyucl commented 11 months ago

I think we can leave out netcdf-cxx/netcdf-cxx4 for now, as REPAST is the only thing we currently have that depends on it (netcdf-cxx, the legacy version).

heatherkellyucl commented 11 months ago

Current Spack has BEAST2 2.6.7, need to get 2.7.4 from develop.

heatherkellyucl commented 11 months ago

For this environment, I have changed my /shared/ucl/apps/spack/0.20/hk-initial-stack/spack/var/spack/environments/compilers/spack.yaml to add

  concretizer:
    unify: when_possible
  view: False

This is because it is fine to have Java 11 and Java 17 installed if necessary, or different variants of a package when those are needed, and we do not need a view for all these packages together - it is not intended that someone automatically have them all in their environment at once. (A view requires that everything can be uniquely symlinked into the view directory).
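
To illustrate why a single view does not suit this environment, here is a rough sketch that reports files two or more install prefixes would both try to place at the same relative path. It is an illustration of the symlink clash only, not how Spack builds or checks views, and `view_conflicts` is a hypothetical name.

```python
import os
from collections import defaultdict


def view_conflicts(prefixes):
    """Map each clashing relative file path to the list of prefixes providing it."""
    providers = defaultdict(list)
    for prefix in prefixes:
        for dirpath, _dirnames, filenames in os.walk(prefix):
            for filename in filenames:
                rel = os.path.relpath(os.path.join(dirpath, filename), prefix)
                providers[rel].append(prefix)
    # Only paths offered by more than one prefix cannot be uniquely symlinked.
    return {rel: owners for rel, owners in providers.items() if len(owners) > 1}
```

Run over the install prefixes of two Java versions, this would be expected to report paths such as bin/java from both, which is the situation that view: False sidesteps.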

heatherkellyucl commented 9 months ago

Here's the list of everything I added to this environment (which needs recreating):

gzip@1.13 %gcc@12.2.0
htslib@1.16 %gcc@12.2.0
hdf5@1.14.1-2%gcc@12.2.0+cxx+fortran~mpi
samtools@1.16.1 %gcc@12.2.0 
bcftools@1.16 %gcc@12.2.0
bwa@0.7.17 %gcc@12.2.0 
vcftools@0.1.16 %gcc@12.2.0
picard@2.26.2 %gcc@12.2.0
gatk@4.4.0.0 %gcc@12.2.0 
netcdf-c@4.9.2 -mpi %gcc@12.2.0 ^hdf5@1.14.1-2 +cxx +fortran -mpi %gcc@12.2.0
netcdf-fortran@4.6.1 %gcc@12.2.0 ^hdf5@1.14.1-2 +cxx +fortran -mpi %gcc@12.2.0
beast2@2.7.4 %gcc@12.2.0

Anything not available in spack 0.20 comes from repos/ucl/packages/

The only additional package needed is a version of alphafold.

As per comment directly above, it needs this to allow two versions of Java to be installed.

  concretizer:
    unify: when_possible
  view: False

/lustre/shared/ucl/apps/spack/0.20/buildcache/build_cache/linux-rhel7-skylake_avx512-gcc-11.2.1-berkeley-db-18.1.40-diffrysevtavcyynmwqh3juzlophaclr.spec.json.sig
/lustre/shared/ucl/apps/spack/0.20/buildcache/build_cache/linux-rhel7-skylake_avx512-gcc-11.2.1-gcc-12.2.0-pzbagboey5iumrz237ldmjmvtdsaazrb.spec.json.sig
/lustre/shared/ucl/apps/spack/0.20/buildcache/build_cache/linux-rhel7-skylake_avx512-gcc-11.2.1-texinfo-6.5-vk4bsesfczjjrhl7d3rmm4huwjsezaxn.spec.json.sig
/lustre/shared/ucl/apps/spack/0.20/buildcache/build_cache/linux-rhel7-skylake_avx512/gcc-11.2.1/bzip2-1.0.8/linux-rhel7-skylake_avx512-gcc-11.2.1-bzip2-1.0.8-iwbmigw4lpwnlkezmhgqczva6tf2hfxx.spack
/lustre/shared/ucl/apps/spack/0.20/buildcache/build_cache/linux-rhel7-skylake_avx512/gcc-11.2.1/gcc-12.2.0/linux-rhel7-skylake_avx512-gcc-11.2.1-gcc-12.2.0-pzbagboey5iumrz237ldmjmvtdsaazrb.spack
/lustre/shared/ucl/apps/spack/0.20/buildcache/build_cache/linux-rhel7-skylake_avx512/gcc-11.2.1/mpfr-4.1.0/linux-rhel7-skylake_avx512-gcc-11.2.1-mpfr-4.1.0-ilbyeu4bmajiybjcrakptc2ptvz3yfss.spack

@giordano will need to either create the concretised environment in his own space, after which we install the resulting .yaml as ccspapp, or be given access to ccspapp so he can do the build in /shared/ucl/apps/spack/0.20/ and write it to the cache himself when complete. It may be better to start off in his own space.

We have not looked at the module files created yet - all of the above should have modules created for them as they are requested explicitly. Will need to check on LIBRARY_PATH, LD_LIBRARY_PATH for the libraries as they are likely to be wanted in external compilations.

Need to check if anything else needs to be a top-level spec with a module generated. Like Java.

giordano commented 9 months ago

A quick summary of my work so far:

This is the provisional environment I have so far, but do note that I haven't attempted to rebuild all of it from scratch. In particular, I haven't rebuilt anything with %gcc@13.1.0 apart from the py-alphafold stack, though all the other packages built successfully with %gcc@12.2.0 (this may be useful information in case we want to drop py-alphafold from this round):

spack:
  specs:
  - matrix:
    - - 'samtools@1.16.1'
      - 'bcftools@1.16'
      - 'htslib@1.17'
      - 'bwa@0.7.17'
      - 'bedtools2@2.31.0'
      - 'vcftools@0.1.16'
      - 'picard@2.26.2'
      - 'gatk@4.4.0.0'
      - 'py-alphafold@2.2.4 ^cuda@12.2.0'
      - 'hdf5@1.14.1-2 +cxx +fortran -mpi +hl'
      - 'hdf5@1.14.1-2 +fortran +mpi +hl'
      - 'netcdf-c@4.9.2 -mpi'
      - 'netcdf-fortran@4.6.1'
      - 'beast2@2.7.4'
      - 'gzip@1.13'
    - - '%gcc@13.1.0'
  concretizer:
    unify: when_possible
  view: false
  modules:
    prefix_inspections:
      lib: ["LD_LIBRARY_PATH", "LIBRARY_PATH"]
      lib64: ["LD_LIBRARY_PATH", "LIBRARY_PATH"]

giordano commented 9 months ago

Following discussion on Slack with @heatherkellyucl, I removed py-alphafold from the initial stack, switched to using gcc@13.1.0 as first compiler (see #52), and this is the environment I'm currently using:

spack:
  specs:
  - matrix:
    - - 'samtools@1.16.1'
      - 'bcftools@1.16'
      - 'beast2@2.7.4'
      - 'bedtools2@2.31.0'
      - 'bwa@0.7.17'
      - 'cuda@12.2.0'
      - 'gatk@4.4.0.0'
      - 'gzip@1.13'
      - 'hdf5@1.14.1-2 +cxx +fortran -mpi +hl'
      - 'hdf5@1.14.1-2 +fortran +mpi +hl'
      - 'htslib@1.17'
      - 'netcdf-c@4.9.2 -mpi'
      - 'netcdf-fortran@4.6.1'
      - 'picard@2.26.2'
      # - 'py-alphafold@2.2.4'
      - 'vcftools@0.1.16'
    - - '%gcc@13.1.0'
  concretizer:
    unify: when_possible
  view: false
  modules:
    default:
      tcl:
        projections:
          all:  '{name}/{version}/{compiler.name}-{compiler.version}'
          ^mpi: '{name}/{version}-{^mpi.name}/{compiler.name}-{compiler.version}'
    prefix_inspections:
      lib: ["LD_LIBRARY_PATH", "LIBRARY_PATH"]
      lib64: ["LD_LIBRARY_PATH", "LIBRARY_PATH"]

The modules section

For example:

$ module show hdf5/1.14.1-2/gcc-13.1.0 
-------------------------------------------------------------------
./hdf5/1.14.1-2/gcc-13.1.0:

module-whatis    HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data.  
module           load pkgconf/1.9.5/gcc-13.1.0 
module           load zlib/1.2.13/gcc-13.1.0 
prepend-path     --delim : LD_LIBRARY_PATH /lustre/scratch/scratch/cceamgi/repo/hpc-spack/mg-initial-stack/spack/opt/spack/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_p/linux-rhel7-skylake_avx512/gcc-13.1.0/hdf5-1.14.1-2-o2pm6juc4ewwe5wgj56ukyipf5w4ccuf/lib 
prepend-path     --delim : LIBRARY_PATH /lustre/scratch/scratch/cceamgi/repo/hpc-spack/mg-initial-stack/spack/opt/spack/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_p/linux-rhel7-skylake_avx512/gcc-13.1.0/hdf5-1.14.1-2-o2pm6juc4ewwe5wgj56ukyipf5w4ccuf/lib 
prepend-path     --delim : PATH /lustre/scratch/scratch/cceamgi/repo/hpc-spack/mg-initial-stack/spack/opt/spack/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_p/linux-rhel7-skylake_avx512/gcc-13.1.0/hdf5-1.14.1-2-o2pm6juc4ewwe5wgj56ukyipf5w4ccuf/bin 
prepend-path     --delim : PKG_CONFIG_PATH /lustre/scratch/scratch/cceamgi/repo/hpc-spack/mg-initial-stack/spack/opt/spack/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_p/linux-rhel7-skylake_avx512/gcc-13.1.0/hdf5-1.14.1-2-o2pm6juc4ewwe5wgj56ukyipf5w4ccuf/lib/pkgconfig 
prepend-path     --delim : CMAKE_PREFIX_PATH /lustre/scratch/scratch/cceamgi/repo/hpc-spack/mg-initial-stack/spack/opt/spack/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_p/linux-rhel7-skylake_avx512/gcc-13.1.0/hdf5-1.14.1-2-o2pm6juc4ewwe5wgj56ukyipf5w4ccuf/. 
-------------------------------------------------------------------

$ module show hdf5/1.14.1-2-mpi/gcc-13.1.0 
-------------------------------------------------------------------
./hdf5/1.14.1-2-mpi/gcc-13.1.0:

module-whatis    HDF5 is a data model, library, and file format for storing and managing data. It supports an unlimited variety of datatypes, and is designed for flexible and efficient I/O and for high volume and complex data.  
module           load openmpi/4.1.5/gcc-13.1.0 
module           load pkgconf/1.9.5/gcc-13.1.0 
module           load zlib/1.2.13/gcc-13.1.0 
prepend-path     --delim : LD_LIBRARY_PATH /lustre/scratch/scratch/cceamgi/repo/hpc-spack/mg-initial-stack/spack/opt/spack/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_p/linux-rhel7-skylake_avx512/gcc-13.1.0/hdf5-1.14.1-2-fsznlyxgdczelstzmozmthdew4rffdr7/lib 
prepend-path     --delim : LIBRARY_PATH /lustre/scratch/scratch/cceamgi/repo/hpc-spack/mg-initial-stack/spack/opt/spack/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_p/linux-rhel7-skylake_avx512/gcc-13.1.0/hdf5-1.14.1-2-fsznlyxgdczelstzmozmthdew4rffdr7/lib 
prepend-path     --delim : PATH /lustre/scratch/scratch/cceamgi/repo/hpc-spack/mg-initial-stack/spack/opt/spack/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_p/linux-rhel7-skylake_avx512/gcc-13.1.0/hdf5-1.14.1-2-fsznlyxgdczelstzmozmthdew4rffdr7/bin 
prepend-path     --delim : PKG_CONFIG_PATH /lustre/scratch/scratch/cceamgi/repo/hpc-spack/mg-initial-stack/spack/opt/spack/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_p/linux-rhel7-skylake_avx512/gcc-13.1.0/hdf5-1.14.1-2-fsznlyxgdczelstzmozmthdew4rffdr7/lib/pkgconfig 
prepend-path     --delim : CMAKE_PREFIX_PATH /lustre/scratch/scratch/cceamgi/repo/hpc-spack/mg-initial-stack/spack/opt/spack/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_placeholder__/__spack_path_p/linux-rhel7-skylake_avx512/gcc-13.1.0/hdf5-1.14.1-2-fsznlyxgdczelstzmozmthdew4rffdr7/. 
-------------------------------------------------------------------
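
The LD_LIBRARY_PATH and LIBRARY_PATH lines in the output above come from the prefix_inspections mapping. A rough sketch of that mapping follows: each listed subdirectory of the install prefix contributes one prepend-path line per variable. Skipping subdirectories that do not exist is an assumption here, though it is consistent with no lib64 lines appearing for hdf5. `prepend_path_lines` is a hypothetical helper, not part of Spack.

```python
import os


def prepend_path_lines(prefix, prefix_inspections):
    """Build `prepend-path` lines for each inspected subdirectory that exists."""
    lines = []
    for subdir, variables in prefix_inspections.items():
        path = os.path.join(prefix, subdir)
        if not os.path.isdir(path):
            continue  # assumption: a missing directory contributes nothing
        for variable in variables:
            lines.append(f"prepend-path --delim : {variable} {path}")
    return lines
```

With the mapping from the environment above and a prefix containing only lib, this yields the two lib lines and nothing for lib64.
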
ccaefch0523 commented 8 months ago

On Myriad as ccspapp, in my spacksite fc-myriad-stack, I uninstalled openmpi@4.1.5 built with gcc@12.2.0, then installed the same version, 4.1.5, with gcc@13.1.0, passing the following fabrics list:

 spack install openmpi@4.1.5%gcc@13.1.0 fabrics=psm2,cma,ucx,knem,xpmem,ofi  schedulers=sge
...
==> openmpi: Successfully installed openmpi-4.1.5-vpih2j6ihyyej4z4amggnvzf5uyzgtcj
  Stage: 10.71s.  Autoreconf: 0.01s.  Configure: 5m 57.87s.  Build: 10m 39.48s.  Install: 1m 17.02s.  Post-install: 1.38s.  Total: 18m 7.92s
[+] /lustre/shared/ucl/apps/spack/0.20/fc-myriad-stack/spack/opt/spack/[padded-to-255-chars]/linux-rhel7-skylake_avx512/gcc-13.1.0/openmpi-4.1.5-vpih2j6ihyyej4z4amggnvzf5uyzgtcj

I also installed the NVIDIA package nvhpc:

 spack install nvhpc
==> Installing nvhpc-23.3-szho75x5rpschjd4e7yopuawdfg3bthh
==> No binary for nvhpc-23.3-szho75x5rpschjd4e7yopuawdfg3bthh found: installing from source
==> Fetching https://developer.download.nvidia.com/hpc-sdk/23.3/nvhpc_2023_233_Linux_x86_64_cuda_multi.tar.gz
==> No patches needed for nvhpc
==> nvhpc: Executing phase: 'install'
==> nvhpc: Successfully installed nvhpc-23.3-szho75x5rpschjd4e7yopuawdfg3bthh
  Stage: 14m 1.52s.  Install: 8m 28.00s.  Post-install: 3m 21.21s.  Total: 25m 50.78s

I will then try installing GROMACS + CUDA using the nvhpc compiler.