shahzebsiddiqui closed this issue 2 years ago
I am looking at this now. Will resolve the issue and post update here.
@shahzebsiddiqui
Can you please verify that you were using these commands on Perlmutter and NOT on Cori?
I can reproduce what you report if I am trying these commands on Cori. But these modules are not for Cori.
This is what happens on Perlmutter:
$perlmutter> module use /global/cfs/cdirs/m3896/shared/modulefiles
$perlmutter> module avail e4s
------------------- /global/cfs/cdirs/m3896/shared/modulefiles ----------------------
e4s/22.05/mvapich2-3.0a e4s/22.05/PrgEnv-gnu
Furthermore, the permissions look OK:
$perlmutter> ls -ld /global/cfs/cdirs/m3896/shared/modulefiles
drwxrwsr-x 4 sameer m3896 4096 Aug 18 07:37 /global/cfs/cdirs/m3896/shared/modulefiles
$perlmutter> ls -ld /global/cfs/cdirs/m3896/shared/modulefiles/e4s
drwxrwsr-x 4 sameer m3896 4096 Aug 18 07:31 /global/cfs/cdirs/m3896/shared/modulefiles/e4s
$perlmutter> ls -ld /global/cfs/cdirs/m3896/shared/modulefiles/e4s/22.05
drwxrwsr-x 3 lpeyrala m3896 16384 Aug 18 07:38 /global/cfs/cdirs/m3896/shared/modulefiles/e4s/22.05
$perlmutter> ls -l /global/cfs/cdirs/m3896/shared/modulefiles/e4s/22.05
total 1
-rw-rw-r-- 1 sameer m3896 1602 Aug 12 11:07 mvapich2-3.0a.lua
-rw-rw-r-- 1 sameer m3896 1104 Jun 17 10:35 PrgEnv-gnu.lua
That's because it is using your user account, and you are part of the Unix group while I am not. So please make sure the permissions for all directories and files are world-readable.
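One way to apply that fix is a single recursive chmod. This is a minimal sketch on a scratch directory that mimics the layout from this thread; on Perlmutter you would instead point MODDIR at /global/cfs/cdirs/m3896/shared/modulefiles (the scratch tree and file names here are illustrative only):

```shell
#!/bin/sh
# Hedged sketch: make a module tree world-readable with chmod -R o+rX.
# Demonstrated on a throwaway tree; on Perlmutter you would set
# MODDIR=/global/cfs/cdirs/m3896/shared/modulefiles instead.
MODDIR=$(mktemp -d)
mkdir -p "$MODDIR/e4s/22.05"
touch "$MODDIR/e4s/22.05/PrgEnv-gnu.lua"
chmod 750 "$MODDIR/e4s" "$MODDIR/e4s/22.05"    # simulate group-only dirs
chmod 640 "$MODDIR/e4s/22.05/PrgEnv-gnu.lua"   # simulate group-only file
# Capital X grants execute only where it already makes sense (directories),
# so files become o+r and directories become o+rx in one pass.
chmod -R o+rX "$MODDIR"
# List anything still lacking world read; empty output means it's fixed.
find "$MODDIR" ! -perm -o+r
```

The capital X (rather than x) matters: it avoids marking plain .lua files executable while still letting other users traverse every directory.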
Here I am trying as the e4s user, which is not part of the group that owns our module directory.
e4s:login34> groups
e4s m3503 spackecp
e4s:login34> module use /global/cfs/cdirs/m3896/shared/modulefiles
e4s:login34> module avail e4s
------------- /global/cfs/cdirs/m3896/shared/modulefiles -----------------
e4s/22.05/mvapich2-3.0a e4s/22.05/PrgEnv-gnu (D)
Can you confirm you are able to see the modules now from Perlmutter, not Cori?
We have received confirmation from multiple other users, not in our group, that they are able to see the module files.
Closing this as resolved.
Reopening this issue. It's not solved yet. The scope of this is to fix the documentation.
The documentation shows the following. However, this is what we actually have:
~/ module use /global/cfs/cdirs/m3896/shared/modulefiles
~/ module av
----------------------------------------------------------------------------------- /global/cfs/cdirs/m3896/shared/modulefiles -----------------------------------------------------------------------------------
e4s/22.05/mvapich2-3.0a e4s/22.05/PrgEnv-gnu (D) mvapich2/3.0a
There is no module load e4s/22.05/mvapich2, but we do have module load e4s/22.05/mvapich2-3.0a, which most likely points to a different stack. I think you should put updated output of module av in the documentation, since the generated modules, along with their full paths, may differ.
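One quick way to catch this kind of drift between the docs and the site is to diff the two name lists. A minimal sketch follows with both lists inlined for illustration; on Perlmutter you would generate avail.txt from module -t avail e4s and documented.txt by grepping the docs source (the file names here are made up):

```shell
#!/bin/sh
# Hedged sketch: diff documented module names against available ones.
# Both lists are inlined here; neither file is part of the real setup.
printf 'e4s/22.05/mvapich2\ne4s/22.05/PrgEnv-gnu\n' | sort > documented.txt
printf 'e4s/22.05/mvapich2-3.0a\ne4s/22.05/PrgEnv-gnu\n' | sort > avail.txt
# comm -23 prints lines only in the first file: stale doc entries.
comm -23 documented.txt avail.txt
```

With these inputs the stale entry e4s/22.05/mvapich2 is the only line printed, which is exactly the mismatch described above.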
~/ module load e4s/22.05/mvapich2-3.0a
Lmod is automatically replacing "cray-mpich/8.1.17" with "mvapich2/3.0a".
~/ module av
---------------------------------- /global/cfs/cdirs/m3896/shared/ParaTools/E4S/22.05/mvapich2-3.0a-slurm/spack/share/spack/lmod/cray-sles15-x86_64/mvapich2/3.0a-es35auw/Core -----------------------------------
adios/1.13.1 darshan-runtime/3.3.1 hpx/1.7.1 (D) omega-h/9.34.1 py-petsc4py/3.17.1 sundials/6.2.0
adios2/2.8.0-cuda80 datatransferkit/3.1-rc3 hypre/2.24.0 openpmd-api/0.14.4 py-warpx/22.05-dims2 tasmanian/7.7-openmp
adios2/2.8.0 (D) dyninst/12.1.0-openmp kokkos-kernels/3.6.00-cuda80 papyrus/1.0.2 py-warpx/22.05-dims3 tau/2.31.1-cuda
amrex/22.05 faodel/1.2108.1 lammps/20220107-openmp parsec/3.0.2012 py-warpx/22.05-dimsRZ (D) trilinos/13.0.1
arborx/1.2 fortrilinos/2.0.0 libquo/1.3.1 petsc/3.17.1-cuda80 scr/3.0rc2 veloc/1.5
axom/0.6.1-openmp globalarrays/5.8 mercury/2.1.0 petsc/3.17.1 (D) slate/2021.05.02-cuda80-openmp
butterflypack/2.1.1 hdf5/1.10.7 metall/0.20 precice/2.4.0 slate/2021.05.02-openmp (D)
cabana/0.4.0 heffte/2.2.0-cuda80 mfem/4.4.0 pumi/2.2.7 slepc/3.17.1-cuda80
caliper/2.7.0-cuda80 heffte/2.2.0 (D) nccmp/1.9.0.1 py-cinemasci/1.7.0 slepc/3.17.1 (D)
caliper/2.7.0 (D) hpx/1.7.1-cuda80 nco/5.0.1 py-libensemble/0.9.1 strumpack/6.3.1-openmp
---------------------------------- /global/cfs/cdirs/m3896/shared/ParaTools/E4S/22.05/mvapich2-3.0a-slurm/spack/share/spack/lmod/cray-sles15-x86_64/openmpi/4.1.3-gw3a4bv/Core -----------------------------------
gptune/3.0.0
--------------------------------------------- /global/cfs/cdirs/m3896/shared/ParaTools/E4S/22.05/mvapich2-3.0a-slurm/spack/share/spack/lmod/cray-sles15-x86_64/Core ----------------------------------------------
aml/0.1.0 charliecloud/0.26 flux-core/0.38.0 (D) gotcha/1.0.3 magma/2.6.2-cuda80 papi/6.0.0.1-cuda raja/0.14.0-cuda80-openmp
archer/2.0.0 cmake/3.23.1 gasnet/2022.3.0 kokkos-kernels/3.6.00-openmp (D) mpark-variant/1.4.0 pdt/3.25.1 superlu/5.3.0
argobots/1.1 darshan-util/3.3.1 ginkgo/1.4.0-cuda80-openmp kokkos/3.6.00-openmp mvapich2/3.0a (L,D) plasma/21.8.29 swig/4.0.2-fortran
bolt/2.0 flit/2.1.0 ginkgo/1.4.0-openmp (D) legion/21.03.0-cuda80-cuda nrm/0.1.0 py-jupyterhub/1.4.1 umap/2.1.0
chai/2.4.0 flux-core/0.38.0-cuda gmp/6.2.1 legion/21.03.0 (D) nvhpc/22.3 qthreads/1.16 zfp/0.5.5-cuda80
Ahh, I see. Please see this PR and, if you approve, let us merge it.
Please describe the issue
This page needs to be updated https://e4s.readthedocs.io/en/latest/deployment.html#perlmutter
Shown below are the available modules since we hid a few mvapich2 modules
Also I noticed that the e4s/22.05/mvapich2 module, along with e4s/22.05/PrgEnv-gnu, is not accessible to everyone. We should remove this from the documentation or fix the permissions. I am pretty sure @eugeneswalker may have a umask 007 that is causing this issue. Perhaps you can change it to umask 002 to address this problem. I think you want g+w so the m3896 group can write to the directory, but still have o+rx permission for everyone else.
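The umask theory is easy to check directly. A minimal sketch (the file names are made up for illustration):

```shell
#!/bin/sh
# Hedged sketch of the umask theory: files created under umask 007 get
# no "other" bits at all, while umask 002 leaves them world-readable.
tmp=$(mktemp -d)
( umask 007; touch "$tmp/with_007" )   # 666 & ~007 -> 660 (rw-rw----)
( umask 002; touch "$tmp/with_002" )   # 666 & ~002 -> 664 (rw-rw-r--)
ls -l "$tmp"
```

Note that changing the umask only affects files created afterward; anything already installed under umask 007 would still need a one-time chmod -R o+rX pass.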