sdsc / spack

A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
https://spack.io

SDSC: PKG - expanse/0.17.3/gpu/b - Missing NAMD GPU (example application) #59

Closed nwolter closed 1 year ago

mkandes commented 1 year ago

@nwolter - This might be an easy one to check off before Friday. I will retest as well.

[mkandes@login02 ~]$ module spider namd/2.14/ruwh4yb

----------------------------------------------------------------------------
  namd/2.14: namd/2.14/ruwh4yb
----------------------------------------------------------------------------

    You will need to load all module(s) on any one of the lines below before the "namd/2.14/ruwh4yb" module is available to load.

      gpu/0.17.3b  gcc/10.2.0/i62tgso

    Help:
      NAMDis a parallel molecular dynamics code designed for high-performance
      simulation of large biomolecular systems.

[mkandes@login02 ~]$
[mkandes@login02 ~]$ ls /cm/shared/examples/sdsc/namd/gpu
namd-1GPU-1node.sb  namd-4GPUs-1node.sb
[mkandes@login02 ~]$

mkandes commented 1 year ago

As a side note, I've also downloaded the latest NGC NAMD v3.0 container (tag 3.0-beta2) to Expanse and copied it into the shared containers directory. https://catalog.ngc.nvidia.com/orgs/hpc/containers/namd

[mkandes@login01 ~]$ !511
srun --partition=gpu-debug  --pty --account=use300 --nodes=1 --ntasks-per-node=10 --mem=92G --gpus=1 --time=00:30:00 --wait=0 --export=ALL /bin/bash
srun: job 24808966 queued and waiting for resources
srun: job 24808966 has been allocated resources
[mkandes@exp-7-59 ~]$ module reset
Resetting modules to system default. Reseting $MODULEPATH back to system default. All extra directories will be removed from $MODULEPATH.
[mkandes@exp-7-59 ~]$ module load singularitypro
[mkandes@exp-7-59 ~]$ singularity pull docker://nvcr.io/hpc/namd:3.0-beta2
2023/08/23 19:42:35 Unsolicited response received on idle HTTP channel starting with "0\r\n\r\n"; err=<nil>
INFO:    Converting OCI blobs to SIF format
WARNING: 'nodev' mount option set on /tmp, it could be a source of failure during build process
INFO:    Starting build...
2023/08/23 19:42:36 Unsolicited response received on idle HTTP channel starting with "0\r\n\r\n"; err=<nil>
Getting image source signatures
Copying blob 309489a57f7e done  
Copying blob b787be75b30b done  
Copying blob c7f41ce506de done  
Copying blob 70fb7702e8d1 done  
Copying blob d6ad993703c9 done  
Copying blob 846c0b181fff done  
Copying blob 3c15bc552b06 done  
Copying blob d0feb40ce8f5 done  
Copying blob 90b8907a1fe6 done  
Copying blob 7107a82f77dd done  
Copying blob 70ed568596c6 done  
Copying blob f0850a606257 done  
Copying config a0831fecd1 done  
Writing manifest to image destination
Storing signatures
2023/08/23 19:43:03  info unpack layer: sha256:846c0b181fff0c667d9444f8378e8fcfa13116da8d308bf21673f7e4bea8d580
2023/08/23 19:43:04  info unpack layer: sha256:b787be75b30bfd9c59a3d55b00e4155b6a4378990ce488a7e017e7a93388f65e
2023/08/23 19:43:04  info unpack layer: sha256:c7f41ce506deb9e9c01d02cf27044322a215b90db2c6ccf1831ceab5449d5fb9
2023/08/23 19:43:05  info unpack layer: sha256:70fb7702e8d18d269f64d1952903dbf43ba45ab34725c2b64c2afc119c0f6428
2023/08/23 19:43:05  info unpack layer: sha256:d6ad993703c9f43994bfd85999cc4ce2974a76ac46f96d2669cf24f2057f939a
2023/08/23 19:43:05  info unpack layer: sha256:309489a57f7ea4c289b9b0db79596bf1491cbcfbe18559a607f13da2d03a78cd
2023/08/23 19:43:27  info unpack layer: sha256:3c15bc552b069198ac32b23388e94ad7a9a61a12e29e3b8f9af770ab1a4fa2ac
2023/08/23 19:43:27  info unpack layer: sha256:d0feb40ce8f53c8b55476d759b79766f8861baffaeaeab737b5887409d090640
2023/08/23 19:43:27  info unpack layer: sha256:90b8907a1fe6017f60edffefa0f7d6eff858681ee68cfab29cb7e6996a318d4c
2023/08/23 19:43:27  info unpack layer: sha256:7107a82f77ddc23d469110bb1c81942808b5ecf69f7dc88b3d07785fa740800e
2023/08/23 19:43:27  info unpack layer: sha256:70ed568596c605b4f6ccaea377b2d5fb85fe7a904a3457e77cc3354631ae94ba
2023/08/23 19:43:32  info unpack layer: sha256:f0850a606257c6f3917b854cf51691764b2eb20da1b478e4cb144ecf8a25eb00
INFO:    Creating SIF file...
[mkandes@exp-7-59 ~]$ ls
]         24578139            projects           training
24384178  benchmarks          scripts            tritonserver_23.07-py3.sif
24497385  data                slurm-24563363.sh
24565320  namd_3.0-beta2.sif  software
[mkandes@exp-7-59 ~]$ rmn slurm-24563363.sh 
bash: rmn: command not found
[mkandes@exp-7-59 ~]$ rm slurm-24563363.sh 
[mkandes@exp-7-59 ~]$ mv *.sif /expanse/lustre/scratch/mkandes/temp_project/
[mkandes@exp-7-59 ~]$ exit
exit
[mkandes@login01 ~]$ ssh mgr1
PIN+Yubi: 
SSH: Server;LType: Throughput;Remote: 10.21.0.21-22;IN: 0;OUT: 0;Duration: 0.0;tPut_in: -nan;tPut_out: -nan
Connection closed by 10.21.0.21 port 22
[mkandes@login01 ~]$ ovv818geffgefecccccchkvrtejejbetlekreglgbhkhreehvjbutl
-bash: ovv818geffgefecccccchkvrtejejbetlekreglgbhkhreehvjbutl: command not found
[mkandes@login01 ~]$ ssh mgr1
PIN+Yubi: 
Welcome to Bright release         9.0

                                                         Based on Rocky Linux 8
                                                                    ID: #000002

--------------------------------------------------------------------------------

                                 WELCOME TO
                  _______  __ ____  ___    _   _______ ______
                 / ____/ |/ // __ \/   |  / | / / ___// ____/
                / __/  |   // /_/ / /| | /  |/ /\__ \/ __/
               / /___ /   |/ ____/ ___ |/ /|  /___/ / /___
              /_____//_/|_/_/   /_/  |_/_/ |_//____/_____/

--------------------------------------------------------------------------------

Use the following commands to adjust your environment:

'module avail'            - show available modules
'module add <module>'     - adds a module to your environment for this session
'module initadd <module>' - configure module to be loaded at every login

-------------------------------------------------------------------------------
Last failed login: Wed Aug 23 19:48:49 PDT 2023 from 10.21.0.19 on ssh:notty
There was 1 failed login attempt since the last successful login.
Last login: Wed Aug 23 19:41:36 2023 from 10.21.0.19
Could not chdir to home directory /home/mkandes: No such file or directory
[mkandes@mgr1 /]$ cd /cm/shared/apps/containers/singularity/
[mkandes@mgr1 singularity]$ ls
alphafold  ciml     imagemagick  pytorch  tensorflow  visit
anaconda   e4s      namd         spark    trinity     wgs
centos     excerpt  paraview     tapis    ubuntu
[mkandes@mgr1 singularity]$ cd namd/
[mkandes@mgr1 namd]$ cp -rp /expanse/lustre/scratch/mkandes/temp_project/namd_3.0-beta2.sif ./
[mkandes@mgr1 namd]$ ls
namd_3.0-beta2.sif
[mkandes@mgr1 namd]$ sha256sum namd_3.0-beta2.sif 
33dd00a2d5f7dbc0b162d10e0f19039ab8d3552e72a9586f6d28f0ab24ba3d79  namd_3.0-beta2.sif
[mkandes@mgr1 namd]$

We may want to build a new example around this container, since we don't yet have Spack-based support for NAMD v3.x.
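
For what it's worth, the copy from scratch into the shared containers directory above was only checked by eye with a single sha256sum on the destination. A minimal sketch of comparing both sides is below. It assumes bash and GNU coreutils; `verify_copy` is a hypothetical helper name, not an existing SDSC tool, and the return codes are my own convention.

```shell
#!/usr/bin/env bash
# Sketch only: compare the SHA-256 digest of a file with that of its copy.
# verify_copy is a hypothetical helper, not part of any SDSC tooling.
verify_copy() {
    local src="$1" dst="$2"
    local src_sum dst_sum

    # sha256sum prints "<digest>  <filename>"; return 2 if either file is unreadable.
    src_sum="$(sha256sum "$src")" || return 2
    dst_sum="$(sha256sum "$dst")" || return 2

    # Keep only the digest (everything before the first space).
    src_sum="${src_sum%% *}"
    dst_sum="${dst_sum%% *}"

    if [[ "$src_sum" == "$dst_sum" ]]; then
        echo "OK: $dst_sum"
    else
        echo "MISMATCH: $src=$src_sum $dst=$dst_sum" >&2
        return 1
    fi
}

# Example call, using the paths from the transcript above:
# verify_copy /expanse/lustre/scratch/mkandes/temp_project/namd_3.0-beta2.sif \
#             /cm/shared/apps/containers/singularity/namd/namd_3.0-beta2.sif
```

Hashing both files reads the image twice, which is slow on a multi-GB SIF, but it catches a truncated copy that an `ls` would not.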

nwolter commented 1 year ago

1-GPU example OK.