NERSC / podman-hpc


--cuda-mpi injection does not work as expected #110

Closed · gzquse closed this issue 4 months ago

gzquse commented 4 months ago

Hi team,

These are our steps:

  1. podman-hpc build -t cudaq-mpich:test -f Dockerfile.cudaq-mpich .
  2. podman-hpc migrate cudaq-mpich:test
  3. podman-hpc run -it --mpi --gpu cudaq-mpich:test
     Output:
     Error: no container with ID c788431eb058741b5bdab05bc5b20fd8c1bce5594326681c1a396a157441b366 found in database: no such container
     send_complete failed for None

If we remove the MPI injection, we can get in correctly:

  1. podman-hpc run -it --gpu cudaq-mpich:test
     root@a2cce5da38ec:/tmp# exit

We do need to use CUDA-aware MPI for our program.
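
For context, this is roughly the check we ultimately need to pass; a minimal sketch, assuming mpi4py and CuPy are available inside the image (the test filename and the /work bind mount are only illustrative). Exchanging a GPU-resident buffer between ranks only succeeds when the underlying MPI is CUDA-aware:

# write a small hypothetical test that exchanges a GPU buffer between two ranks
cat > cuda_mpi_check.py <<'EOF'
from mpi4py import MPI
import cupy as cp

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
peer = 1 - rank                            # assumes exactly two ranks
send = cp.full(4, rank, dtype=cp.float32)  # buffer lives in GPU memory
recv = cp.empty_like(send)
comm.Sendrecv(send, dest=peer, recvbuf=recv, source=peer)
print(f"rank {rank} received {recv} from rank {peer}")
EOF

# run one rank per node; --cuda-mpi should inject a CUDA-aware MPI
srun -n 2 podman-hpc run --rm --cuda-mpi --gpu -v "$PWD:/work" \
    cudaq-mpich:test python3 /work/cuda_mpi_check.py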

A couple of guesses:

  1. Seems "mpi injection" only supports nersc public registry and public docker
  2. Cuda awared mpi injection failed?

We also include our automation script below, which makes the image public to everyone at NERSC.

#!/bin/bash
# Exit immediately if a command exits with a non-zero status
set -e

# Function to display usage information
usage() {
    echo "Usage: $0 -f <Dockerfile_path> -t <image_tag> [-p]"
    echo "     -p: (optional) publish image for all NERSC users"
    echo "example:"
    echo "   ./podman-hpc-build.sh -f Dockerfile.ubu24-xeyes -t ubu24-xeyes:p1c -p"

    exit 1
}

# Initialize variables
dockerfileName=""
imageName=""
use_p=false

# Parse command line arguments
while getopts ":f:t:p" opt; do
  case $opt in
    f)
      dockerfileName=$OPTARG
      ;;
    t)
      imageName=$OPTARG
      ;;
    p)
      use_p=true
      ;;
    \?)
      echo "Invalid option: -$OPTARG" >&2
      usage
      ;;
    :)
      echo "Option -$OPTARG requires an argument." >&2
      usage
      ;;
  esac
done

# Check if required arguments are provided
if [ -z "$dockerfileName" ] || [ -z "$imageName" ]; then
  usage
fi

# Check if the Dockerfile exists
if [ ! -f "$dockerfileName" ]; then
  echo "Error: Dockerfile '$dockerfileName' does not exist."
  exit 1
fi

if [[ "$imageName" == */* ]]; then
    echo "Error: imageName=$imageName     contains '/'"
    exit 2
fi

# check if the image already exists in the local repo
if podman-hpc images --format "{{.Repository}}:{{.Tag}}" | grep -q "$imageName" ; then
    echo "Image $imageName exists, change the name"
    exit 3
fi

# prefix the image name with the user name
imageName=$USER/$imageName
echo "building image: $imageName ..."

myProj=$(sacctmgr show user "$USER" withassoc format=DefaultAccount | tail -n1 | xargs)

# Display the arguments
echo "Dockerfile: $dockerfileName"
echo "Image name: $imageName"
echo "Use -p: $use_p"
echo "my NERSC project: $myProj"

# Execute the podman build command
time podman-hpc build -f "$dockerfileName" -t "$imageName" .

echo 'podman-hpc images          # check image is visible'

# Execute additional commands if -p is used
if [ "$use_p" = true ]; then
    CORE_PUB="/cfs/cdirs/$myProj/$USER/podman_common/"
    echo CORE_PUB=$CORE_PUB
    podman-hpc --squash-dir "/global/$CORE_PUB" migrate "$imageName"
    chmod -R a+rx  /global/$CORE_PUB
    echo
    echo public image use example; echo
    #echo POD_PUB=/dvs_ro$CORE_PUB
    echo "export PODMANHPC_ADDITIONAL_STORES=/dvs_ro$CORE_PUB"
else
    echo private image use example; echo
    podman-hpc  migrate "$imageName"
fi

#... common
echo IMG=$imageName 
echo 'podman-hpc run -it --gpu -e DISPLAY  -v $HOME:$HOME -e HOME  $IMG  bash     # start the image'
echo

exit 0
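
For the image in this issue, the helper script would be invoked roughly as below (the filename podman-hpc-build.sh is taken from the script's own usage example, and -p publishes the image for other NERSC users; note that the script prefixes the tag with $USER):

./podman-hpc-build.sh -f Dockerfile.cudaq-mpich -t cudaq-mpich:test -p
# resulting image name: $USER/cudaq-mpich:test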

Even with this publishing workflow, the MPI injection still does not work.

Expected result:

salloc -q interactive -C gpu -t 4:00:00 -A nintern -N 2
srun -n 2 podman-hpc run --rm --cuda-mpi --gpu cudaq-mpich:test "cmd" 

distributing the work across two nodes.

gzquse commented 4 months ago

For the normal MPI injection test:

srun -n 2 podman-hpc run -it --mpi --gpu cudaq-mpich:test python3 -m mpi4py.bench helloworld

Error: no container with name or ID "uid-105687-pid-1265444" found: no such container

danfulton commented 4 months ago

If you are running on Perlmutter, the --gpu and --mpi flags do not work together. The MPI library that's loaded is not compiled with CUDA. To get CUDA aware MPI, you should use the --cuda-mpi flag.

These options are all specific to the site modules installed at NERSC.
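
Concretely, that means replacing --mpi with --cuda-mpi in the failing command, roughly:

srun -n 2 podman-hpc run --rm --cuda-mpi --gpu cudaq-mpich:test \
    python3 -m mpi4py.bench helloworld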

gzquse commented 4 months ago
salloc -q interactive -C gpu -t 4:00:00 -A nintern -N 2
srun -n 2 podman-hpc run -it --cuda-mpi --gpu cudaq-mpich:test2 python3 -m mpi4py.bench helloworld
srun: Job 28628103 step creation temporarily disabled, retrying (Requested nodes are busy)
srun: Step created for StepId=28628103.2
Error: can only create exec sessions on running containers: container state improper
Error: can only create exec sessions on running containers: container state improper
gzquse commented 4 months ago

Run without MPI injection, it works normally:

srun -l -n 2 podman-hpc run -i  cudaq-mpich:test2 ls
0: =========================
0:       NVIDIA CUDA-Q      
0: =========================
0: 
0: Version: latest
0: 
0: Copyright (c) 2024 NVIDIA Corporation & Affiliates 
0: All rights reserved.
0: 
0: To run a command as administrator (user `root`), use `sudo <command>`.
0: 
0: mpi_comm_impl.o
0: mpich-4.1.1.tar.gz
1: =========================
1:       NVIDIA CUDA-Q      
1: =========================
1: 
1: Version: latest
1: 
1: Copyright (c) 2024 NVIDIA Corporation & Affiliates 
1: All rights reserved.
1: 
1: To run a command as administrator (user `root`), use `sudo <command>`.
1: 
1: mpi_comm_impl.o
1: mpich-4.1.1.tar.gz
gzquse commented 4 months ago
srun -n 2 podman-hpc run --rm --mpi registry.nersc.gov/library/nersc/mpi4py:3.1.3 python3 -m mpi4py.bench helloworld
Hello, World! I am process 0 of 2 on nid200413.
Hello, World! I am process 1 of 2 on nid200421.

This example uses the default setup and the NERSC registry. Do we have to upload our image to the public repo first?
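
If so, would the flow be something like the following? (A sketch: it assumes podman-hpc forwards the standard login/tag/push subcommands, and <project> is only a placeholder.)

podman-hpc login registry.nersc.gov
podman-hpc tag cudaq-mpich:test registry.nersc.gov/<project>/cudaq-mpich:test
podman-hpc push registry.nersc.gov/<project>/cudaq-mpich:test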

Many thanks

danfulton commented 4 months ago

Hi,

Since your question is NERSC specific, please follow up in the support ticket that you already have open with NERSC.

Thank you!