JaneliaSciComp / BigStitcher-Spark

Running compute-intense parts of BigStitcher distributed

using bigstitcher with slurm #20

Open · ededits opened 6 months ago

ededits commented 6 months ago

Hi everyone!

I am a sysadmin trying to help our users run BigStitcher on our HPC cluster. I don't necessarily know how BigStitcher works or what it does, and I am also not too familiar with Spark. I was hoping you could give us a few pointers on how to run this in distributed mode under Slurm.

Here is how I currently run it on a single node:

#!/bin/bash
#
#SBATCH --job-name=bs-test # give your job a name
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
#SBATCH --cpus-per-task=1
#SBATCH --ntasks=8
#SBATCH --time=00:30:00
#SBATCH --mem=16GB

module purge
module load bigstitcher-spark/20231220
affine-fusion -x /software/apps/bigstitcher-spark/20231220/example/test/dataset.xml \
        -o ./test-spark.n5 \
        -d '/ch488/s0' \
        --UINT8 \
        --minIntensity 1 \
        --maxIntensity 254 \
        --channelId 0

How can I tell affine-fusion to distribute across multiple compute nodes (once I request multiple nodes from Slurm)?

Thanks!

trautmane commented 5 months ago

Hi Eugene,

In May 2022, for an I2K workshop, @StephanPreibisch talked for 90 minutes about BigStitcher and I talked for 20 minutes about BigStitcher-Spark and the spark-janelia script library. Depending on your interest and available time, that YouTube recording might be helpful. Stephan also has a bunch of other BigStitcher HowTo videos listed here.

If you'd prefer to go straight to code/scripts on GitHub, we run Spark on Janelia's LSF cluster using this script library. Last summer, I worked with folks at MDC Berlin to adapt those scripts to support runs on SGE/Univa. We don't currently have anything for Slurm, though the core concepts/ideas are likely similar.

If you are interested in adding support for Slurm to the spark-janelia scripts, I'm happy to help you integrate that - but I would need you to provide the core pieces since we don't use Slurm at Janelia and I don't have access to a Slurm cluster. Others (maybe @martinschorb at EMBL?) have also likely solved the problem of setting up a Spark cluster on top of a Slurm HPC cluster.

Sorry I don't have a direct solution for you,
Eric

martinschorb commented 5 months ago

Hi, we are running Spark on a Slurm cluster. There is no user authentication and no specific solution for avoiding networking conflicts when jobs end up on identical nodes. Also, we just realized that with the current version of Slurm we run into errors with the job configuration; this is currently under investigation with HackMD.
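
(As an aside, here is only a minimal sketch of one way such port conflicts could be reduced: Spark's standalone scripts honour SPARK_MASTER_PORT, SPARK_MASTER_WEBUI_PORT and SPARK_WORKER_PORT, so per-job ports could be derived from the Slurm job ID. The offset scheme below is just an illustration, and the MASTER_URL/MASTER_WEB lines in the script further down would then need to use these ports instead of the hardcoded 7077/8080.)

# illustrative only: derive per-job ports so two Spark masters/workers landing
# on the same node do not both try to bind the default 7077/8080 ports
export SPARK_MASTER_PORT=$(( 7077 + SLURM_JOB_ID % 1000 ))
export SPARK_MASTER_WEBUI_PORT=$(( 8080 + SLURM_JOB_ID % 1000 ))
export SPARK_WORKER_PORT=$(( 35000 + SLURM_JOB_ID % 1000 ))
export MASTER_URL="spark://$(hostname):$SPARK_MASTER_PORT"
export MASTER_WEB="http://$(hostname):$SPARK_MASTER_WEBUI_PORT"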

If you want to have a look, here is what the submission script looks like:

#!/bin/bash
#SBATCH --job-name=spark-master      # create a short name for your job
#SBATCH --time=00:10:00         # total run time limit (HH:MM:SS)
#SBATCH -o sparkslurm-%j.out
#SBATCH -e sparkslurm-%j.err
# --- Master resources ---
#SBATCH --mem-per-cpu=2G
#SBATCH --cpus-per-task=1
#SBATCH --ntasks-per-node=1
# --- Worker resources ---
#SBATCH hetjob
#SBATCH --job-name spark-worker
#SBATCH --nodes=4
#SBATCH --mem-per-cpu=4G
#SBATCH --cpus-per-task=8
#SBATCH --ntasks-per-node=1

# import Parameters

module load Java

export DISPLAY=""
export LOGDIR=$(pwd)

# this is where spark is installed.
# modify the directories if you call the spark executables from a module
export SPARK_HOME=$YOURSPARKDIR

JOB="$SLURM_JOB_NAME-$SLURM_JOB_ID"
export MASTER_URL="spark://$(hostname):7077"
export MASTER_HOST=$(hostname)
export MASTER_IP=$(host $MASTER_HOST | sed 's/^.*address //')
export MASTER_WEB="http://$MASTER_IP:8080"

mkdir -p $LOGDIR/$JOB

# SET UP ENV for the spark run

echo $MASTER_IP > $LOGDIR/$JOB/master

export SPARK_LOG_DIR="$LOGDIR/$JOB/logs"
export SPARK_WORKER_DIR="$LOGDIR/$JOB/worker"
export SPARK_LOCAL_DIRS="$TMPDIR/$JOB"

export SPARK_WORKER_CORES=$SLURM_CPUS_PER_TASK_HET_GROUP_1

export TOTAL_CORES=$(($SPARK_WORKER_CORES * $SLURM_JOB_NUM_NODES_HET_GROUP_1))

# export SPARK_DRIVER_MEM=$((4 * 1024))

export SPARK_MEM=$(( $SLURM_MEM_PER_CPU_HET_GROUP_1 * $SLURM_CPUS_PER_TASK_HET_GROUP_1))m
export SPARK_DAEMON_MEMORY=$SPARK_MEM
export SPARK_WORKER_MEMORY=$SPARK_MEM

# MAIN CALLS
#======================================

# start MASTER

$SPARK_HOME/sbin/start-master.sh

# wait for master to start

wait=1

while [ $wait -gt 0 ]
  do
    { # try
      curl "$MASTER_WEB" > /dev/null && wait=0 && echo "Found spark master, will submit tasks."
    } || { # catch
      sleep 10 && echo "Waiting for spark master to become available."
    }
  done

#echo Starting slaves
srun --het-group=1 $SPARK_HOME/bin/spark-class org.apache.spark.deploy.worker.Worker $MASTER_URL -d $SPARK_WORKER_DIR &

# again, sleep a tiny little bit
sleep 5s

# this is the general call we use together with command line parameters.

# sparksubmitcall="$SPARK_HOME/bin/spark-submit --master $MASTER_URL --driver-memory 2g --conf spark.default.parallelism=$TOTAL_CORES --conf spark.executor.cores=$SPARK_WORKER_CORES --executor-memory $SPARK_MEM --class $CLASS $JARFILE $PARAMS"
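
# As an illustration only (not part of the original script): for the
# affine-fusion call from the question above, $CLASS / $JARFILE would be the
# BigStitcher-Spark fat jar and its affine-fusion main class. The class name
# and jar path below are assumptions -- check them against your
# BigStitcher-Spark build:
#
# sparksubmitcall="$SPARK_HOME/bin/spark-submit --master $MASTER_URL \
#     --driver-memory 2g \
#     --conf spark.default.parallelism=$TOTAL_CORES \
#     --conf spark.executor.cores=$SPARK_WORKER_CORES \
#     --executor-memory $SPARK_MEM \
#     --class net.preibisch.bigstitcher.spark.AffineFusion \
#     /path/to/BigStitcher-Spark-fat.jar \
#     -x /software/apps/bigstitcher-spark/20231220/example/test/dataset.xml \
#     -o ./test-spark.n5 -d '/ch488/s0' --UINT8 \
#     --minIntensity 1 --maxIntensity 254 --channelId 0"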

# this is the spark example to compute Pi

sparksubmitcall="$SPARK_HOME/bin/run-example --master $MASTER_URL --driver-memory 2g --conf spark.default.parallelism=$TOTAL_CORES --conf spark.executor.cores=$SPARK_WORKER_CORES --executor-memory $SPARK_MEM SparkPi"

echo $sparksubmitcall
$sparksubmitcall

# this keeps the master alive.
# You can also have the compute job write a file once it is done and exit the job when that file exists

sleep infinity
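
Assuming the script above is saved as, say, spark-hetjob.sbatch (the filename is just an example), the heterogeneous master+worker allocation is requested with a single batch submission, and the Spark application then runs inside that job against the in-job cluster:

sbatch spark-hetjob.sbatch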

winnubstj commented 4 months ago

Just wanted to shout out the excellent nextflow-spark repo, which allows you to start a Spark cluster on Slurm (or Kubernetes, AWS, anything supported by Nextflow).

StephanPreibisch commented 4 months ago

@trautmane also helped @bellonet at the MDC Berlin to set up Spark on their cluster ... not sure if she can help with some insights as well?

StephanPreibisch commented 4 months ago

I also started to write better documentation for BigStitcher-Spark at https://github.com/JaneliaSciComp/BigStitcher-Spark (which also links the YouTube video). It would be great if people could contribute small HowTos describing how they set it up on their respective clusters (https://github.com/JaneliaSciComp/BigStitcher-Spark#installcluster) to help other users ...

Eddymorphling commented 3 months ago

@StephanPreibisch I am in the same boat here, trying to set up BigStitcher-Spark on our LSF cluster. A really naive question to start: can I run the "Define Dataset" step in a way that distributes the workload across different nodes and directly resaves TIFF files into N5/HDF5, rather than using an XML+HDF5 pair that BigStitcher has already created? Or is BigStitcher-Spark only compatible with the subsequent steps (i.e. stitching, alignment, interest points and fusion)?