bioinfologics / satsuma2

FFT cross-correlation based synteny aligner, (re)designed to make full use of parallel computing

SatsumaSynteny2 wrapper killed on slurm #19

Closed: kushalsuryamohan closed this issue 4 years ago

kushalsuryamohan commented 5 years ago

Hello, I am a new user trying to generate synteny between two 1.5 Gb genomes. I installed Satsuma2 and ran the test script locally, but now I am trying to launch the actual synteny analysis on a SLURM cluster.

Here is my satsuma_run.sh script, where I uncommented the section that launches Satsuma on SLURM:


# Script for starting Satsuma jobs on different job submission environments
# One section only should be active, i.e. not commented out

# Usage: satsuma_run.sh <current_path> <kmatch_cmd> <ncpus> <mem> <job_id> <run_synchronously>
# mem should be in Gb, i.e. 100Gb = 100

# no submission system, processes are run locally either synchronously or asynchronously
#if [ "$6" -eq 1 ]; then
#  eval "$2"
#else
#  eval "$2" &
#fi

##############################################################################################################
## For the sections below you will need to change the queue name (QueueName) to one existing on your system ##
##############################################################################################################

# qsub (PBS systems)
#echo "cd $1; $2" | qsub -V -qQueueName -l ncpus=$3,mem=$4G -N $5

# bsub (LSF systems)
#mem=`expr $4 + 1000`
#bsub -o ${5}.log -J $5 -n $3 -q QueueName -R "rusage[mem=$mem]" "$2"

# SLURM systems
echo "#!/bin/sh" > slurm_tmp.sh
echo "srun $2" >> slurm_tmp.sh
sbatch -p medium -c $3 -J $5 -o ${5}.log --mem ${4}G slurm_tmp.sh
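
If I'm reading the usage line correctly, SatsumaSynteny2 calls this wrapper with positional arguments, so a KMatch submission would look something like this (the KMatch command string is a placeholder, not the exact command SatsumaSynteny2 builds):

# hypothetical call: job kmatch_1, 2 CPUs, 100Gb of memory, run asynchronously ($6 = 0)
./satsuma_run.sh /path/to/workdir "KMatch <kmatch arguments>" 2 100 kmatch_1 0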

And here is my SatsumaSynteny2 command:

./SatsumaSynteny2 -q ../../synteny_snakes/Naja_v5_chromosomes.fasta -t ../../synteny_snakes/Cro_vir_chromosomes.fasta -o ../../wrapper_outdir/

I am not sure what the problem is, but here is the output I get after submitting the above command:

SATSUMA: Welcome to SatsumaSynteny! Current date and time: 2019/08/23 16:18:33
Path for Satsuma2: '/home/suryamok/satsuma2/bin'
Executing ./SatsumaSynteny2
Killed

I would really appreciate it if I could get this resolved. Attached are the compute resource details of the cluster I have access to. Please advise if my parameters are incorrect. Many thanks!

jonwright99 commented 5 years ago

Hi Kushal,

It looks like your job is being killed due to lack of memory. You haven't said how many threads or how much RAM you are requesting, but for genomes of this size you should ask for around 500Gb on your 'himem' queue.

The master job will chunk your query and target files, then run KMatch to find the exact matches used as seeds. KMatch jobs are spawned as separate SLURM jobs, each requesting 2 threads and 100Gb of RAM by default; you might need to increase this for large genomes using the -km_mem parameter. After KMatch, slaves are spawned as SLURM jobs to compare the chunks, and by default only 1 slave is spawned. You could try running 10 slaves, each requesting 8 threads and 500Gb of memory, i.e. -slaves 10 -threads 8 -sl_mem 500.

The best parameter choice depends on your hardware. Our HPC is configured in blocks, each with 8 cores and 360Gb of local memory, so we tend to set the parameters to run one slave per block (i.e. 8 threads and 360Gb), but this is for optimum speed; other settings will work as long as you don't run out of memory.
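
For example, building on your original command (the -km_mem value below is only illustrative; the other numbers are the settings suggested above):

./SatsumaSynteny2 -q ../../synteny_snakes/Naja_v5_chromosomes.fasta \
                  -t ../../synteny_snakes/Cro_vir_chromosomes.fasta \
                  -o ../../wrapper_outdir/ \
                  -slaves 10 -threads 8 -sl_mem 500 -km_mem 200
# -km_mem 200 is just an example value; raise it if the KMatch jobs are killed

I hope this helps.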

Best wishes, Jon