statgen / SLURM-examples


Add array job example #7

Closed schelcj closed 6 years ago

schelcj commented 6 years ago

Just some short examples of array job batch scripts.

pjvandehaar commented 6 years ago

The % limit for concurrency is great. That should probably also be in the main README.
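For readers who have not seen it, here is a minimal sketch of how the % limit appears in an array directive. The job name and the limit of 2 are illustrative assumptions, not values taken from the scripts in this pull request; the 10 tasks, 1 core and 2GB mirror the numbers mentioned later in this thread.

```shell
#!/bin/bash
#SBATCH --job-name=array-demo    # hypothetical name, for illustration only
#SBATCH --array=1-10%2           # tasks 1..10, at most 2 running at the same time
#SBATCH --cpus-per-task=1
#SBATCH --mem=2G

# SLURM sets SLURM_ARRAY_TASK_ID for each array task. The default of 1 is
# only here so the script can also be run by hand outside of SLURM.
task_id=${SLURM_ARRAY_TASK_ID:-1}
message="array task ${task_id} starting"
echo "$message"
```

Without the %2 suffix, SLURM may start as many of the ten tasks at once as the scheduler allows.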

Also, I like the idea of putting this in its own folder.

I don't understand this line in job-with-cmd-file.sh:

srun $(head -n $SLURM_ARRAY_TASK_ID cmds.txt | tail -n 1)

My understanding is: running sbatch job-with-cmd-file.sh will create 20 tasks, where the first 10 have 2GB RAM but use almost none of it, and the next 10 each get the default RAM and actually run the scripts from cmds.txt.

  1. I run sbatch job-with-cmd-file.sh.
  2. SLURM sends out ten tasks (probably to ten different nodes) which each get 2GB RAM and one cpu.
  3. Each of those ten tasks uses a few MB RAM and ~1% CPU to run srun. They stay alive, idling their CPUs, until the commands launched by srun terminate.
  4. Each of those ten srun commands starts another task with the default amount of RAM (probably on some other node).

Is that correct? Or, what am I missing?

schelcj commented 6 years ago

That will create 10 jobs, each with 1 core and 2GB of memory. Each array task is only ever going to run a single line from the cmds.txt file. The head call gives you every line up to and including the one numbered by the task id, then tail slices off the last of those lines. You can slice a single line out with sed as well, but I've had inconsistent results with complex commands in the cmds.txt file.
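The line selection described above can be tried outside of SLURM. This is a minimal sketch: pick_line, pick_line_sed and the temporary command file are hypothetical names made up for the demonstration, and echo commands stand in for real work.

```shell
#!/bin/bash
# Select the Nth line of a file the way the array task does:
# head keeps lines 1..N, tail keeps only the last of them.
pick_line() {
    # $1 = 1-based line number, $2 = command file
    head -n "$1" "$2" | tail -n 1
}

# The sed equivalent: print only line N.
pick_line_sed() {
    sed -n "${1}p" "$2"
}

# Stand-in for cmds.txt with three commands.
cmds_file=$(mktemp)
printf '%s\n' 'echo one' 'echo two' 'echo three' > "$cmds_file"

# Outside SLURM the variable is unset, so fall back to 2 for the demo.
task_id=${SLURM_ARRAY_TASK_ID:-2}
selected=$(pick_line "$task_id" "$cmds_file")
echo "task ${task_id} would run: ${selected}"
```

One difference worth knowing: if the task id is larger than the number of lines in the file, the head/tail version returns the last line again, while the sed version returns nothing. Keeping the array range equal to the line count avoids both cases.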

pjvandehaar commented 6 years ago

That makes sense, and I like that approach, but why use srun instead of sh -c or eval (as recommended by dtaliun here)?

pjvandehaar commented 6 years ago

Oh, it sounds like I was wrong about what srun does inside of an sbatched script. This seems fine.

schelcj commented 6 years ago

When invoked within a job allocation, srun inherits the options from sbatch or salloc and runs as a job step within SLURM. A batch script can invoke srun multiple times to create more job steps. This also has the benefit of letting you inspect the resource utilization of a running job step with the sstat command. In most cases there is no need to eval the command once it is wrapped in $() in bash.
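The "in most cases" caveat can be illustrated without SLURM. In this sketch count_args is a hypothetical stand-in for srun that only reports how many arguments it received; the point is that an unquoted $() is split on whitespace but is not re-parsed for quotes, whereas eval does re-parse.

```shell
#!/bin/bash
# Hypothetical stand-in for srun: print the number of arguments received.
count_args() { echo "$#"; }

simple='echo hello world'      # a plain command line from a cmds file
quoted='echo "hello world"'    # a command line that relies on quoting

# Unquoted $() is word-split on whitespace: echo, hello, world.
n_simple=$(count_args $(printf '%s\n' "$simple"))

# Quotes coming out of $() stay literal: echo, "hello, world".
n_quoted=$(count_args $(printf '%s\n' "$quoted"))

# eval re-parses the text, so the quotes group the words: echo, hello world.
n_eval=$(eval "count_args $quoted")

echo "simple=${n_simple} quoted=${n_quoted} eval=${n_eval}"
```

So plain commands with whitespace-separated arguments work as-is with $(), while lines that depend on quoting, pipes or redirection would need sh -c or eval to be interpreted the way they are written.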

pjvandehaar commented 6 years ago

This seems much better than https://github.com/statgen/SLURM-examples/blob/master/job-array-one-command-per-line.sh which I contributed.

schelcj commented 6 years ago

Yes, I prefer this method to the bash array method.