hpcugent / csub

Generate a wrapper script around DMTCP and the job submission system to auto checkpoint certain jobs.
GNU General Public License v3.0

csub on SLURM cluster #15

Open boegel opened 5 years ago

boegel commented 5 years ago

Resubmit fails after the first sub-job when the job was submitted with csub -s script.sh:

Job resubmit succesful.
Job resubmit output: There was an error running the SLURM sbatch command.
The command was:
'/bin/sbatch -e /scratch/gent/400/vsc40000/chkpt/example.sh.20181215_191340.F9/example.sh.20181215_191340.F9.base.err -J example.sh.20181215_191340.F9 --dependency=afterok:7372787 -p -o /scratch/gent/400/vsc40000/chkpt/example.sh.20181215_191340.F9/example.sh.20181215_191340.F9.base.out /local/example.sh.20181215_191340.F9/checkpoint/base --chdir=/user/gent/400/vsc40000 --export=NONE --get-user-env=60L -o /tmp/example.sh.20181215_191340.F9/example.sh.20181215_191340.F9.o%A'
and the output was:
'sbatch: error: Batch script is empty!
'
end resubmit Sun Dec 16 05:04:17 CET 2018
EXITING BASE Sun Dec 16 05:04:17 CET 2018 7372787

boegel commented 5 years ago

The problem here is the bare -p in the sbatch command that gets executed by the qsub wrapper: no value follows it, so sbatch misparses the arguments after it and ends up treating the wrong file as the job script to submit... :-/
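
A quick way to catch this before the command reaches sbatch is to scan the generated argument list for a flag that is directly followed by another option (or by nothing at all). Minimal sketch below; flags_missing_value is a made-up helper, not part of csub or the qsub wrapper, and the command string is a shortened stand-in for the one in the error output, with placeholder paths.

```python
import shlex


def flags_missing_value(command, flags):
    """Return (index, flag) pairs where a flag from `flags` is followed
    by another option or by the end of the command.

    `flags` is supplied by the caller; this helper does not know which
    options sbatch itself accepts.
    """
    tokens = shlex.split(command)
    hits = []
    for i, tok in enumerate(tokens):
        if tok in flags:
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            if nxt is None or nxt.startswith("-"):
                hits.append((i, tok))
    return hits


# Shortened stand-in for the failing command (placeholder paths).
cmd = (
    "/bin/sbatch -e job.err -J job --dependency=afterok:7372787 "
    "-p -o job.out checkpoint/base --export=NONE"
)

print(flags_missing_value(cmd, {"-p"}))  # [(6, '-p')]
```

Running the same check on a command where -p is followed by a value returns an empty list, so it could serve as a guard in the wrapper before calling sbatch.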