Closed JohnHadish closed 2 years ago
If users are running on a back queue on a university cluster, their jobs may be preempted by another user:
Slurm example:
slurmstepd: error: *** JOB 32391783 ON cn119 CANCELLED AT 2021-06-18T13:05:36 DUE TO PREEMPTION ***
This will cause their run to end. To avoid this, the user can modify the nextflow.config parameter maxRetries:
maxRetries = 3
This setting makes Nextflow attempt to re-run jobs that failed, including those that were preempted.
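A minimal sketch of how this could look in nextflow.config, assuming the pipeline does not already set these values elsewhere. In stock Nextflow, maxRetries is a process directive that only takes effect when errorStrategy is set to 'retry', so both are shown under the process scope. The value 3 comes from the example above; whether this pipeline expects maxRetries under a different scope is not confirmed here.

```groovy
// nextflow.config (sketch; the scope placement is an assumption about this pipeline)
process {
    // Resubmit a task when it fails, for example after a Slurm preemption
    errorStrategy = 'retry'
    // Stop resubmitting a given task after three retries
    maxRetries = 3
}
```

Note that retries apply to any failed task, not only preempted ones, so a task that fails for a real reason (bad input, out of memory) will also be resubmitted up to the limit before the run stops.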
This information can be added to the Troubleshooting section.
I think this is something we can add to the docs to help users out.
Fixed by recent PR.