Closed druvus closed 8 years ago
`PipelineRunFailed` is raised when `poll_jobs` of the current executor returns `JOB_FAILED`. `SlurmExecutorMixin.poll_jobs` parses the output of the SLURM command `sacct -j <job_id>` and fails if the reported status is simply `FAILED` (see here) or, when not running on SLURM, if the job exited (see here).
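The state-classification step can be sketched roughly like this (a minimal illustration, not doepipeline's actual code; the `classify_states` function and the `JOB_*` labels are my own names, while the state strings follow sacct's documented values):

```python
# Hypothetical sketch of turning sacct job states into a coarse status.
# The state names come from SLURM's sacct documentation; the function
# and return labels are illustrative, not doepipeline's real API.
FAILED_STATES = {"FAILED", "CANCELLED", "TIMEOUT", "NODE_FAIL"}

def classify_states(states):
    """Map the set of State strings reported by `sacct -j <job_id>`
    to a coarse pipeline-level status."""
    if states & FAILED_STATES:
        return "JOB_FAILED"
    if states and states <= {"COMPLETED"}:
        return "JOB_FINISHED"
    return "JOB_RUNNING"

print(classify_states({"COMPLETED", "FAILED"}))  # → JOB_FAILED
```

Note that a single failed step is enough to fail the whole job, even if other steps completed.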
In this case it is SLURM that reported the job as failed for some reason. Do you have any logs from the SLURM run that might help narrow down where and why it failed? You could also inspect the generated shell file that is executed at each step to check that it was created correctly; there should be a file named something like `<stepname>_exp_<experiment name/number>.sh` in your working directory, I think.
You are right, it is not a doepipeline issue. The script runs successfully, but SLURM thinks it is failing. I will close this and try to understand why it fails.
The reason for the problem was that the LINKS author used `die` / `exit 1` in the Perl code even when the script finished successfully. After updating the LINKS Perl code, the scaffolding test case runs nicely at UPPMAX.
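This failure mode can be reproduced without SLURM: any script that terminates with a nonzero exit status (as Perl's `die` does) is judged failed by its exit code, no matter what output it produced. A minimal Python stand-in for such a script:

```python
import subprocess
import sys

# Stand-in for a tool (like LINKS here) that prints its result but then
# terminates via die / exit 1 even on success. SLURM judges the job by
# the exit code alone, so sacct would report the job as FAILED.
script = 'print("scaffolding done"); raise SystemExit(1)'
proc = subprocess.run(
    [sys.executable, "-c", script],
    capture_output=True,
    text=True,
)
print(proc.stdout.strip())  # → scaffolding done (output looks successful)
print(proc.returncode)      # → 1 (what SLURM actually sees)
```

The fix is the same as in the LINKS case: make the script exit with status 0 on success.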
I have some problems getting my pipeline to work correctly using SLURM. The same pipeline works nicely using the local executor in serial mode. I am running at UPPMAX (/proj/nobackup/b2015353/scaffolding/) with those files.
The output indicates that the job failed, but it seems to have finished correctly.
I am not sure what I did wrong, so I would be grateful for advice on how to fix it.