Hi Jean-Charles,
The unexpected termination of `anat_sphere` only happens when you attempt to run the entire pipeline, most likely because `anat_sphere` runs out of the wall time you set with `--time=00:30:00`. You can try allocating sufficient time to both the CPU and GPU jobs.
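For example, a minimal sketch of the relevant `#SBATCH` directives (the partition name, memory, and the new time limit are assumptions; pick values that fit your cluster):

```bash
#!/bin/bash
#SBATCH --job-name=deepprep_gpu
#SBATCH --partition=gpu        # assumed GPU partition name
#SBATCH --gres=gpu:1           # request one GPU
#SBATCH --time=04:00:00        # raised from 00:30:00 so anat_sphere can finish
#SBATCH --mem=32G              # illustrative memory request

# ... launch the pipeline as usual ...
```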
Good luck :)
Hi Irene,
Actually, the problem was due to an inconsistency in job assignment between the SLURM script and the Nextflow configuration file: Nextflow was submitting jobs to a partition that may not have had sufficient resources to run the command, as you suggested. Setting the correct partition in the `queue` parameter of the configuration file resolves the issue.
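For anyone hitting the same problem, here is an illustrative excerpt of what that looks like (the partition names are placeholders, not the actual contents of deepprep.slurm.gpu.config):

```groovy
process {
    executor = 'slurm'
    queue    = 'cpu'                     // default partition for CPU-only processes (placeholder)

    withName: 'anat_wf:anat_sphere' {
        queue          = 'gpu'           // partition that actually has the required resources
        clusterOptions = '--gres=gpu:1'  // extra options passed through to sbatch
    }
}
```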
Thanks a lot for your answer. Best, Jean-Charles
Dear pBFSLab team,
Thanks a lot for this new pipeline. I am trying to run it from the Singularity image on our HPC. Resource allocation works well for each process until `anat_sphere` (`mris_sphere`), which fails with the following error message:
```
ERROR ~ Error executing process > 'anat_wf:anat_sphere (sub-1001)'

Caused by:
  Process anat_wf:anat_sphere (sub-1001) terminated for an unknown reason -- Likely it has been terminated by the external system
```
The `sacct` report shows no sign of an excessive memory issue. I also tried different combinations of parameters in the sbatch script and on a single node, adjusting `--mem-per-cpu` or `--mem-per-gpu` (on a GPU partition), with the same error.
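A query of roughly this shape shows per-step memory and exit state (the job ID is a placeholder):

```bash
sacct -j 123456 --format=JobID,JobName,Partition,Elapsed,MaxRSS,State,ExitCode
```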
On the other hand, executing `mris_sphere` directly from the Singularity image produces the expected outputs (roughly as in the sketch below).
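For context, the direct call that works looks roughly like this (the image name and surface paths are assumptions):

```bash
singularity exec deepprep.sif mris_sphere \
    sub-1001/surf/lh.inflated sub-1001/surf/lh.sphere
```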
Do you have any idea how to fix this issue?
Best, Jean-Charles
The script I launch from the login node:

```bash
sbatch deepprep_gpu.sh
```
The config file: `deepprep.slurm.gpu.config`
Here is the `.nextflow.log` output