Open Heechberri opened 3 years ago
I'm looking at this and not seeing anything obviously wrong. I am a bit confused by some of these choices:
--use-plugin /plugin.yml --n_cpus '4' --nthreads 2 --omp-nthreads 4 --mem 20GB --low-mem \
--n_cpus and --nthreads are the same parameter. Looking at your output, --nthreads 2 won, so you can only use up to two cores, but --omp-nthreads means that multi-threaded jobs will claim four cores. My guess is that any job that actually tries to use 4 cores can't be scheduled, since you say you only have 2.
I would recommend not setting --n_cpus, --nthreads or --omp-nthreads. The default will be to use 5 cores and up to 4 per job.
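To illustrate the guess above, here is a minimal sketch of how a job that asks for more threads than the pool has can wait forever. This is not fMRIPrep's or Nipype's actual scheduler code; the function `simulate`, the job names, and the round-based loop are all hypothetical, made up only to show the arithmetic of "job wants 4, pool has 2".

```python
# Toy model only: NOT the real fMRIPrep/Nipype scheduler.
# All names here (simulate, job names) are hypothetical.

def simulate(jobs, total_threads, max_rounds=10):
    """Greedily start jobs whose thread request fits in the free pool.

    jobs: dict mapping job name -> threads the job claims
    total_threads: size of the thread pool (the --nthreads analogue)
    Returns (finished, stuck): names that ran, and names that never fit.
    """
    pending = dict(jobs)
    finished = []
    for _ in range(max_rounds):
        if not pending:
            break
        free = total_threads
        started = []
        for name, need in pending.items():
            if need <= free:
                free -= need
                started.append(name)
        if not started:
            # Nothing fits in the pool: the queue is stalled.
            break
        for name in started:
            finished.append(name)
            del pending[name]
    return finished, sorted(pending)


# A multi-threaded job claims 4 threads (the --omp-nthreads analogue).
jobs = {"single_threaded": 1, "multi_threaded": 4}

print(simulate(jobs, total_threads=2))  # pool of 2: the 4-thread job never fits
print(simulate(jobs, total_threads=5))  # pool of 5: both jobs run
```

With a pool of 2, the 4-thread job stays in the pending list no matter how many rounds pass, which would match a log that keeps repeating "cannot allocate job...". With a pool of 5 and at most 4 threads per job, everything fits.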
Thanks for the recommendations. I will stop the current run, change the inputs, and re-run accordingly. I will post an update if it finishes :)
@Heechberri any update on this issue? I am running into an error on the same step, and I suspect it has to do with resource allocation. Whereas most people seem to run fMRIPrep on a cluster, I'm running it on a lab server and having difficulty deciding where to limit resources in order to parallelize single-subject runs.
Hi all,
I have been trying to troubleshoot this problem for a week now, and would like some input from the experts.
I am running one subject as a test for an upcoming pipeline using the following commands:
and the process has been stalling at the step:
for some time now. I have re-run the Docker container a couple of times, both with fresh directories and resuming from the scratch directory. I have waited anywhere from overnight to 3 days, but it always seems to stall at the above-mentioned step. FreeSurfer completed in the expected time (according to recon-all.log in the FreeSurfer scripts directory), but the functional outputs are stalling at this step. Correct me if I am wrong, but from what I have read, I assume the remaining steps outside of FreeSurfer should complete within 12 hours, even with the limited resources that I am using (Docker allocation: CPU 5, mem 25GB, Swap: 3.5, Disk: 320GB).
I suspect that the problem has to do with resource allocation, so after reading through some of the posts here and on NeuroStars, I decided to add the following memory management flags in fMRIPrep:
fMRIPrep has still been stalling for the past 12 hours. Previously, I also tried running without the nthreads flag at 4 threads, and it stalled overnight as well. Is there anything else I can do?
My images are multiband and attached is the full terminal log with flag -vvvv.
In the terminal log file, I only copied in the first few instances of "cannot allocate job..." because the flag -vvvv prints out too many debugging messages. I have left the process running for more than 12 hours. It has been printing "cannot allocate job..." every few seconds since then.
Also, I do not have any BIDS errors.
Thank you!
terminal log.txt