This is most likely a memory error. How much memory does your Docker container have available?
Okay, it was 2 GB earlier and I have increased it to 28 GB. Should be fine now?
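(For what it's worth, a quick way to double-check what the Docker VM actually has available; this assumes a reasonably recent Docker CLI, and the format fields may differ across versions:)

# Total memory and CPUs visible to the Docker daemon/VM
docker info --format 'CPUs: {{.NCPU}}  Memory: {{.MemTotal}} bytes'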
When I try running for multiple subjects (8 subjects) it again crashes.
CORRECTING DEFECT 0 (vertices=65900, convex hull=5139, v0=302)
error in the retessellation
normal vector of length zero at vertex 52286 with 0 faces
vertex 52286 has 0 face
XL defect detected...
No such file or directory
Linux aa6292a477f7 4.9.87-linuxkit-aufs #1 SMP Wed Mar 14 15:12:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
recon-all -s sub-0025405 exited with ERRORS at Tue Jul 17 14:44:45 UTC 2018
For more details, see the log file /out/freesurfer/sub-0025405/scripts/recon-all-lh.log
To report a problem, see http://surfer.nmr.mgh.harvard.edu/fswiki/BugReporting
Standard error:
Return code: 1
Sentry is attempting to send 1 pending error messages
Waiting up to 10 seconds
Press Ctrl-C to quit
fMRIPrep: Please report errors to https://github.com/poldracklab/fmriprep/issues
recon-all-status-lh.log
@effigies, any clue? It runs fine for a single subject. I ran it with 2 subjects and it hasn't finished in 2 days. It's stuck at resume recon-all. I have allotted 14 CPUs, 28 GB memory and 2 GB swap to Docker.
Hi @isukrit! I usually see this happen if I'm trying to restart a failed fMRIPrep job, or if I'm using a previously created recon-all directory. Since fMRIPrep will reuse existing surfaces, if something went wrong in creating or saving these, then it will be stuck at this step.
I'd recommend checking those subjects' files in the freesurfer directory, and especially the generated logs listed in the error message: freesurfer/sub-0025405/scripts/recon-all-lh.log
You can, of course, just delete these directories and fMRIPrep will create them during its next run! With the processing capabilities you listed, it should be able to complete.
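(If it helps, a minimal sketch of that cleanup, assuming the default layout where FreeSurfer results live under <output_dir>/freesurfer on the host; the path below is illustrative, so substitute your actual output directory:)

# /out/freesurfer inside the container maps to <output_dir>/freesurfer on the host.
# Removing the partially-built subject makes fMRIPrep regenerate the surfaces on the next run.
rm -rf /path/to/output_dir/freesurfer/sub-0025405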
Okay, now I have cleaned all the out directories and used the scratch option. I have run it with all the subjects using the low-memory option:
sudo fmriprep-docker --fs-license-file /Volumes/Storage/Students/Sukrit/LMU_Data/Freesurfer_license/license.txt /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_BIDS_3 /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_3_processed participant -w /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_3_inter_res --debug --low-mem
Let us see what happens.
Hi @isukrit, how has this gone for you? Also, just a note that --debug is a developer flag that includes faster, less accurate registration. If you just want more output, use -v (for verbose).
Thanks for chiming in @emdupre!
It ran for 3-4 days. Now it's frozen at the following output:
180727-23:08:29,368 nipype.interface INFO:
resume recon-all : recon-all -autorecon-hemi lh -noparcstats -nocortparc2 -noparcstats2 -nocortparc3 -noparcstats3 -nopctsurfcon -nohyporelabel -noaparc2aseg -noapas2aseg -nosegstats -nowmparc -nobalabels -openmp 8 -subjid sub-0025425 -sd /out/freesurfer
It has been almost 5 days since it's been like this. What log files can I attach to help you?
It's possible that there was a crash that failed to register. What version of fmriprep are you using?
The relevant log files would be: /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_3_processed participant/freesurfer/sub-0025425/scripts/recon-all*.log
recon-all-lh.log recon-all-status-lh.log recon-all-status.log recon-all.log
fmriprep version is 1.1.2
I think the issue is my hard disk is running out of space. Is there a way I can save the files produced by Docker on an external disk (which I have attached to my server)?
You can mount any directory into Docker, so you should be able to use the external hard drive as your output (or scratch) directories.
Depending on the speed of the bus (USB3 should get pretty close to internal bus speeds), you may get a performance penalty for using the external hard drive, and this will be more of an issue in the scratch directory than the output directory.
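(A minimal sketch of what that could look like, assuming a hypothetical external volume mounted at /Volumes/ExternalHD; fmriprep-docker bind-mounts the host paths you pass it, so pointing the output and work directories at the external drive is enough:)

# Output and work directories on the external drive (paths illustrative)
sudo fmriprep-docker --fs-license-file /Volumes/Storage/Students/Sukrit/LMU_Data/Freesurfer_license/license.txt \
    /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_BIDS_3 \
    /Volumes/ExternalHD/LMU_3_processed \
    participant \
    -w /Volumes/ExternalHD/LMU_3_work \
    -v --low-mem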
Okay, what I did was clear up space on my hard drive and run fmriprep again. However, I am again stuck with a subject for which processing has been stuck for the past 4 days. Here is the output on the terminal:
180806-10:51:49,939 nipype.workflow INFO:
[Node] Setting-up "_autorecon_surfs1" in "/scratch/fmriprep_wf/single_subject_0025425_wf/anat_preproc_wf/surface_recon_wf/autorecon_resume_wf/autorecon_surfs/mapflow/_autorecon_surfs1".
180806-10:51:50,47 nipype.interface INFO:
resume recon-all : recon-all -autorecon-hemi rh -noparcstats -nocortparc2 -noparcstats2 -nocortparc3 -noparcstats3 -nopctsurfcon -nohyporelabel -noaparc2aseg -noapas2aseg -nosegstats -nowmparc -nobalabels -openmp 8 -subjid sub-0025425 -sd /out/freesurfer
180806-10:51:50,52 nipype.workflow INFO:
[Node] Running "_autorecon_surfs1" ("nipype.interfaces.freesurfer.preprocess.ReconAll"), a CommandLine Interface with command:
recon-all -autorecon-hemi rh -noparcstats -nocortparc2 -noparcstats2 -nocortparc3 -noparcstats3 -nopctsurfcon -nohyporelabel -noaparc2aseg -noapas2aseg -nosegstats -nowmparc -nobalabels -openmp 8 -subjid sub-0025425 -sd /out/freesurfer
180806-10:51:50,83 nipype.interface INFO:
resume recon-all : recon-all -autorecon-hemi rh -noparcstats -nocortparc2 -noparcstats2 -nocortparc3 -noparcstats3 -nopctsurfcon -nohyporelabel -noaparc2aseg -noapas2aseg -nosegstats -nowmparc -nobalabels -openmp 8 -subjid sub-0025425 -sd /out/freesurfer
So, I have been running fmriprep for around 30 subjects for around a month now, but I am still not done with the preprocessing. However, when I preprocess each subject separately, I am done in less than 4 hours. Do you suggest running fmriprep for each subject separately? Will the results be as good as running all of them together?
We definitely suggest running subjects separately. If you want to optimize (and you have a good number of cpu cores and a good amount of RAM), then check out these stats https://neurostars.org/t/how-much-ram-cpus-is-reasonable-to-run-pipelines-like-fmriprep/1086/5?u=oesteban
It seems like fMRIPrep scales well up to 4 participants per run. There are no tests beyond that, but I'd say that anything more will in fact be slower than running sets of 4 in parallel.
What are your settings? Running on a cluster or PC? CPUs, RAM...?
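(For reference, if you end up launching one participant at a time, a rough sketch of pinning the per-run resources; the --n_cpus, --omp-nthreads and --mem_mb flags are what I'd expect in fmriprep around 1.1.x, so please confirm them with fmriprep --help for your version:)

# One subject per invocation, with explicit CPU and memory caps
sudo fmriprep-docker --fs-license-file /Volumes/Storage/Students/Sukrit/LMU_Data/Freesurfer_license/license.txt \
    /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_BIDS_3 \
    /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_3_processed \
    participant --participant_label sub-0025405 \
    --n_cpus 8 --omp-nthreads 8 --mem_mb 24000 \
    -w /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_3_inter_res --low-mem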
Thanks @oesteban. In Docker, I have 24 CPUs, 28 GB Memory, 2 GB Swap.
If I have to run it for each subject, how do I do it? Should I specify a participant label and run it again for each subject? I had raised this before as well: https://github.com/poldracklab/fmriprep/issues/1196
from subprocess import check_output, CalledProcessError
from tempfile import TemporaryFile
import os
import json
import shutil
import numpy as np

dataset = '3'
directory = './LMU_BIDS_' + dataset

for subject in os.listdir(directory):
    if os.path.isdir(os.path.join(directory, subject)):
        os.makedirs(os.path.dirname('LMU_' + dataset + '_processed/' + subject + '/'), exist_ok=True)
        os.makedirs(os.path.dirname('LMU_' + dataset + '_inter_res/' + subject + '/'), exist_ok=True)
        command = 'sudo fmriprep-docker --fs-license-file /Volumes/Storage/Students/Sukrit/LMU_Data/Freesurfer_license/license.txt /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_BIDS_' + dataset + ' /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_' + dataset + '_processed/' \
            + str(subject) + ' --participant_label ' + str(subject) + ' -w /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_' + dataset + '_inter_res/' + str(subject) + ' -v --low-mem'
        print(command)
        with TemporaryFile() as t:
            try:
                out = check_output(command, stderr=t, shell=True)
                status = 0
                print('******************************** Done with subject:', subject, ' ************************************************')
                print(out)
            except CalledProcessError as e:
                t.seek(0)
                print('****************************************************************************************************************')
                print('********************************** Error for subject: ', subject, ' is: ', e.returncode, t.read(), '************************************************')
Would a script like this work?
I want to process each subject separately, one by one.
I would go for something in bash, to avoid one more layer of python subprocesses.
#!/bin/bash
for sub in $(find ./LMU_BIDS_/3/ -maxdepth 1 -name 'sub-*' -type d ); do
    part_label=$( basename $sub )
    # I would configure docker to be run as a regular user
    sudo fmriprep-docker --fs-license-file /Volumes/Storage/Students/Sukrit/LMU_Data/Freesurfer_license/license.txt /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_BIDS_/3 /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_BIDS_/3/derivatives/fmriprep_1.1.4 participant --participant_label ${part_label} -w /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_3/_work/${part_label} -v --low-mem
done
Some good practices and fixes over the code you suggested: remove the -w working directory when you have everything working and you will not need the intermediate results anymore; otherwise you'll need to clean it up frequently.
Okay, I will look for something in bash.
The key difference in our understanding is that I have created a separate directory for every subject for fmriprep's output. This is because fmriprep is not able to run in the same directory once I ran it for a single subject. That's why I opened this issue.
However, I will answer your queries. I wasn't using the participant label mistakenly; I purposely placed the output of every run in a separate folder given by the participant label. I will add the 'participant' keyword, though.
I used the -w because then I don't have to start the whole process again if it crashes midway. Is that right?
I wasn't using the participant label mistakenly, I purposely placed the output of every run in a separate folder given by participant label. I will add the 'participant' key word, though.
As with all BIDS Apps, fmriprep takes three mandatory parameters:
fmriprep <data_dir> <output_dir> <analysis_level>
where <analysis_level> can be either participant, group, or both (participant group). For fmriprep there is no group level, so <analysis_level> can only be participant.
In the script you posted, you had + str(subject) + ' --participant_label ' + str(subject) +. The first str(subject) was incorrectly placed. You wanted to have + ' participant --participant_label ' + str(subject) + there, and I fixed that in my suggestion. You'll see in my suggestion that the --participant-label argument is set to each subject in the for loop.
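(As a rough illustration of the corrected shape, with abbreviated, hypothetical /path/to/... paths standing in for the real ones:)

# Output directory, then the literal 'participant' analysis level, then the subject label
sudo fmriprep-docker --fs-license-file /path/to/license.txt \
    /path/to/LMU_BIDS_3 /path/to/LMU_3_processed \
    participant --participant_label sub-0025402 \
    -w /path/to/work -v --low-mem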
I used the -w because then I don't have to start the whole process again if it crashes midway. Is that right?
That's correct, and I suggested a command line keeping this.
I have edited my previous comment. Please have a look. :)
The first str(subject) was incorrectly placed.
It was actually a part of the output folder path. However, I did miss the participant label. Let me edit this. I also don't want my outputs in the input directory and have accordingly edited the paths. I know it shouldn't be a problem, but I just want to be sure, since I have lost a lot of time on this already. Here is the final bash script I prepared to run this analysis:
#!/bin/bash
for sub in $(find ./LMU_BIDS_3/ -maxdepth 1 -name 'sub-*' -type d ); do
    part_label=$( basename $sub )
    echo " ************************* Start Processing Subject: "${part_label} " ************************* "
    mkdir -p ./derivatives/fmriprep_1.1.4/${part_label}
    mkdir -p ./_work/${part_label}
    sudo fmriprep-docker --fs-license-file /Volumes/Storage/Students/Sukrit/LMU_Data/Freesurfer_license/license.txt /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_BIDS_3 /Volumes/Storage/Students/Sukrit/LMU_Data/derivatives/fmriprep_1.1.4/${part_label} participant --participant_label ${part_label} -w /Volumes/Storage/Students/Sukrit/LMU_Data/_work/${part_label} -v --low-mem
    echo " %%%%%%%%%%%%%%%%%%%%%%%%% End Processing Subject: "${part_label} " %%%%%%%%%%%%%%%%%%%%%%%%%%%%%% "
done
I think those mkdir -p are not strictly necessary, but I guess they don't hurt.
If you want to optimize your time, I'd first echo the command lines without executing them. Check that they all look fine, and run one manually. Check whether it works, and then trigger all of them.
Make sure you have lots of space in your -w directory.
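(One way to do that echo-first pass over the script above; with DRYRUN=echo each fmriprep-docker call is only printed, and once the printed commands look right you can set DRYRUN to empty and rerun:)

#!/bin/bash
# Dry-run switch: prints the commands instead of executing them
DRYRUN=echo
for sub in $(find ./LMU_BIDS_3/ -maxdepth 1 -name 'sub-*' -type d ); do
    part_label=$( basename $sub )
    $DRYRUN sudo fmriprep-docker --fs-license-file /Volumes/Storage/Students/Sukrit/LMU_Data/Freesurfer_license/license.txt /Volumes/Storage/Students/Sukrit/LMU_Data/LMU_BIDS_3 /Volumes/Storage/Students/Sukrit/LMU_Data/derivatives/fmriprep_1.1.4/${part_label} participant --participant_label ${part_label} -w /Volumes/Storage/Students/Sukrit/LMU_Data/_work/${part_label} -v --low-mem
done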
Running! Will keep you posted.
I think this issue can be deemed fixed. If you encounter any new problems, please open a new one. If you run into any of the issues discussed here, feel free to reopen whenever you need.
Glad it worked out!
Error for single subject again. The terminal output looks like this:
ERROR: could not open stats/wmparc.stats for writing
Errno: No such file or directory
Linux b2d8a0016bdf 4.9.87-linuxkit-aufs #1 SMP Wed Mar 14 15:12:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
recon-all -s sub-0025402 exited with ERRORS at Tue Aug 14 12:10:33 UTC 2018
For more details, see the log file /out/freesurfer/sub-0025402/scripts/recon-all.log
To report a problem, see http://surfer.nmr.mgh.harvard.edu/fswiki/BugReporting
Standard error:
Return code: 1
180814-12:10:34,825 nipype.workflow INFO:
[MultiProc] Running 1 tasks, and 0 jobs ready. Free memory (GB): 19.69/24.69, Free processors: 16/24.
Currently running:
* _autorecon31
180814-12:19:09,560 nipype.workflow INFO:
[Node] Finished "_autorecon31".
180814-12:19:11,401 nipype.workflow INFO:
[Job 301] Completed (_autorecon31).
180814-12:19:11,403 nipype.workflow INFO:
[MultiProc] Running 0 tasks, and 0 jobs ready. Free memory (GB): 24.69/24.69, Free processors: 24/24.
180814-12:19:13,409 nipype.workflow INFO:
***********************************
180814-12:19:13,409 nipype.workflow ERROR:
could not run node: fmriprep_wf.single_subject_0025402_wf.anat_preproc_wf.surface_recon_wf.autorecon_resume_wf.autorecon3
180814-12:19:13,409 nipype.workflow INFO:
crashfile: /out/fmriprep/sub-0025402/log/20180814-060511_ad2afafe-7835-4725-9120-9bb44cbae7fa/crash-20180814-121034-root-_autorecon30-dd73a49d-7f2c-405d-853c-370b294d6353.txt
180814-12:19:13,411 nipype.workflow INFO:
***********************************
Errors occurred while generating reports for participants: 0025402 (1).
Before this, the following errors appear in the terminal:
180814-12:10:33,887 nipype.workflow WARNING:
[Node] Error on "_autorecon30" (/scratch/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/surface_recon_wf/autorecon_resume_wf/autorecon3/mapflow/_autorecon30)
180814-12:10:34,809 nipype.workflow ERROR:
Node _autorecon30 failed to run on host b2d8a0016bdf.
180814-12:10:34,814 nipype.workflow ERROR:
Saving crash info to /out/fmriprep/sub-0025402/log/20180814-060511_ad2afafe-7835-4725-9120-9bb44cbae7fa/crash-20180814-121034-root-_autorecon30-dd73a49d-7f2c-405d-853c-370b294d6353.txt
Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/plugins/multiproc.py", line 69, in run_node
result['result'] = node.run(updatehash=updatehash)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 480, in run
result = self._run_interface(execute=True)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 564, in _run_interface
return self._run_command(execute)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 644, in _run_command
result = self._interface.run(cwd=outdir)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 521, in run
runtime = self._run_interface(runtime)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 1033, in _run_interface
self.raise_exception(runtime)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 970, in raise_exception
).format(**runtime.dictcopy()))
RuntimeError: Command:
recon-all -autorecon3 -hemi lh -openmp 8 -subjid sub-0025402 -sd /out/freesurfer -nosphere -nosurfreg -nojacobian_white -noavgcurv -nocortparc -nopial
Standard output:
Subject Stamp: freesurfer-Linux-centos6_x86_64-stable-pub-v6.0.1-f53a55a
Current Stamp: freesurfer-Linux-centos6_x86_64-stable-pub-v6.0.1-f53a55a
INFO: SUBJECTS_DIR is /out/freesurfer
Actual FREESURFER_HOME /opt/freesurfer
I have attached the log file (recon-all.log). What am I doing wrong?
For some reason, freesurfer is unable to write under /Volumes/Storage/Students/Sukrit/LMU_Data/derivatives/fmriprep_1.1.4/sub-0025402/freesurfer/sub-0025402/stats.
Can you please check on that folder?
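(A quick sanity check along those lines, using the path from the error; it just verifies that the directory exists, is writable, and that the volume still has free space:)

STATS_DIR=/Volumes/Storage/Students/Sukrit/LMU_Data/derivatives/fmriprep_1.1.4/sub-0025402/freesurfer/sub-0025402/stats
ls -ld "$STATS_DIR"                                              # exists? owner and permissions
touch "$STATS_DIR/.write_test" && rm "$STATS_DIR/.write_test"    # quick write test
df -h "$STATS_DIR"                                               # free space on that volume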
@oesteban, I updated fmriprep from 1.1.2 to 1.1.4 (latest version?) and ran it again. It ran successfully for the subject.
Hi, I had run the workflow for a single subject but ran into the following error. The command I had used was:
sudo fmriprep-docker --fs-license-file /some_dir/Freesurfer_license/license.txt /some_dir/LMU_BIDS_3 /some_dir/LMU_3_processed participant --participant_label sub-0025402
Error on "fmriprep_wf.single_subject_0025402_wf.anat_preproc_wf.anat_template_wf.t1_merge" (/root/src/fmriprep/work/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/anat_template_wf/t1_merge)
Traceback (most recent call last):
File "/usr/local/miniconda/bin/fmriprep", line 11, in <module>
sys.exit(main())
File "/usr/local/miniconda/lib/python3.6/site-packages/fmriprep/cli/run.py", line 316, in main
fmriprep_wf.run(plugin_settings)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/workflows.py", line 595, in run
runner.run(execgraph, updatehash=updatehash, config=self.config)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/plugins/base.py", line 162, in run
self._clean_queue(jobid, graph, result=result))
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/plugins/base.py", line 224, in _clean_queue
raise RuntimeError("".join(result['traceback']))
RuntimeError: Traceback (most recent call last):
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/plugins/multiproc.py", line 69, in run_node
result['result'] = node.run(updatehash=updatehash)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 480, in run
result = self._run_interface(execute=True)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 564, in _run_interface
return self._run_command(execute)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 644, in _run_command
result = self._interface.run(cwd=outdir)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/interfaces/freesurfer/base.py", line 275, in run
return super(FSCommandOpenMP, self).run(inputs)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/interfaces/freesurfer/base.py", line 144, in run
return super(FSCommand, self).run(inputs)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 521, in run
runtime = self._run_interface(runtime)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 1033, in _run_interface
self.raise_exception(runtime)
File "/usr/local/miniconda/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 970, in raise_exception
).format(**runtime.dictcopy()))
RuntimeError: Command:
mri_robust_template --satit --fixtp --mov /root/src/fmriprep/work/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/anat_template_wf/n4_correct/mapflow/_n4_correct0/sub-0025402_ses-01_T1w_ras_corrected.nii.gz /root/src/fmriprep/work/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/anat_template_wf/n4_correct/mapflow/_n4_correct1/sub-0025402_ses-02_T1w_ras_corrected.nii.gz --inittp 1 --iscale --noit --template sub-0025402_ses-01_T1w_ras_template.nii.gz --subsample 200 --lta /root/src/fmriprep/work/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/anat_template_wf/t1_merge/tp1.lta /root/src/fmriprep/work/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/anat_template_wf/t1_merge/tp2.lta
Standard output:
$Id: mri_robust_template.cpp,v 1.54 2016/05/05 21:17:08 mreuter Exp $
--satit: Will estimate SAT iteratively!
--fixtp: Will map everything to init TP!
--mov: Using /root/src/fmriprep/work/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/anat_template_wf/n4_correct/mapflow/_n4_correct0/sub-0025402_ses-01_T1w_ras_corrected.nii.gz as movable/source volume.
--mov: Using /root/src/fmriprep/work/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/anat_template_wf/n4_correct/mapflow/_n4_correct1/sub-0025402_ses-02_T1w_ras_corrected.nii.gz as movable/source volume.
Total: 2 input volumes
--inittp: Using TP 1 as target for initialization
--iscale: Enableing intensity scaling!
--noit: Will output only first template (no iterations)!
--template: Using sub-0025402_ses-01_T1w_ras_template.nii.gz as template output volume.
--subsample: Will subsample if size is larger than 200 on all axes!
--lta: Will output LTA transforms
Setting iscale ...
reading source '/root/src/fmriprep/work/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/anat_template_wf/n4_correct/mapflow/_n4_correct0/sub-0025402_ses-01_T1w_ras_corrected.nii.gz'...
converting source '/root/src/fmriprep/work/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/anat_template_wf/n4_correct/mapflow/_n4_correct0/sub-0025402_ses-01_T1w_ras_corrected.nii.gz' to bspline ...
MRItoBSpline degree 3
reading source '/root/src/fmriprep/work/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/anat_template_wf/n4_correct/mapflow/_n4_correct1/sub-0025402_ses-02_T1w_ras_corrected.nii.gz'...
converting source '/root/src/fmriprep/work/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/anat_template_wf/n4_correct/mapflow/_n4_correct1/sub-0025402_ses-02_T1w_ras_corrected.nii.gz' to bspline ...
MRItoBSpline degree 3
MultiRegistration::initializing Xforms (init 1 , maxres 0 , iterate 5 , epsit 0.01 ) :
[init] ========================= TP 2 to TP 1 ==============================
Register TP 2 ( /root/src/fmriprep/work/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/anat_template_wf/n4_correct/mapflow/_n4_correct1/sub-0025402_ses-02_T1w_ras_corrected.nii.gz ) to TP 1 ( /root/src/fmriprep/work/fmriprep_wf/single_subject_0025402_wf/anat_preproc_wf/anat_template_wf/n4_correct/mapflow/_n4_correct0/sub-0025402_ses-01_T1w_ras_corrected.nii.gz )
Standard error:
Killed
Return code: 137
Sentry is attempting to send 1 pending error messages
Waiting up to 10 seconds
Press Ctrl-C to quit
fMRIPrep: Please report errors to https://github.com/poldracklab/fmriprep/issues