poldracklab / tacc-openneuro

0 stars 1 forks source link

fmriprep vs23.0.0 fsaverage issues #62

Closed jbwexler closed 1 year ago

jbwexler commented 1 year ago

I haven't actually tested these datasets with a previous version so it's just a guess that it's related to the v23.0.0. I checked my scripts and I don't do anything weird anymore with the fsaverage directories (I used to pre-load them prior to running).

ds000030 is giving me the following error. 'fsaverage/label/lh.aparc.a2005s.annot' is indeed not present.

Node: fmriprep_23_0_wf.fsdir_run_20230324_115527_6073ab8c_cf13_4c3a_8397_f9a0b46f1290 Working directory: /scratch1/03201/jbwexler/work_dir/fmriprep/ds000030_sub-10249/fmriprep_23_0_wf/fsdir_run_20230324_115527_6073ab8c_cf13_4c3a_8397_f9a0b46f1290

Node inputs:

derivatives = /scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds000030-fmriprep freesurfer_home = /opt/freesurfer minimum_fs_version = 7.0.0 overwrite_fsaverage = False spaces = ['fsaverage5', 'fsaverage'] subjects_dir = /scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds000030-fmriprep/sourcedata/freesurfer

Traceback (most recent call last): File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/plugins/multiproc.py", line 344, in _send_procs_to_workers self.procs[jobid].run(updatehash=updatehash) File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 527, in run result = self._run_interface(execute=True) File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 645, in _run_interface return self._run_command(execute) File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 771, in _run_command raise NodeExecutionError(msg) nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node fsdir_run_20230324_115527_6073ab8c_cf13_4c3a_8397_f9a0b46f1290.

Traceback: Traceback (most recent call last): File "/opt/conda/lib/python3.9/site-packages/nipype/interfaces/base/core.py", line 398, in run runtime = self._run_interface(runtime) File "/opt/conda/lib/python3.9/site-packages/niworkflows/interfaces/bids.py", line 962, in _run_interface shutil.copytree(source, dest, copy_function=shutil.copy) File "/opt/conda/lib/python3.9/shutil.py", line 566, in copytree return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks, File "/opt/conda/lib/python3.9/shutil.py", line 522, in _copytree raise Error(errors) shutil.Error: [('/opt/freesurfer/subjects/fsaverage/label/lh.aparc.a2005s.annot', '/scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds000030-fmriprep/sourcedata/freesurfer/fsaverage/label/lh.aparc.a2005s.annot', "[Errno 2] No such file or directory: '/scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds000030-fmriprep/sourcedata/freesurfer/fsaverage/label/lh.aparc.a2005s.annot'")]

And ds002843 is giving me this error:

Node: fmriprep_23_0_wf.fsdir_run_20230324_121852_ecc545d1_9b63_4c9e_b1c4_d285e9679098 Working directory: /scratch1/03201/jbwexler/work_dir/fmriprep/ds002843_sub-dmp0010/fmriprep_23_0_wf/fsdir_run_20230324_121852_ecc545d1_9b63_4c9e_b1c4_d285e9679098

Node inputs:

derivatives = /scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds002843-fmriprep freesurfer_home = /opt/freesurfer minimum_fs_version = 7.0.0 overwrite_fsaverage = False spaces = ['fsaverage5', 'fsaverage'] subjects_dir = /scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds002843-fmriprep/sourcedata/freesurfer

Traceback (most recent call last): File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/plugins/multiproc.py", line 344, in _send_procs_to_workers self.procs[jobid].run(updatehash=updatehash) File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 527, in run result = self._run_interface(execute=True) File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 645, in _run_interface return self._run_command(execute) File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 771, in _run_command raise NodeExecutionError(msg) nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node fsdir_run_20230324_121852_ecc545d1_9b63_4c9e_b1c4_d285e9679098.

Traceback: Traceback (most recent call last): File "/opt/conda/lib/python3.9/site-packages/nipype/interfaces/base/core.py", line 398, in run runtime = self._run_interface(runtime) File "/opt/conda/lib/python3.9/site-packages/niworkflows/interfaces/bids.py", line 962, in _run_interface shutil.copytree(source, dest, copy_function=shutil.copy) File "/opt/conda/lib/python3.9/shutil.py", line 566, in copytree return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks, File "/opt/conda/lib/python3.9/shutil.py", line 522, in _copytree raise Error(errors) shutil.Error: [('/opt/freesurfer/subjects/fsaverage/mri.2mm', '/scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds002843-fmriprep/sourcedata/freesurfer/fsaverage/mri.2mm', "[Errno 17] File exists: '/scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds002843-fmriprep/sourcedata/freesurfer/fsaverage/mri.2mm'"), ('/opt/freesurfer/subjects/fsaverage/scripts', '/scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds002843-fmriprep/sourcedata/freesurfer/fsaverage/scripts', "[Errno 17] File exists: '/scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds002843-fmriprep/sourcedata/freesurfer/fsaverage/scripts'"), ('/opt/freesurfer/subjects/fsaverage/surf', '/scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds002843-fmriprep/sourcedata/freesurfer/fsaverage/surf', "[Errno 17] File exists: '/scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds002843-fmriprep/sourcedata/freesurfer/fsaverage/surf'"), ('/opt/freesurfer/subjects/fsaverage/xhemi', '/scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds002843-fmriprep/sourcedata/freesurfer/fsaverage/xhemi', "[Errno 17] File exists: '/scratch1/03201/jbwexler/openneuro_derivatives/derivatives/fmriprep/ds002843-fmriprep/sourcedata/freesurfer/fsaverage/xhemi'")]

effigies commented 1 year ago

When running locally, I have had to datalad unlock my FreeSurfer directories to prevent git annex being unhappy. I'm not sure if this is related.

liuzhenqi77 commented 1 year ago

Hi, I'm meeting a potentially similar issue using 23.0.2. I went back to older versions, 22.0.0 seems fine, but the error appears in 22.0.1, so I'm thinking the fsaverage folder remove & update was causing the issue? Also, it happened when I run 3 or 5 subjects in parallel with slurm (from log the node error happened within 3s), but did not happen when I run just one subject.

Maybe it's caused by multiple instances trying to modify the fsaverage folder at the same time? (Weirdly I did not met this issue when running 5 subjects in parallel (node ran within 3s) on a local server with SGE using 23.0.2, probably because local server has much better IO performance?)

nipreps/fmriprep/pull/2847 nipreps/niworkflows/pull/739

230425-22:11:58,449 nipype.workflow DEBUG:
     Allocating fmriprep_23_0_wf.fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a ID=0 (0.20GB, 1 threads). Free: 226.02GB, 63 threads.
230425-22:11:58,449 nipype.workflow DEBUG:
     [Node] Not cached "/wd/fmriprep_23_0_wf/fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a".
230425-22:11:58,449 nipype.workflow DEBUG:
     Checking hash "fmriprep_23_0_wf.fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a" locally: cached=False, updated=False.
230425-22:11:58,449 nipype.workflow DEBUG:
     Running node fmriprep_23_0_wf.fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a on master thread
230425-22:11:58,449 nipype.workflow INFO:
     [Node] Setting-up "fmriprep_23_0_wf.fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a" in "/wd/fmriprep_23_0_wf/fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a".
230425-22:11:58,450 nipype.workflow DEBUG:
     [Node] Not cached "/wd/fmriprep_23_0_wf/fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a".
230425-22:11:58,451 nipype.utils DEBUG:
     Removing contents of /wd/fmriprep_23_0_wf/fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a
230425-22:11:58,453 nipype.workflow DEBUG:
     [Node] Writing pre-exec report to "/wd/fmriprep_23_0_wf/fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a/_report/report.rst"
230425-22:11:58,487 nipype.workflow INFO:
     [Node] Executing "fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a" <niworkflows.interfaces.bids.BIDSFreeSurferDir>
230425-22:12:06,385 nipype.workflow INFO:
     [Node] Finished "fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a", elapsed time 7.620329s.
230425-22:12:06,385 nipype.workflow DEBUG:
     Saving results file: '/wd/fmriprep_23_0_wf/fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a/result_fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a.pklz'
230425-22:12:06,385 nipype.workflow WARNING:
     Storing result file without outputs
230425-22:12:06,401 nipype.workflow WARNING:
     [Node] Error on "fmriprep_23_0_wf.fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a" (/wd/fmriprep_23_0_wf/fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a)
230425-22:12:06,402 nipype.workflow DEBUG:
     Clearing 0 from queue
230425-22:12:06,403 nipype.utils DEBUG:
     Loading pkl: /wd/fmriprep_23_0_wf/fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a/result_fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a.pklz
230425-22:12:06,403 nipype.workflow ERROR:
     Node fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a failed to run on host nc30408.narval.calcul.quebec.
230425-22:12:06,404 nipype.workflow ERROR:
     Saving crash info to /out/logs/crash-20230425-221206-zqliu-fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a-61e465b5-0e0f-4e49-a8f7-355674cc43b4.txt
Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/plugins/multiproc.py", line 344, in _send_procs_to_workers
    self.procs[jobid].run(updatehash=updatehash)
  File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 527, in run
    result = self._run_interface(execute=True)
  File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 645, in _run_interface
    return self._run_command(execute)
  File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 771, in _run_command
    raise NodeExecutionError(msg)
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a.

Traceback:
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.9/site-packages/nipype/interfaces/base/core.py", line 397, in run
        runtime = self._run_interface(runtime)
      File "/opt/conda/lib/python3.9/site-packages/niworkflows/interfaces/bids.py", line 969, in _run_interface
        shutil.copytree(source, dest, copy_function=shutil.copy)
      File "/opt/conda/lib/python3.9/shutil.py", line 566, in copytree
        return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
      File "/opt/conda/lib/python3.9/shutil.py", line 522, in _copytree
        raise Error(errors)
    shutil.Error: [('/opt/freesurfer/subjects/fsaverage/label/lh.PALS_B12_Lobes.annot', '/out/sourcedata/freesurfer/fsaverage/label/lh.PALS_B12_Lobes.annot', "[Errno 2] No such file or directory: '/out/sourcedata/freesurfer/fsaverage/label/lh.PALS_B12_Lobes.annot'"), ('/opt/freesurfer/subjects/fsaverage/mri', '/out/sourcedata/freesurfer/fsaverage/mri', "[Errno 17] File exists: '/out/sourcedata/freesurfer/fsaverage/mri'"), ('/opt/freesurfer/subjects/fsaverage/surf', '/out/sourcedata/freesurfer/fsaverage/surf', "[Errno 17] File exists: '/out/sourcedata/freesurfer/fsaverage/surf'")]

230425-22:12:06,461 nipype.workflow CRITICAL:
     fMRIPrep failed: Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/plugins/multiproc.py", line 344, in _send_procs_to_workers
    self.procs[jobid].run(updatehash=updatehash)
  File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 527, in run
    result = self._run_interface(execute=True)
  File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 645, in _run_interface
    return self._run_command(execute)
  File "/opt/conda/lib/python3.9/site-packages/nipype/pipeline/engine/nodes.py", line 771, in _run_command
    raise NodeExecutionError(msg)
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node fsdir_run_20230425_212539_24cd804c_9cdf_4609_b269_e975ab353e4a.

Traceback:
    Traceback (most recent call last):
      File "/opt/conda/lib/python3.9/site-packages/nipype/interfaces/base/core.py", line 397, in run
        runtime = self._run_interface(runtime)
      File "/opt/conda/lib/python3.9/site-packages/niworkflows/interfaces/bids.py", line 969, in _run_interface
        shutil.copytree(source, dest, copy_function=shutil.copy)
      File "/opt/conda/lib/python3.9/shutil.py", line 566, in copytree
        return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
      File "/opt/conda/lib/python3.9/shutil.py", line 522, in _copytree
        raise Error(errors)
    shutil.Error: [('/opt/freesurfer/subjects/fsaverage/label/lh.PALS_B12_Lobes.annot', '/out/sourcedata/freesurfer/fsaverage/label/lh.PALS_B12_Lobes.annot', "[Errno 2] No such file or directory: '/out/sourcedata/freesurfer/fsaverage/label/lh.PALS_B12_Lobes.annot'"), ('/opt/freesurfer/subjects/fsaverage/mri', '/out/sourcedata/freesurfer/fsaverage/mri', "[Errno 17] File exists: '/out/sourcedata/freesurfer/fsaverage/mri'"), ('/opt/freesurfer/subjects/fsaverage/surf', '/out/sourcedata/freesurfer/fsaverage/surf', "[Errno 17] File exists: '/out/sourcedata/freesurfer/fsaverage/surf'")]
effigies commented 1 year ago

Also, it happened when I run 3 or 5 subjects in parallel with slurm (from log the node error happened within 3s), but did not happen when I run just one subject.

This will be the problem. I suppose we can try to add a lockfile to prevent it, but that's really chancy on networked filesystems.

jbwexler commented 1 year ago

@liuzhenqi77 If you haven't tried this already, a workaround that worked for me is to simply copy in fsaverage prior to running fmriprep.

liuzhenqi77 commented 1 year ago

Thanks for the update! Yes I'm doing something similar.

jbwexler commented 1 year ago

I'm going to close this issue since we have a reliable work around. I think @effigies plans to eventually provide a fix in fmriprep.