nipreps / niworkflows

Common workflows for MRI (anatomical, functional, diffusion, etc)
https://www.nipreps.org/niworkflows
Apache License 2.0

Pre-run freesurfer folder needs to be writable? #794

Open ins0mniac2 opened 1 year ago

ins0mniac2 commented 1 year ago

We are trying to re-run a dataset previously processed with fmriprep 20.2.0, now with 20.2.7, reusing the freesurfer runs from 20.2.0. The 20.2.0 runs live in a read-only folder. fmriprep does recognize the pre-existing freesurfer run, but still throws errors such as the one below, produces no new template-space outputs, and produces no anatomical reports. When I copy the freesurfer folder to a writable location, it works as expected. So it doesn't actually modify or create any files inside the pre-existing run, yet it still refuses to work with the read-only folder.

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/pathlib.py", line 1241, in mkdir
    self._accessor.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/DATA/foo/foobar'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/interfaces/base/core.py", line 398, in run
    runtime = self._run_interface(runtime)
  File "/usr/local/miniconda/lib/python3.7/site-packages/niworkflows/interfaces/bids.py", line 837, in _run_interface
    subjects_dir.mkdir(parents=True, exist_ok=True)
  File "/usr/local/miniconda/lib/python3.7/pathlib.py", line 1245, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/usr/local/miniconda/lib/python3.7/pathlib.py", line 1245, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/usr/local/miniconda/lib/python3.7/pathlib.py", line 1245, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  [Previous line repeated 2 more times]
  File "/usr/local/miniconda/lib/python3.7/pathlib.py", line 1241, in mkdir
    self._accessor.mkdir(self, mode)
OSError: [Errno 30] Read-only file system: '/DATA/foo'
effigies commented 1 year ago

Hm. I'm surprised that exist_ok doesn't cover this case. The failure is clearly in niworkflows, so I'll transfer over there...

ins0mniac2 commented 1 year ago

I have since done a couple more experiments.

  1. When the top level freesurfer folder is writable, but the sub-XXXX, fsaverage and fsaverage5 (we want results in those spaces) subfolders are read-only and local copies (not symlinks), it works fine.
  2. However, when the top level freesurfer folder is writable, but the sub-XXXX, fsaverage and fsaverage5 (we want results in those spaces) subfolders are symlinks to read-only datalad repository locations, it fails, though with a different error (below), where it appears to be trying to re-run autorecon:

============= Node: fmriprep_wf.single_subject_XXXX_wf.anat_preproc_wf.surface_recon_wf.autorecon1

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/plugins/multiproc.py", line 67, in run_node
    result["result"] = node.run(updatehash=updatehash)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 521, in run
    result = self._run_interface(execute=True)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 639, in _run_interface
    return self._run_command(execute)
  File "/usr/local/miniconda/lib/python3.7/site-packages/nipype/pipeline/engine/nodes.py", line 751, in _run_command
    f"Exception raised while executing Node {self.name}.\n\n{result.runtime.traceback}"
nipype.pipeline.engine.nodes.NodeExecutionError: Exception raised while executing Node autorecon1.

RuntimeError: subprocess exited with code 1.
effigies commented 1 year ago

Ah. I have had issues with freesurfer and datalad. I've found datalad unlock generally makes everything better.

ins0mniac2 commented 1 year ago

We could clone the existing run, but the whole point of using links to the (read-only) datalad repository is to avoid duplicating all that disk space while we process the large dataset. If we run datalad unlock on a clone, that means yet another copy :-(

effigies commented 1 year ago

Datalad unlock should not make copies. It performs git-link trickery. Have you found that doing so increases your disk usage?

ins0mniac2 commented 1 year ago

I have been told that unlock not only replaces the symlinks with real files, but the copies in the git annex remain, doubling disk usage. I just tested, and indeed unlock takes the disk usage for my test freesurfer folder from 452M to 927M.

effigies commented 1 year ago

Interesting. When I unlock, the files become git links. I haven't tested whether that makes it appear doubled to du. What about df? Do you see a 450MB difference there?

ins0mniac2 commented 1 year ago

Yes, it does seem like df sees the difference as well. Below, there is a ~485000K difference.

$ df .
Filesystem       1K-blocks        Used   Available Use% Mounted on
/dev/sdd       38910016440 23133702732 13823016508  63% /DATA
$ datalad unlock .
unlock(ok): label/BA_exvivo.ctab (file)
unlock(ok): label/BA_exvivo.thresh.ctab (file)
unlock(ok): label/aparc.annot.DKTatlas.ctab (file)
unlock(ok): label/aparc.annot.a2009s.ctab (file)
unlock(ok): label/aparc.annot.ctab (file)
unlock(ok): label/lh.BA1_exvivo.label (file)
unlock(ok): label/lh.BA1_exvivo.thresh.label (file)
unlock(ok): label/lh.BA2_exvivo.label (file)
unlock(ok): label/lh.BA2_exvivo.thresh.label (file)
unlock(ok): label/lh.BA3a_exvivo.label (file)
  [316 similar messages have been suppressed; disable with datalad.ui.suppress-similar-results=off]
action summary:
  unlock (ok: 326)
$ df .
Filesystem       1K-blocks        Used   Available Use% Mounted on
/dev/sdd       38910016440 23134187776 13822531464  63% /DATA

effigies commented 1 year ago

Hmm. Okay. It looks like you can have it use hard links instead of making copies by setting the config option annex.thin=true. This is only advisable if you're comfortable with risking the FreeSurfer directory because you have another copy somewhere else, as it does open the possibility of corrupting the annexed files: writing through a hard link modifies the annexed copy in place.