Open ins0mniac2 opened 1 year ago
Hm. I'm surprised that exist_ok doesn't cover this case. The failure is clearly in niworkflows, so I'll transfer over there...
I have since done a couple more experiments.
============= Node: fmriprep_wf.single_subject_XXXX_wf.anat_preproc_wf.surface_recon_wf.autorecon1
Ah. I have had issues with freesurfer and datalad. I've found datalad unlock generally makes everything better.
We could clone the existing run, but the purpose of using links to the datalad repository (which is read-only) is to avoid duplicating all that disk space while we process the large dataset. If we datalad unlock on the clone, that means yet another copy :-(
Datalad unlock should not make copies. It performs git-link trickery. Have you found that doing so increases your disk usage?
I have been told that it not only unlinks the files but the files in git annex remain, doubling disk space. I just tested and indeed unlock takes the disk usage for my test freesurfer folder from 452M to 927M.
Interesting. When I unlock, the files become git links. I haven't tested whether that makes it appear doubled to du. What about df? Do you see a 450MB difference there?
Yes, it does seem like df sees the difference as well. Below, there is a ~485000K difference.
$ df .
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdd 38910016440 23133702732 13823016508 63% /DATA
$ datalad unlock .
unlock(ok): label/BA_exvivo.ctab (file)
unlock(ok): label/BA_exvivo.thresh.ctab (file)
unlock(ok): label/aparc.annot.DKTatlas.ctab (file)
unlock(ok): label/aparc.annot.a2009s.ctab (file)
unlock(ok): label/aparc.annot.ctab (file)
unlock(ok): label/lh.BA1_exvivo.label (file)
unlock(ok): label/lh.BA1_exvivo.thresh.label (file)
unlock(ok): label/lh.BA2_exvivo.label (file)
unlock(ok): label/lh.BA2_exvivo.thresh.label (file)
unlock(ok): label/lh.BA3a_exvivo.label (file)
[316 similar messages have been suppressed; disable with datalad.ui.suppress-similar-results=off]
action summary:
unlock (ok: 326)
$ df .
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sdd 38910016440 23134187776 13822531464 63% /DATA
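For anyone checking the numbers, the difference can be read off the "Used" column of the two df runs above. A minimal sketch of the arithmetic; the variable names are only illustrative:

```shell
# "Used" values (in 1K blocks) copied from the two df runs above
used_before=23133702732
used_after=23134187776

# Growth in disk usage caused by the unlock, in KiB and (rounded down) MiB
diff_kb=$((used_after - used_before))
echo "${diff_kb} KiB (~$((diff_kb / 1024)) MiB)"
# prints: 485044 KiB (~473 MiB)
```

That is in line with the du figures reported earlier (452M to 927M, i.e. about 475M more).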
Hmm. Okay. It looks like you can have it use hard-links instead of making copies by setting the config option annex.thin=true. This only works if you're comfortable with throwing away the FS directory because you have another copy somewhere else, as it does present the possibility of corrupting the annexed files.
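A rough sketch of what that could look like, assuming the commands are run from inside the cloned dataset and that the filesystem supports hard links. It has not been tested on this particular setup:

```shell
# Run inside the clone (a git-annex repository), not the read-only original.
# annex.thin makes unlocked files hard links to the annexed content
# instead of second copies.
git config annex.thin true

# Unlock as before; with annex.thin set, this should not duplicate the data.
datalad unlock .

# For files that were already unlocked before the option was set,
# git-annex can convert the existing copies into hard links.
git annex fix
```

Because the unlocked file and the annexed object then share the same data on disk, any write to the working-tree file also changes the annexed content, which is the corruption risk mentioned above.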
We are trying to re-run a dataset previously processed with fmriprep 20.2.0, now with 20.2.7, re-using the freesurfer runs from 20.2.0. The 20.2.0 runs exist in a read-only folder. fmriprep does seem to recognize the pre-existing freesurfer run, but it still throws errors such as below, and it produces neither the different template space outputs nor any anatomical reports. When I copy the freesurfer folder to a writable folder, it works as expected. It doesn't seem to modify or create any files in the pre-existing folder, yet it doesn't like the folder being read-only.