poldracklab / tacc-openneuro


MRIQC: "Permission denied: '/home/mriqc/.cache'" #86

Open jbwexler opened 3 weeks ago

jbwexler commented 3 weeks ago

Getting this for all MRIQC runs for any dataset:

```
  File "/opt/conda/lib/python3.11/pathlib.py", line 1116, in mkdir
    os.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mriqc/.cache/datalad/sockets'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/pathlib.py", line 1116, in mkdir
    os.mkdir(self, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mriqc/.cache/datalad'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/conda/lib/python3.11/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/lib/python3.11/site-packages/mriqc/cli/workflow.py", line 56, in build_workflow
    retval['workflow'] = init_mriqc_wf()
                         ^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/mriqc/workflows/core.py", line 48, in init_mriqc_wf
    workflow.add_nodes([fmri_qc_workflow()])
                        ^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/mriqc/workflows/functional/base.py", line 85, in fmri_qc_workflow
    _datalad_get(dataset)
  File "/opt/conda/lib/python3.11/site-packages/mriqc/utils/misc.py", line 257, in _datalad_get
    return get(
           ^^^^
  File "/opt/conda/lib/python3.11/site-packages/datalad/interface/base.py", line 773, in eval_func
    return return_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/datalad/interface/base.py", line 763, in return_func
    results = list(results)
              ^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/datalad_next/patches/interface_utils.py", line 197, in _execute_command_
    hooks = get_jsonhooks_from_config(ds.config if ds else dlcfg)
                                      ^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/datalad/distribution/dataset.py", line 331, in config
    repo = self.repo
           ^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/datalad/distribution/dataset.py", line 273, in repo
    self._repo = repo_from_path(self._path)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/datalad/core/local/repo.py", line 61, in repo_from_path
    repo = cls(path, create=False, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/datalad/dataset/repo.py", line 163, in __call__
    instance = type.__call__(cls, *new_args, **new_kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.11/site-packages/datalad/support/annexrepo.py", line 198, in __init__
    super(AnnexRepo, self).__init__(
  File "/opt/conda/lib/python3.11/site-packages/datalad/support/gitrepo.py", line 938, in __init__
    ssh_manager.ensure_initialized()
  File "/opt/conda/lib/python3.11/site-packages/datalad/support/sshconnector.py", line 738, in ensure_initialized
    self._socket_dir.mkdir(exist_ok=True, parents=True)
  File "/opt/conda/lib/python3.11/pathlib.py", line 1120, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/opt/conda/lib/python3.11/pathlib.py", line 1120, in mkdir
    self.parent.mkdir(parents=True, exist_ok=True)
  File "/opt/conda/lib/python3.11/pathlib.py", line 1116, in mkdir
    os.mkdir(self, mode)
PermissionError: [Errno 13] Permission denied: '/home/mriqc/.cache'
```
yarikoptic commented 1 week ago

Could you remind me how the container is executed in your case? If via our singularity_cmd, then we might need to make sure that we have ~/.cache in our "fake HOME". But we might also need to fix datalad: it should create all leading folders there and not assume that ~/.cache is already present.
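Assuming the problem is just that the bind-mounted HOME lacks a writable `~/.cache`, one possible workaround is to pre-create the directory tree the traceback shows datalad expects on writable host storage, then bind it over the container's cache path. This is a sketch only; the cache location, the `/home/mriqc` HOME path, and the `--bind` usage are assumptions about this setup, not the project's actual singularity_cmd:

```shell
# Sketch of a workaround (assumptions: Singularity-style --bind flag, and that
# /home/mriqc is the HOME inside the container).
# Pre-create the directory tree datalad's traceback shows it needs:
CACHE_DIR="${SCRATCH:-/tmp}/mriqc_cache"
mkdir -p "$CACHE_DIR/datalad/sockets"
echo "$CACHE_DIR"
# Then add to the container invocation (illustrative, not verified on Frontera):
#   --bind "$CACHE_DIR":/home/mriqc/.cache
```

This sidesteps the PermissionError because the mkdir then happens on the host before the container starts, rather than inside the read-only fake HOME.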

jbwexler commented 1 week ago

I'm using ReproMan to submit a Slurm job on TACC's Frontera. I also add --bind /tmp:/node_tmp to the singularity command so that it uses the node's local temp rather than the scratch temp. I add that to the containers/.datalad/config file so it runs that way automatically when I submit via ReproMan. Here's an example of a command-array file:

```
code/containers/scripts/singularity_cmd run --bind /tmp:/node_tmp code/containers/images/bids/bids-mriqc--24.0.0.sing sourcedata/raw /scratch1/03201/jbwexler/openneuro_derivatives/derivatives/mriqc/ds004488-mriqc participant --participant-label '01' -w '/node_tmp/work_dir/mriqc/ds004488_sub-01' -vv --nprocs 11 --ants-nthreads 8 --verbose-reports --dsname ds004488 --mem_gb 30
```
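For reference, the .datalad/config change described above might look something like the following fragment. The container name and exact cmdexec template here are hypothetical; datalad-container substitutes the `{img}` and `{cmd}` placeholders at run time:

```
[datalad "containers.bids-mriqc"]
	cmdexec = {img_dspath}/scripts/singularity_cmd run --bind /tmp:/node_tmp {img} {cmd}
```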
jbwexler commented 5 days ago

@yarikoptic Any ideas for a workaround?