NBCLab / power-replication

A replication and extension of Power et al. (2018)
https://www.overleaf.com/read/swgjxcjqytxg
Apache License 2.0

Templateflow issue with MRIQC #6

Closed: tsalo closed this issue 3 years ago

tsalo commented 3 years ago

@mriedel56 I am getting errors when I submit MRIQC jobs. I bound my templateflow cache directory, but MRIQC still tries to download the templates. Any ideas?

tsalo commented 3 years ago

Here's the relevant code:

https://github.com/NBCLab/power-replication/blob/866557263451c76379239eb9c0c8bf9eb12eb233/mriqc_job_template.sh#L29-L35

emdupre commented 3 years ago

I saw a similar issue on NeuroStars recently, in case it's helpful.

mriedel56 commented 3 years ago

This problem might be specific to fMRIPrep (rather than MRIQC, which you're asking about here), but I found that with fMRIPrep you must include the argument:

--skull-strip-template OASIS30ANTs:res-1

because if you don't, it assumes:

--skull-strip-template OASIS30ANTs

If neither solution helps your problem, can you provide an example of the error?

EDIT: I should clarify: if you don't specify the above argument correctly, fMRIPrep will initiate the templateflow download.

tsalo commented 3 years ago

Thank you both! I will test out both fixes tomorrow.

tsalo commented 3 years ago

I'm stumped. The error relates to trying to download tpl-MNI152NLin2009cAsym/tpl-MNI152NLin2009cAsym_res-02_desc-fMRIPrep_boldref.nii.gz, which exists in my templateflow directory (/home/tsalo006/.cache/templateflow/tpl-MNI152NLin2009cAsym/tpl-MNI152NLin2009cAsym_res-02_desc-fMRIPrep_boldref.nii.gz).
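
One thing worth ruling out here is that the file is present but empty. This is only a possible explanation and is not confirmed anywhere in this thread: the assumption is that the cache holds zero-byte placeholder files that templateflow treats as not yet fetched. The sketch below lists any such files; the function name `list_empty_templates` is made up for illustration.

```shell
#!/bin/bash
# List zero-byte regular files under a templateflow cache directory.
# Assumption (not confirmed in this thread): an empty file is a placeholder
# that templateflow would still try to download.
list_empty_templates() {
    # Use the first argument, else TEMPLATEFLOW_HOME, else the default cache path.
    local cache_dir="${1:-${TEMPLATEFLOW_HOME:-$HOME/.cache/templateflow}}"
    find "$cache_dir" -type f -size 0 | sort
}

# Example call, using the cache path quoted above:
#   list_empty_templates /home/tsalo006/.cache/templateflow
```

If the boldref file shows up in that listing, it exists by name only, which would be consistent with the download attempt in the error.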

I've thrown every solution listed in that NeuroStars post and in fMRIPrep's documentation into the job file, and nothing seems to work. Attached are the job script and the error file.

The submission script

#!/bin/bash
#---Number of cores
#SBATCH -c 1

#---Job's name in SLURM system
#SBATCH -J sub-03_mriqc

#---Error file
#SBATCH -e /home/data/nbc/misc-projects/Salo_PowerReplication/code/jobs/mriqc_sub-03_err

#---Output file
#SBATCH -o /home/data/nbc/misc-projects/Salo_PowerReplication/code/jobs/mriqc_sub-03_out

#---Queue name
#SBATCH --account iacc_nbc

#---Partition
#SBATCH -p default-partition
########################################################
export NPROCS=`echo $LSB_HOSTS | wc -w`
export OMP_NUM_THREADS=1
. $MODULESHOME/../global/profile.modules
module load singularity-3.5.3

DSET_DIR="/home/data/nbc/misc-projects/Salo_PowerReplication/dset-dupre/"
WORK_DIR="/scratch/nbc/tsalo006/dset-dupre-mriqc/"

export SINGULARITYENV_NO_ET=1
export SINGULARITYENV_TEMPLATEFLOW_HOME=/home/tsalo006/.cache/templateflow

# Run MRIQC
singularity run --home $HOME --cleanenv \
    -B /home/tsalo006/.cache/templateflow:$HOME/.cache/templateflow \
    /home/data/cis/singularity-images/poldracklab_mriqc_0.15.1.sif \
    $DSET_DIR $DSET_DIR/derivatives participant \
    --participant-label 03 \
    -w $WORK_DIR --no-sub \
    --nprocs 1

The error

Downloading https://templateflow.s3.amazonaws.com/tpl-MNI152NLin2009cAsym/tpl-MNI152NLin2009cAsym_res-02_desc-fMRIPrep_boldref.nii.gz
Process Process-2:
Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/urllib3/connection.py", line 171, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw)
  File "/usr/local/miniconda/lib/python3.7/site-packages/urllib3/util/connection.py", line 79, in create_connection
    raise err
  File "/usr/local/miniconda/lib/python3.7/site-packages/urllib3/util/connection.py", line 69, in create_connection
    sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/miniconda/lib/python3.7/site-packages/urllib3/connectionpool.py", line 343, in _make_request
    self._validate_conn(conn)
  File "/usr/local/miniconda/lib/python3.7/site-packages/urllib3/connectionpool.py", line 849, in _validate_conn
    conn.connect()
  File "/usr/local/miniconda/lib/python3.7/site-packages/urllib3/connection.py", line 314, in connect
    conn = self._new_conn()
  File "/usr/local/miniconda/lib/python3.7/site-packages/urllib3/connection.py", line 180, in _new_conn
    self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.VerifiedHTTPSConnection object at 0x7f0241eb78d0>: Failed to establish a new connection: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/site-packages/requests/adapters.py", line 445, in send
    timeout=timeout
  File "/usr/local/miniconda/lib/python3.7/site-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/miniconda/lib/python3.7/site-packages/urllib3/util/retry.py", line 398, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='templateflow.s3.amazonaws.com', port=443): Max retries exceeded with url: /tpl-MNI152NLin2009cAsym/tpl-MNI152NLin2009cAsym_res-02_desc-fMRIPrep_boldref.nii.gz (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f0241eb78d0>: Failed to establish a new connection: [Errno 110] Connection timed out'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/miniconda/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/miniconda/lib/python3.7/site-packages/mriqc/bin/mriqc_run.py", line 418, in init_mriqc
    wf_list.append(build_workflow(dataset[mod], mod, settings=settings))
  File "/usr/local/miniconda/lib/python3.7/site-packages/mriqc/workflows/core.py", line 25, in build_workflow
    workflow = fmri_qc_workflow(dataset, settings=settings)
  File "/usr/local/miniconda/lib/python3.7/site-packages/mriqc/workflows/functional.py", line 111, in fmri_qc_workflow
    ema = epi_mni_align(settings)
  File "/usr/local/miniconda/lib/python3.7/site-packages/mriqc/workflows/functional.py", line 714, in epi_mni_align
    'MNI152NLin2009cAsym', resolution=2, suffix='boldref')),
  File "/usr/local/miniconda/lib/python3.7/site-packages/templateflow/api.py", line 39, in get
    _s3_get(filepath)
  File "/usr/local/miniconda/lib/python3.7/site-packages/templateflow/api.py", line 130, in _s3_get
    r = requests.get(url, stream=True)
  File "/usr/local/miniconda/lib/python3.7/site-packages/requests/api.py", line 72, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/miniconda/lib/python3.7/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/miniconda/lib/python3.7/site-packages/requests/sessions.py", line 512, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/miniconda/lib/python3.7/site-packages/requests/sessions.py", line 622, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/miniconda/lib/python3.7/site-packages/requests/adapters.py", line 513, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='templateflow.s3.amazonaws.com', port=443): Max retries exceeded with url: /tpl-MNI152NLin2009cAsym/tpl-MNI152NLin2009cAsym_res-02_desc-fMRIPrep_boldref.nii.gz (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7f0241eb78d0>: Failed to establish a new connection: [Errno 110] Connection timed out'))

tsalo commented 3 years ago

Updating MRIQC to 0.16.1 (the newest release) doesn't fix the problem.

tsalo commented 3 years ago

Okay, I think I figured out the problem! After going through the fMRIPrep Singularity Troubleshooting section in more detail, I actually ran templateflow within the Singularity image and it downloaded a bunch of files. While the files seemed to exist in my home directory already, apparently that wasn't enough?
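
For anyone hitting the same error, here is a rough sketch of what fetching a template from inside the image might look like. It is not the exact command that was run; it assumes the image and cache paths from the job script above, a node with internet access, and the same `templateflow.api.get` arguments that appear in the traceback. The function name `prefetch_template` is made up for illustration.

```shell
#!/bin/bash
# Sketch: fetch the template named in the traceback from inside the container,
# so that later jobs on nodes without internet find it in the cache.
# IMAGE and TF_HOME default to the paths quoted in this thread; adjust as needed.
IMAGE="${IMAGE:-/home/data/cis/singularity-images/poldracklab_mriqc_0.15.1.sif}"
TF_HOME="${TF_HOME:-/home/tsalo006/.cache/templateflow}"

prefetch_template() {
    # Bind the host cache at the same path inside the container and point
    # templateflow at it; SINGULARITYENV_ variables survive --cleanenv.
    SINGULARITYENV_TEMPLATEFLOW_HOME="$TF_HOME" \
    singularity exec --cleanenv -B "$TF_HOME:$TF_HOME" "$IMAGE" \
        python -c "from templateflow import api; api.get('MNI152NLin2009cAsym', resolution=2, suffix='boldref')"
}

# Run on a node with internet access (for example a login node):
#   module load singularity-3.5.3
#   prefetch_template
```

This only requests the one file from the error message; a full MRIQC run may need other templates as well, so the call may have to be repeated with different arguments.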

I'm currently getting some errors when I run, but they're unrelated to templateflow, so I'm going to close this now.

EDIT: Probably fixed in cc58dfb.