cmadjar closed this issue 4 years ago
Follow up on where things are at on that front.
[x] LORIS API endpoints created (Point 1 of the description)
[x] Crawler updated (Point 2 of the description)
A few notes regarding Point 2 of the description:
[ERROR ] Failed to fetch https://openpreventad.loris.ca/api/v0.0.3-dev//candidates/2424540/NAPEN00/images/preventad_2424540_NAPEN00_fieldmap-phasediff_002.mnc/bids/sub-2424540_ses-NAPEN00_run-002_phasediff.json: cannot use a string pattern on a bytes-like object [base.py:_fetch:576,base.py:_verify_download:371,http.py:check_for_auth_failure:213,re.py:search:183]
=> Very short-term solution: modify the DataLad code in virtualenv/lib/python3.7/site-packages/datalad/downloaders/http.py (line 210) so that the function check_for_auth_failure returns immediately:
    def check_for_auth_failure(self, content, err_prefix=""):
        return
        if self.failure_re:
            # verify that we actually logged in
            for failure_re in self.failure_re:
                if re.search(failure_re, content):
                    raise AccessDeniedError(
                        err_prefix + "returned output which matches regular expression %s" % failure_re
                    )
Very hacky but it works until we get a more permanent solution on the DataLad front.
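The error above is what Python's re module raises when a str pattern is applied to bytes content. As a less invasive stopgap than returning early, the content could be decoded before matching. The sketch below is a standalone illustration only, not the fix that went into DataLad: matches_failure_re is a hypothetical helper, and decoding as UTF-8 is an assumption about the response body.

```python
import re


def matches_failure_re(failure_res, content, encoding="utf-8"):
    """Return the first pattern in failure_res that matches content, else None.

    Hypothetical helper: accepts str or bytes content so that a str pattern
    is never applied to a bytes-like object.
    """
    if isinstance(content, bytes):
        # Assumption: the body is text; undecodable bytes are replaced.
        content = content.decode(encoding, errors="replace")
    for failure_re in failure_res:
        if re.search(failure_re, content):
            return failure_re
    return None


# Reproduce the original failure: a str pattern applied to bytes content.
try:
    re.search("User not authenticated", b'{"error":"User not authenticated"}')
except TypeError as exc:
    print("TypeError:", exc)
```

With this helper, binary downloads such as the .mnc or .json files no longer trigger the TypeError, while an authentication failure message in the body would still be detected.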
[ ] Create a new dataset for the BIDS PREVENT-AD dataset and link it to the original MINC dataset (still to be done)
The issue in Point 2. is resolved by https://github.com/datalad/datalad/pull/4540.
Note that a different PR will be sent to the datalad repo by Yarik to fix a few other issues at the same time.
So for now, to have a crawler version that works:
pip install .
pip install .
datalad install -r git@github.com:cmadjar/conp-dataset
cd conp-dataset
datalad create -d . projects/preventad-open-bids
datalad create-sibling-github -d projects/preventad-open-bids preventad-open-bids
- update .gitmodules as follows:
[submodule "projects/preventad-open-bids"]
    path = projects/preventad-open-bids
    url = https://github.com/cmadjar/preventad-open-bids.git
- create .datalad/providers/loris.cfg with the following content:
[provider:loris-openpreventad]
url_re = https:\/\/openpreventad.loris.ca\/.*
credential = loris-openpreventad
authentication_type = loris-token
loris-token_failure_re = "User not authenticated"}$
[credential:loris-openpreventad]
url = https://openpreventad.loris.ca/api/v0.0.3-dev/login
type = loris-token
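Before committing the file, it can help to check that url_re actually matches the URLs the crawler will fetch. The sketch below parses the same content with Python's configparser and tests the pattern; it is a sanity check under the assumption that the provider pattern is applied as a regular expression anchored at the start of the URL, and provider_handles is a hypothetical helper, not part of DataLad.

```python
import configparser
import re

# Same content as .datalad/providers/loris.cfg above.
LORIS_CFG = r"""
[provider:loris-openpreventad]
url_re = https:\/\/openpreventad.loris.ca\/.*
credential = loris-openpreventad
authentication_type = loris-token
loris-token_failure_re = "User not authenticated"}$

[credential:loris-openpreventad]
url = https://openpreventad.loris.ca/api/v0.0.3-dev/login
type = loris-token
"""

# Disable interpolation so regex characters are read literally.
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(LORIS_CFG)

provider = parser["provider:loris-openpreventad"]
credential = parser["credential:" + provider["credential"]]


def provider_handles(url):
    """Hypothetical check: does url_re match the start of this URL?"""
    return re.match(provider["url_re"], url) is not None


print(provider_handles(
    "https://openpreventad.loris.ca/api/v0.0.3-dev/projects/loris/bids"
))
print(credential["url"])
```

A URL on any other host should make provider_handles return False, which is a quick way to catch a typo in url_re before running the crawl.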
- add loris.cfg to git:
git add .datalad/providers/loris.cfg
git commit -m 'adding the LORIS config file for the crawler'
- initialize the crawler:
datalad crawl-init --save --template=loris_bids_export url=https://openpreventad.loris.ca/api/v0.0.3-dev/projects/loris/bids apibase=https://openpreventad.loris.ca/api/v0.0.3-dev/
- crawl (LORIS username and password will be asked the first time):
datalad crawl
The steps above mostly follow the documentation on creating a new dataset via DataLad (except for the datalad crawl part):
https://github.com/CONP-PCNO/conp-documentation/blob/master/datalad_dataset_addition_procedure.md
PREVENT-AD BIDS is now cooking... To be continued on Tuesday.
Now available on the portal: https://portal.conp.ca/dataset?id=projects/preventad-open-bids
The CircleCI error might be related to https://github.com/CONP-PCNO/conp-dataset/issues/322, so I will create a new issue for that.
Purpose
The BIDS dataset has been created for PREVENT-AD but a few steps are still needed in order to release the BIDS dataset:
1) Create LORIS URLs for the different files of the BIDS dataset so that the files can be downloaded through the same LORIS token mechanism
2) Create/modify the DataLad crawler to crawl the BIDS dataset from LORIS
3) Create a new dataset for the crawled BIDS PREVENT-AD dataset and link it to the original MINC dataset (see #230)
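For reference, the token mechanism mentioned in point 1 can be sketched roughly as follows. This is an illustration under assumptions, not the crawler's code: it assumes the login endpoint accepts a JSON body with username and password fields and that the returned token is then sent as a Bearer token in the Authorization header. build_login_request and build_file_request are hypothetical helpers, and the sketch only builds the requests without sending them.

```python
import json
import urllib.request

API_BASE = "https://openpreventad.loris.ca/api/v0.0.3-dev"


def build_login_request(username, password):
    """Build (but do not send) the login request.

    Assumption: the endpoint expects a JSON body with these two fields.
    """
    body = json.dumps({"username": username, "password": password}).encode("utf-8")
    return urllib.request.Request(
        API_BASE + "/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_file_request(file_url, token):
    """Build (but do not send) an authenticated download request.

    Assumption: the token is passed as a Bearer token.
    """
    return urllib.request.Request(
        file_url,
        headers={"Authorization": "Bearer " + token},
    )
```

Sending the login request with urllib.request.urlopen and reading the token from the JSON response would complete the flow; the name of the response field is left out here because it is not stated in this thread.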