templateflow / python-client

A python client to query TemplateFlow via pyBIDS
https://templateflow.org/python-client/
Apache License 2.0

`tpl-onavg` dataset inaccessible via datalad #112

Closed (prioux closed this issue 5 months ago)

prioux commented 8 months ago

I can't download tpl-onavg at all. I tried both datalad and direct wget commands, and I also looked for the files at other sources, but they have disappeared from there too. Can anyone else try this, please?

mgxd commented 8 months ago

I just tried setting up templateflow via datalad, and I can confirm the error:

>>> import templateflow as tf
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/templateflow.git/config download failed: Not Found                                                                  
[INFO   ] access to 1 dataset sibling public-s3 not auto-enabled, enable with:
|       datalad siblings -d "/private/tmp/tf" enable -s public-s3 
install(ok): /tmp/tf (dataset)
[INFO   ] Ensuring presence of Dataset(/tmp/tf) to get /tmp/tf 
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore 
[INFO   ] https://github.com/templateflow/tpl-Fischer344/config download failed: Not Found                                                                    
[INFO   ] access to 3 dataset siblings box.com, public-s3, public-s3 not auto-enabled, enable with:                                                           
|       datalad siblings -d "/private/tmp/tf/tpl-Fischer344" enable -s SIBLING 
install(ok): /tmp/tf/tpl-Fischer344 (dataset)                                                                                                                 
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MNI152Lin/config download failed: Not Found                                                                     
[INFO   ] access to 3 dataset siblings box.com, public-s3, public-s3 not auto-enabled, enable with:                                                           
|       datalad siblings -d "/private/tmp/tf/tpl-MNI152Lin" enable -s SIBLING 
install(ok): /tmp/tf/tpl-MNI152Lin (dataset)                                                                                                                  
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MNI152NLin2009aAsym/config download failed: Not Found                                                           
[INFO   ] access to 1 dataset sibling public-s3 not auto-enabled, enable with:                                                                                
|       datalad siblings -d "/private/tmp/tf/tpl-MNI152NLin2009aAsym" enable -s public-s3 
install(ok): /tmp/tf/tpl-MNI152NLin2009aAsym (dataset)                                                                                                        
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MNI152NLin2009aSym/config download failed: Not Found                                                            
[INFO   ] access to 1 dataset sibling public-s3 not auto-enabled, enable with:                                                                                
|       datalad siblings -d "/private/tmp/tf/tpl-MNI152NLin2009aSym" enable -s public-s3 
install(ok): /tmp/tf/tpl-MNI152NLin2009aSym (dataset)                                                                                                         
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MNI152NLin2009bAsym.git/config download failed: Not Found                                                       
[INFO   ] access to 1 dataset sibling public-s3 not auto-enabled, enable with:                                                                                
|       datalad siblings -d "/private/tmp/tf/tpl-MNI152NLin2009bAsym" enable -s public-s3 
[INFO   ] Reset branch 'master' to 1478388a (from 3ee83762) to avoid a detached HEAD                                                                          
install(ok): /tmp/tf/tpl-MNI152NLin2009bAsym (dataset)                                                                                                        
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MNI152NLin2009bSym/config download failed: Not Found                                                            
[INFO   ] access to 1 dataset sibling public-s3 not auto-enabled, enable with:                                                                                
|       datalad siblings -d "/private/tmp/tf/tpl-MNI152NLin2009bSym" enable -s public-s3 
install(ok): /tmp/tf/tpl-MNI152NLin2009bSym (dataset)                                                                                                         
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MNI152NLin2009cAsym/config download failed: Not Found                                                           
[INFO   ] access to 1 dataset sibling box.com not auto-enabled, enable with:                                                                                  
|       datalad siblings -d "/private/tmp/tf/tpl-MNI152NLin2009cAsym" enable -s box.com 
install(ok): /tmp/tf/tpl-MNI152NLin2009cAsym (dataset)                                                                                                        
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MNI152NLin2009cSym/config download failed: Not Found                                                            
install(ok): /tmp/tf/tpl-MNI152NLin2009cSym (dataset)                                                                                                         
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MNI152NLin6Asym/config download failed: Not Found                                                               
install(ok): /tmp/tf/tpl-MNI152NLin6Asym (dataset)                                                                                                            
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MNI152NLin6Sym/config download failed: Not Found                                                                
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MNI305.git/config download failed: Not Found                                                                    
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MNIColin27/config download failed: Not Found                                                                    
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MNIInfant/config download failed: Not Found                                                                     
[INFO   ] access to 4 dataset siblings box.com, public-s3, public-s3, public-s3 not auto-enabled, enable with:                                                
|       datalad siblings -d "/private/tmp/tf/tpl-MNIInfant" enable -s SIBLING 
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MNIPediatricAsym/config download failed: Not Found                                                              
[INFO   ] access to 3 dataset siblings prune-s3, public-s3-2, public-s3 not auto-enabled, enable with:                                                        
|       datalad siblings -d "/private/tmp/tf/tpl-MNIPediatricAsym" enable -s SIBLING 
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-MouseIn/config download failed: Not Found                                                                       
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-NKI/config download failed: Not Found                                                                           
[INFO   ] access to 5 dataset siblings box.com, public-s3, public-s3, public-s3, public-s3 not auto-enabled, enable with:                                     
|       datalad siblings -d "/private/tmp/tf/tpl-NKI" enable -s SIBLING 
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-NMT31Sym/config download failed: Not Found                                                                      
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-OASIS30ANTs/config download failed: Not Found                                                                   
[INFO   ] access to 5 dataset siblings box.com, public-s3, public-s3, public-s3, public-s3 not auto-enabled, enable with:                                     
|       datalad siblings -d "/private/tmp/tf/tpl-OASIS30ANTs" enable -s SIBLING 
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-PNC/config download failed: Not Found                                                                           
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-RESILIENT.git/config download failed: Not Found                                                                 
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-UNCInfant.git/config download failed: Not Found                                                                 
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-VALiDATe29.git/config download failed: Not Found                                                                
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-WHS/config download failed: Not Found                                                                           
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-fsLR/config download failed: Not Found                                                                          
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-fsaverage/config download failed: Not Found                                                                     
[INFO   ] Remote origin not usable by git-annex; setting annex-ignore                                                                                         
[INFO   ] https://github.com/templateflow/tpl-onavg.git/config download failed: Not Found                                                                     
  [17 similar messages have been suppressed; disable with datalad.ui.suppress-similar-results=off]                                                            
action summary:
  install (ok: 27)
>>> tf.api.get('onavg')
get(error): tpl-onavg/tpl-onavg_hemi-L_den-10k_sphere.surf.gii (file) [not available; (Note that these git remotes have annex-ignore set: origin)]            
action summary:
  get (error: 1, notneeded: 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/.pyenv/versions/templateflow/lib/python3.12/site-packages/templateflow/conf/__init__.py", line 69, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/.pyenv/versions/templateflow/lib/python3.12/site-packages/templateflow/api.py", line 139, in get
    _datalad_get(filepath)
  File "/.pyenv/versions/templateflow/lib/python3.12/site-packages/templateflow/api.py", line 278, in _datalad_get
    api.get(filepath, dataset=str(TF_LAYOUT.root))
  File "/.pyenv/versions/templateflow/lib/python3.12/site-packages/datalad/interface/base.py", line 773, in eval_func
    return return_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/.pyenv/versions/templateflow/lib/python3.12/site-packages/datalad/interface/base.py", line 763, in return_func
    results = list(results)
              ^^^^^^^^^^^^^
  File "/.pyenv/versions/templateflow/lib/python3.12/site-packages/datalad/interface/base.py", line 940, in _execute_command_
    raise IncompleteResultsError(
datalad.support.exceptions.IncompleteResultsError: Command did not complete successfully. 1 failed:
[{'action': 'get',
  'annexkey': 'MD5E-s285623--c48448b10dfcc2581dca280c2f8a9707.surf.gii',
  'message': 'not available; (Note that these git remotes have annex-ignore '
             'set: origin)',
  'path': '/tmp/tf/tpl-onavg/tpl-onavg_hemi-L_den-10k_sphere.surf.gii',
  'refds': '/tmp/tf',
  'status': 'error',
  'type': 'file'}]

Disabling datalad (so that the client downloads directly from S3) works fine, though:

Python 3.12.0 (main, Oct 18 2023, 16:09:26) [Clang 15.0.0 (clang-1500.0.40.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import templateflow as tf
>>> tf.__version__
'23.1.0'
>>> tf.api.get('onavg')
Downloading https://templateflow.s3.amazonaws.com/tpl-onavg/tpl-onavg_hemi-L_den-10k_sphere.surf.gii
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 286k/286k [00:00<00:00, 3.08MB/s]
Downloading https://templateflow.s3.amazonaws.com/tpl-onavg/tpl-onavg_hemi-L_den-41k_sphere.surf.gii
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.17M/1.17M [00:00<00:00, 5.40MB/s]
Downloading https://templateflow.s3.amazonaws.com/tpl-onavg/tpl-onavg_hemi-L_den-164k_sphere.surf.gii
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4.88M/4.88M [00:00<00:00, 11.3MB/s]
Downloading https://templateflow.s3.amazonaws.com/tpl-onavg/tpl-onavg_hemi-L_den-655k_sphere.surf.gii
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17.7M/17.7M [00:01<00:00, 14.0MB/s]
Downloading https://templateflow.s3.amazonaws.com/tpl-onavg/tpl-onavg_hemi-R_den-10k_sphere.surf.gii
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 285k/285k [00:00<00:00, 1.74MB/s]
Downloading https://templateflow.s3.amazonaws.com/tpl-onavg/tpl-onavg_hemi-R_den-41k_sphere.surf.gii
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1.17M/1.17M [00:01<00:00, 1.09MB/s]
Downloading https://templateflow.s3.amazonaws.com/tpl-onavg/tpl-onavg_hemi-R_den-164k_sphere.surf.gii
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4.88M/4.88M [00:00<00:00, 11.1MB/s]
Downloading https://templateflow.s3.amazonaws.com/tpl-onavg/tpl-onavg_hemi-R_den-655k_sphere.surf.gii
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 17.7M/17.7M [00:01<00:00, 15.6MB/s]
[PosixPath('/.cache/templateflow/tpl-onavg/CHANGES'), PosixPath('/.cache/templateflow/tpl-onavg/LICENSE'), PosixPath('/.cache/templateflow/tpl-onavg/README.md'), PosixPath('/.cache/templateflow/tpl-onavg/template_description.json'), PosixPath('/.cache/templateflow/tpl-onavg/tpl-onavg_hemi-L_den-10k_sphere.surf.gii'), PosixPath('/.cache/templateflow/tpl-onavg/tpl-onavg_hemi-L_den-41k_sphere.surf.gii'), PosixPath('/.cache/templateflow/tpl-onavg/tpl-onavg_hemi-L_den-164k_sphere.surf.gii'), PosixPath('/.cache/templateflow/tpl-onavg/tpl-onavg_hemi-L_den-655k_sphere.surf.gii'), PosixPath('/.cache/templateflow/tpl-onavg/tpl-onavg_hemi-R_den-10k_sphere.surf.gii'), PosixPath('/.cache/templateflow/tpl-onavg/tpl-onavg_hemi-R_den-41k_sphere.surf.gii'), PosixPath('/.cache/templateflow/tpl-onavg/tpl-onavg_hemi-R_den-164k_sphere.surf.gii'), PosixPath('/.cache/templateflow/tpl-onavg/tpl-onavg_hemi-R_den-655k_sphere.surf.gii')]
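
The "Downloading ..." lines in the log above follow a simple pattern: bucket host, template directory, file name. As a minimal sketch, assuming that pattern holds for the files of a template, the URLs can be rebuilt for a manual fetch (for example with wget) when the client cannot download on demand. The helper `s3_url` and the constant `S3_BASE` are illustrative names, not part of the templateflow API.

```python
from urllib.parse import quote

# Base URL copied from the "Downloading ..." lines in the log above.
S3_BASE = "https://templateflow.s3.amazonaws.com"


def s3_url(template: str, filename: str) -> str:
    """Build the public S3 URL for one file of a template (hypothetical helper)."""
    return f"{S3_BASE}/tpl-{template}/{quote(filename)}"


# File names as they appear in the log: both hemispheres at four densities.
onavg_files = [
    f"tpl-onavg_hemi-{hemi}_den-{den}_sphere.surf.gii"
    for hemi in ("L", "R")
    for den in ("10k", "41k", "164k", "655k")
]
onavg_urls = [s3_url("onavg", name) for name in onavg_files]

for url in onavg_urls:
    print(url)
```

The list covers only the surface files named in the log; metadata files such as README.md or template_description.json would need to be added by hand.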

effigies commented 8 months ago

It looks like the S3 remote is not set up. Compare:

templateflow on  master via 🅒 default on ☁️   
❯ git -C tpl-MNI152NLin2009cAsym annex whereis tpl-MNI152NLin2009cAsym_res-02_T1w.nii.gz
whereis tpl-MNI152NLin2009cAsym_res-02_T1w.nii.gz (5 copies) 
    1263c789-52e6-4680-9a16-e0185baa9288 -- oesteban@oscars-MacBook-Pro.local:~/tmp/templateflow/tpl-MNI152NLin2009cAsym
    176ef808-c604-4da9-948b-7f1a21bcbddc -- oesteban@oscars-MacBook-Pro.local:~/tmp/templateflow/tpl-MNI152NLin2009cAsym
    25e2f6b6-0864-4354-8547-7ad2d78c8333 -- oesteban@dendrite:~/templateflow/tpl-MNI152NLin2009cAsym
    d7f1d2b9-afae-47ab-8d51-19fee65c97c3 -- [s3]
    f6f04341-e9e7-4b1e-af0c-d26a99cb9c2a -- [gin-src]

  s3: https://templateflow.s3.amazonaws.com/tpl-MNI152NLin2009cAsym/tpl-MNI152NLin2009cAsym_res-02_T1w.nii.gz?versionId=ciXAVlwcATnM9KpTmIIRZ2wMboo7aSMA
ok
templateflow on  master via 🅒 default on ☁️   
❯ git -C tpl-onavg annex whereis tpl-onavg_hemi-L_den-10k_sphere.surf.gii 
whereis tpl-onavg_hemi-L_den-10k_sphere.surf.gii (2 copies) 
    9701729f-7e38-464d-928e-61b24600ab10 -- git@0f827bfc83f9:/data/repos/templateflow/tpl-onavg.git
    cc2a0eb1-128e-4b72-86d5-7e92240914c3 -- OpenNeuro Average (onavg) surface template
ok

mgxd commented 8 months ago

I'm trying to figure out what happened. tpl-onavg was added via the auto-intake action, which appears to take care of updating the remotes: https://github.com/templateflow/actions-intake/blob/ebc61a7f496781153ea641f67c3b871f060b8076/entrypoint.sh#L109-L121

However, I'm not seeing that update in the tpl-onavg history: https://github.com/templateflow/tpl-onavg/commits/a650e2a72a0d77cb7f1df032f9c3aa54bc61deb3

prioux commented 8 months ago

I was able to download the files using S3. In my environment I generally cannot let the Python templateflow library download files on demand, so I prepare a full set of all templates in advance and pack it into a SquashFS file that my tool containers mount.

Now that I have tpl-onavg, I rebuilt that SquashFS file and my tools work properly (the templateflow libraries were crashing because, of course, TEMPLATEFLOW_HOME points to the SquashFS filesystem, which is read-only).

I will let you guys make the necessary adjustments to the repos here; feel free to close this ticket at any time. Thanks!
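
With a read-only TEMPLATEFLOW_HOME such as the SquashFS setup described above, every file has to be present already, because nothing can be fetched into the directory at run time. A minimal sketch of a pre-flight check under that assumption follows. The helpers `missing_files` and `is_writable`, the expected-file list, and the fallback path `~/.cache/templateflow` are illustrative assumptions, not part of the templateflow API.

```python
import os
from pathlib import Path


def missing_files(home: Path, expected: list[str]) -> list[str]:
    """Return the expected relative paths that are absent under `home`.

    A broken symlink (e.g. an annexed file whose content was never fetched)
    also counts as absent, because is_file() follows symlinks.
    """
    return [rel for rel in expected if not (home / rel).is_file()]


def is_writable(home: Path) -> bool:
    """True if the current process may write into `home`."""
    return os.access(home, os.W_OK)


# Assumed fallback location when TEMPLATEFLOW_HOME is not set.
home = Path(os.path.expanduser(
    os.environ.get("TEMPLATEFLOW_HOME", "~/.cache/templateflow")
))
# Hypothetical list: in practice, enumerate every file the tools need.
expected = ["tpl-onavg/tpl-onavg_hemi-L_den-10k_sphere.surf.gii"]

absent = missing_files(home, expected)
if absent and not is_writable(home):
    print(f"{home} is read-only and lacks: {absent}")
```

Running a check like this when building the image would surface a missing template before a tool crashes inside the container.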

effigies commented 5 months ago

This should be resolved. The issue was that there were multiple S3 remotes specified in datalad, so none got autoenabled. Two have been marked dead, so future clones should work as expected.
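
The datalad log earlier in the thread is consistent with this diagnosis: several datasets list the same sibling name (public-s3) more than once in their "not auto-enabled" message. As a rough sketch, assuming that log format, the following parses such a line and reports the sibling names that occur repeatedly. The function `repeated_siblings` is an illustrative helper, not part of datalad.

```python
import re
from collections import Counter

# Matches datalad INFO lines such as:
# "access to 3 dataset siblings box.com, public-s3, public-s3 not auto-enabled, enable with:"
SIBLINGS_RE = re.compile(r"access to \d+ dataset siblings? (.+?) not auto-enabled")


def repeated_siblings(log_line: str) -> dict[str, int]:
    """Return the sibling names that appear more than once in one log line."""
    match = SIBLINGS_RE.search(log_line)
    if match is None:
        return {}
    names = [name.strip() for name in match.group(1).split(",")]
    return {name: count for name, count in Counter(names).items() if count > 1}


# Line copied from the log in this thread (tpl-NKI).
line = (
    "[INFO   ] access to 5 dataset siblings box.com, public-s3, public-s3, "
    "public-s3, public-s3 not auto-enabled, enable with:"
)
print(repeated_siblings(line))
```

A dataset with a single sibling produces an empty result, so only datasets with ambiguous remotes are flagged.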