spine-generic / data-multi-subject

Multi-subject data for the Spine Generic project
Creative Commons Attribution 4.0 International

Data Not Downloading Error with "git annex get ." command #113

Closed: hestoner closed this issue 2 years ago

hestoner commented 2 years ago

Hi!

First off, thanks for making this data accessible. I'm sure it's a great resource once it's up and running correctly. The data isn't downloading properly when I run the "git annex get ." step of the instructions described here: https://github.com/spine-generic/data-multi-subject#spine-generic-public-database-multi-subject.

The output is as follows: [image: screenshot of the command output]

I've never used git annex before, but it did report that I have version 8.20200226.

Thanks for the help and hope you all have a great day,

Halestone

mguaypaq commented 2 years ago

Hi Halestone,

It looks like git-annex is aware that it can get the data from amazon (among other remotes), but somehow the amazon remote seems disabled for you. For comparison, when I tried the commands just now in a fresh clone, I got the data from amazon:

mguaypaq@taalo:/tmp/dms113$ git clone https://github.com/spine-generic/data-multi-subject
Cloning into 'data-multi-subject'...
remote: Enumerating objects: 49248, done.
remote: Counting objects: 100% (16042/16042), done.
remote: Compressing objects: 100% (5202/5202), done.
remote: Total 49248 (delta 8312), reused 15447 (delta 8036), pack-reused 33206
Receiving objects: 100% (49248/49248), 4.99 MiB | 10.18 MiB/s, done.
Resolving deltas: 100% (17731/17731), done.
mguaypaq@taalo:/tmp/dms113$ cd data-multi-subject/
mguaypaq@taalo:/tmp/dms113/data-multi-subject$ git annex get sub-amu01/
get sub-amu01/anat/sub-amu01_T1w.nii.gz (from amazon...)
(scanning for annexed files...)
ok
get sub-amu01/anat/sub-amu01_T2star.nii.gz (from amazon...)
ok
get sub-amu01/anat/sub-amu01_T2w.nii.gz (from amazon...)
ok
get sub-amu01/anat/sub-amu01_acq-MToff_MTS.nii.gz (from amazon...)
ok
get sub-amu01/anat/sub-amu01_acq-MTon_MTS.nii.gz (from amazon...)
ok
get sub-amu01/anat/sub-amu01_acq-T1w_MTS.nii.gz (from amazon...)
ok
get sub-amu01/dwi/sub-amu01_acq-b0_dwi.nii.gz (from amazon...)
ok
get sub-amu01/dwi/sub-amu01_dwi.nii.gz (from amazon...)
ok
(recording state in git...)

To see the configuration options for the remotes, could you run the following command from the data-multi-subject folder, and post the output here, please?

git config --list | grep '^remote'

For comparison, this is what I get on my machine:

mguaypaq@taalo:/tmp/dms113/data-multi-subject$ git config --list | grep '^remote'
remote.origin.url=https://github.com/spine-generic/data-multi-subject
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
remote.origin.annex-ignore=true
remote.amazon.annex-s3=true
remote.amazon.annex-uuid=5a5447a8-a9b8-49bc-8276-01a62632b502
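
The comparison above can also be checked mechanically. This is a minimal sketch, assuming a POSIX shell; the helper name has_annex_remote is hypothetical and is not part of git or git-annex. It reads `git config --list` output on standard input and succeeds only if the named remote has an annex-uuid entry, which is the line that is present in the working configuration shown above.

```shell
# Hypothetical helper (not part of git or git-annex): succeeds if the
# `git config --list` output on stdin has an annex-uuid entry for remote $1.
has_annex_remote() {
    grep -q "^remote\.$1\.annex-uuid="
}
```

From inside the data-multi-subject folder it could be used as `git config --list | has_annex_remote amazon && echo "amazon configured" || echo "amazon missing"`.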

According to the git-annex docs, you should be able to specify which remote to use manually, by using the --from option to git annex get. You shouldn't need to do this usually, but just as a test, could you try the following command and post the output here as well?

git annex get sub-amu01/ --from=amazon

hestoner commented 2 years ago

Hi Mathieu,

Sure thing! Thanks for the help. I just did a fresh clone and got the same errors as before. After running the git config command you mentioned, the output is as follows:

[image: screenshot of the git config output]

So, the remote.amazon entries are missing from the config list. For the git annex get command with --from=amazon, this is the output:

[image: screenshot of the git annex get output]

I'm assuming I need to make edits to the config file?

Thanks again and hope you have a great day,

Halestone

mguaypaq commented 2 years ago

I would really like to understand how this happened (so we can prevent it from happening again), but for now let's focus on getting it working at least.

On my machine, when I compare the output of git config --list before and after running git annex init, I see that these are the added bits of configuration:

remote.origin.annex-ignore=true
annex.uuid=fe423a27-9237-4973-9c94-121bf6dc232a
annex.version=8
filter.annex.smudge=git-annex smudge -- %f
filter.annex.clean=git-annex smudge --clean -- %f
remote.amazon.annex-s3=true
remote.amazon.annex-uuid=5a5447a8-a9b8-49bc-8276-01a62632b502

Do you see corresponding lines when you run git config --list on your machine?

Also, the output of git show git-annex:remote.log is:

5a5447a8-a9b8-49bc-8276-01a62632b502 autoenable=true bucket=data-multi-subject---spine-generic---neuropoly datacenter=ca-central-1 encryption=none host=s3.ca-central-1.amazonaws.com name=amazon port=443 public=yes publicurl=https://data-multi-subject---spine-generic---neuropoly.s3.ca-central-1.amazonaws.com signature=v4 storageclass=STANDARD type=S3 timestamp=1597347458.484621s

and the output of git show git-annex:uuid.log contains the line (among others):

...
5a5447a8-a9b8-49bc-8276-01a62632b502 amazon timestamp=1596600645.406808367s
...

Does this match what you have?
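
To compare a single setting instead of the whole line, a field can be pulled out of remote.log. This is a rough sketch, assuming a POSIX shell with awk; the helper name remote_log_field is hypothetical, and it assumes space-separated key=value fields whose values contain no spaces, as in the line quoted above. As far as the git-annex docs describe it, autoenable=true is the setting that should let git annex init enable a special remote without an explicit enableremote, so it is a useful field to check.

```shell
# Hypothetical helper: print the value of key $2 for the remote named $1,
# given remote.log content on stdin. Assumes space-separated key=value
# fields with no spaces inside values (an assumption, not a format guarantee).
remote_log_field() {
    awk -v name="$1" -v key="$2" '
        {
            found = 0; value = ""
            # Field 1 is the remote UUID; the rest are key=value pairs.
            for (i = 2; i <= NF; i++) {
                eq = index($i, "=")
                if (eq == 0) continue
                k = substr($i, 1, eq - 1)
                v = substr($i, eq + 1)
                if (k == "name" && v == name) found = 1
                if (k == key) value = v
            }
            if (found) print value
        }'
}
```

For example, `git show git-annex:remote.log | remote_log_field amazon autoenable` prints the autoenable value recorded for the amazon remote.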

Regardless of the questions above, looking at the git-annex docs on public Amazon S3 remotes, the following command might fix things enough for git annex get to work:

git annex enableremote amazon

What do you get if you try that?
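
If this ends up in a setup script, the enableremote call can be made conditional. The following is only a sketch, assuming a POSIX shell and that it runs inside the cloned repository; the wrapper name ensure_annex_remote is hypothetical. It uses `git config --get`, which exits non-zero when the key is absent, to decide whether the remote still needs enabling.

```shell
# Hypothetical wrapper: enable the named special remote only when the local
# config has no annex-uuid for it yet. Run from inside the cloned repository.
ensure_annex_remote() {
    if git config --get "remote.$1.annex-uuid" >/dev/null; then
        echo "remote $1 already enabled"
    else
        git annex enableremote "$1"
    fi
}
```

Calling `ensure_annex_remote amazon` before `git annex get` leaves a correctly configured clone untouched and only tries enableremote when the amazon entries are missing.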

Thanks for your patience!

hestoner commented 2 years ago

Heyo!

Alrighty, this is the output of git config --list before and after running git annex init:

[image: screenshot of the git config output before and after git annex init]

I decided to try out git annex get . again after running these (because it seemed to have initialized the amazon source). And it worked! The data was downloaded properly! Thanks so much for the help. I really appreciate it. If I had realized it was just the one command I needed to use, I would have done so. XD

Cheers,

Halestone

mguaypaq commented 2 years ago

Ah! Yes, I see now that in the download instructions for data-single-subject, we list the git annex init command, but it's not in the download instructions for this dataset. I'll fix that right away.
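
For readers landing on this thread, the sequence that worked here can be sketched as follows. This assumes a POSIX shell with git and git-annex installed; the function name download_multi_subject is hypothetical, and the README's actual wording may differ. The body runs in a subshell so the caller's working directory is left unchanged.

```shell
# Sketch of the download steps with the missing init included.
# The function name is hypothetical; $1 is the path to fetch (default: ".").
download_multi_subject() (
    git clone https://github.com/spine-generic/data-multi-subject &&
    cd data-multi-subject &&
    git annex init &&            # the step that was missing in this thread
    git annex get "${1:-.}"
)
```

For example, `download_multi_subject sub-amu01/` fetches one subject, while calling it with no argument fetches everything.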

Thanks again for your patience!