Closed renelabounek closed 1 year ago
Hi @renelabounek, thanks for reporting this issue. There are (at least) two things happening here:
- We recently changed how the access permissions work on Amazon S3, where these files are stored, so currently they're not able to be downloaded. We'll get this fixed quickly, and we'll let you know as soon as things are back to normal.
- But that's not the error message you're getting. Instead, it seems that git-annex has decided to ignore the origin and upstream repositories, so it's not even trying to download the files from Amazon S3. (git-annex tends to start ignoring a remote whenever it encounters problems with it.) You can confirm this by looking at the output of the following command (run inside the repository):
git config --list | grep '^remote'
If it contains lines with annex-ignore, those remotes are currently ignored by git-annex. You can undo this with commands like:
git config --unset remote.origin.annex-ignore
git config --unset remote.upstream.annex-ignore
You don't need to try this right now though; we'll let you know once the Amazon S3 permissions issue is fixed.
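As a rough sketch of the check described above, the annex-ignore lines can be filtered down to just the remote names. The helper name ignored_remotes is made up for illustration; it only reads text in the git config --list format and changes nothing in the repository.

```shell
# Print the names of remotes that have annex-ignore set to true.
# Reads `git config --list` style lines on stdin, so it also works on saved output.
ignored_remotes() {
  sed -n 's/^remote\.\(.*\)\.annex-ignore=true$/\1/p'
}

# Usage, run inside the repository:
#   git config --list | ignored_remotes
```

Remotes with annex-ignore=false, or without the setting at all, are not printed.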
Ok I will do step 2, when you let me know that the issue 1 is fixed. Thanks, Rene
Ok, I think we don't need the origin and upstream remotes, at least for now, so please ignore step 2 from my previous comment. Can you try these commands:
git annex enableremote amazon public=yes
git annex get sub-amu01/anat/sub-amu01_T1w.nii.gz
If that works, then you should be able to retry downloading everything:
git annex get .
Otherwise, could you post the results of these commands please?
git config --list | grep '^remote'
git annex whereis sub-amu01/anat/sub-amu01_T1w.nii.gz
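When comparing runs of the diagnostic commands, it can help to pull just the copy count out of the whereis output. This is a minimal sketch, assuming the summary line has the form "whereis FILE (N copies)"; the helper name annex_copies is hypothetical and not part of git-annex.

```shell
# Extract the number of known copies from `git annex whereis` output on stdin.
# Assumes a summary line shaped like: whereis <file> (<N> copies) ...
annex_copies() {
  sed -n 's/^whereis .* (\([0-9][0-9]*\) cop.*/\1/p'
}

# Usage, run inside the repository:
#   git annex whereis sub-amu01/anat/sub-amu01_T1w.nii.gz | annex_copies
```

A count of 0 suggests that the local git-annex branch has no location information for the file yet.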
It did not pass. The outputs are below. I have not done step 2 (i.e. the git config --unset commands), as you said not to.
(base) [labounek@porto data-multi-subject]$ git annex enableremote amazon public=yes
enableremote amazon ok
(recording state in git...)
(base) [labounek@porto data-multi-subject]$ git annex get sub-amu01/anat/sub-amu01_T1w.nii.gz
get sub-amu01/anat/sub-amu01_T1w.nii.gz (not available)
No other repository is known to contain the file.
(Note that these git remotes have annex-ignore set: origin upstream)
failed
get: 1 failed
(base) [labounek@porto data-multi-subject]$
(base) [labounek@porto data-multi-subject]$ git config --list | grep '^remote'
remote.origin.url=https://github.com/renelabounek/data-multi-subject.git
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
remote.origin.annex-ignore=true
remote.amazon.annex-s3=true
remote.amazon.annex-uuid=5a5447a8-a9b8-49bc-8276-01a62632b502
remote.amazon.annex-ignore=false
remote.amazon.skipfetchall=true
remote.upstream.url=https://github.com/spine-generic/data-multi-subject
remote.upstream.fetch=+refs/heads/*:refs/remotes/upstream/*
remote.upstream.annex-ignore=true
remote.upstream.annex-readonly=true
(base) [labounek@porto data-multi-subject]$
(base) [labounek@porto data-multi-subject]$ git annex whereis sub-amu01/anat/sub-amu01_T1w.nii.gz
whereis sub-amu01/anat/sub-amu01_T1w.nii.gz (0 copies) failed
whereis: 1 failed
(base) [labounek@porto data-multi-subject]$
Ok, I can reproduce this problem, and I have some idea why it's happening, and a potential way to fix it (though not 100%).
The problem:
- git-annex keeps its metadata in a separate branch named git-annex (including where to find the contents of files).
- git pull only gets data from remotes for the master branch, not for the git-annex branch.
- Your fork, which is on github.com/renelabounek, and which is called origin on your computer, only contains metadata updates until November 2021, in its git-annex branch.
- Our repository, which is on github.com/spine-generic, and which is called upstream on your computer, contains a lot more metadata updates, in its git-annex branch. This includes information about where to find the file sub-amu01/anat/sub-amu01_T1w.nii.gz, for example.
The potential fix:
- Bring your origin/git-annex branch up to date with the upstream/git-annex branch. Normally I would use this link to do a Github pull request from our repository to your fork, but there's a "Can't automatically merge." error message. I'll have to look into this more later.
- In the meantime, try getting the metadata directly from upstream/git-annex:
git fetch upstream git-annex
git annex whereis sub-amu01/anat/sub-amu01_T1w.nii.gz
git annex get sub-amu01/anat/sub-amu01_T1w.nii.gz
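To see how far apart the different copies of the git-annex branch are, their last commit dates can be listed side by side. This is only a sketch: the function names are invented for illustration, and it assumes the remotes are called origin and upstream as in this thread.

```shell
# Print "<ref> <date>" for each copy of the git-annex branch that exists locally.
annex_branch_dates() {
  git for-each-ref --format='%(refname:short) %(committerdate:short)' \
    refs/heads/git-annex \
    refs/remotes/origin/git-annex \
    refs/remotes/upstream/git-annex
}

# From "<ref> <YYYY-MM-DD>" lines on stdin, print the most recently updated ref.
newest_ref() {
  sort -k2,2 | tail -n 1 | cut -d' ' -f1
}

# Usage, run inside the repository:
#   annex_branch_dates
#   annex_branch_dates | newest_ref
```

If the newest ref is not the local git-annex branch, the local metadata is probably behind.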
Nope, I am still not able to get the files.
(base) [labounek@porto data-multi-subject]$ git fetch upstream git-annex
remote: Enumerating objects: 57723, done.
remote: Counting objects: 100% (7627/7627), done.
remote: Compressing objects: 100% (6890/6890), done.
remote: Total 57723 (delta 41), reused 7624 (delta 39), pack-reused 50096
Receiving objects: 100% (57723/57723), 8.92 MiB | 1.80 MiB/s, done.
Resolving deltas: 100% (19046/19046), completed with 4 local objects.
From https://github.com/spine-generic/data-multi-subject
* branch git-annex -> FETCH_HEAD
(base) [labounek@porto data-multi-subject]$
(base) [labounek@porto data-multi-subject]$ git annex whereis sub-amu01/anat/sub-amu01_T1w.nii.gz
whereis sub-amu01/anat/sub-amu01_T1w.nii.gz (0 copies) failed
whereis: 1 failed
(base) [labounek@porto data-multi-subject]$
(base) [labounek@porto data-multi-subject]$ git annex get sub-amu01/anat/sub-amu01_T1w.nii.gz
get sub-amu01/anat/sub-amu01_T1w.nii.gz (not available)
No other repository is known to contain the file.
(Note that these git remotes have annex-ignore set: origin upstream)
failed
get: 1 failed
(base) [labounek@porto data-multi-subject]$
I'm having trouble reproducing this problem on my computer. Could you tell me the output of the following commands, so that I have a better idea what's happening?
git log -n 5 git-annex
git log -n 5 origin/git-annex
git log -n 5 upstream/git-annex
Thanks.
Thanks for your help. Here are the STDOUTs:
(base) [labounek@porto data-multi-subject]$ git log -n 5 git-annex
commit 01932205d298edd55d2857b5473293af9e59b1bd
Author: Rene Labounek <rene.labounek@gmail.com>
Date: Wed Jul 20 15:05:35 2022 -0500
update
commit 7b2f84d2a3694bcb28d677f656bf401c053d6c5e
Author: Rene Labounek <rene.labounek@gmail.com>
Date: Mon Nov 29 13:06:08 2021 -0600
update
commit 0a9c9e3a199a28e7cd0ebe8ca4513fbc6114d6eb
Merge: 5c768aa e40f90c
Author: Rene Labounek <rene.labounek@gmail.com>
Date: Mon Nov 29 13:04:44 2021 -0600
merging upstream/git-annex into git-annex
commit 5c768aa91f43569eb78bedc9cd21bf09247927c5
Author: Rene Labounek <rene.labounek@gmail.com>
Date: Tue Nov 23 13:44:52 2021 -0600
update
commit 5339869181c1bba10dc29a3cbbc1666f4039982f
Author: Rene Labounek <rene.labounek@gmail.com>
Date: Tue Nov 23 12:43:51 2021 -0600
update
(base) [labounek@porto data-multi-subject]$
(base) [labounek@porto data-multi-subject]$ git log -n 5 origin/git-annex
commit 7b2f84d2a3694bcb28d677f656bf401c053d6c5e
Author: Rene Labounek <rene.labounek@gmail.com>
Date: Mon Nov 29 13:06:08 2021 -0600
update
commit 0a9c9e3a199a28e7cd0ebe8ca4513fbc6114d6eb
Merge: 5c768aa e40f90c
Author: Rene Labounek <rene.labounek@gmail.com>
Date: Mon Nov 29 13:04:44 2021 -0600
merging upstream/git-annex into git-annex
commit 5c768aa91f43569eb78bedc9cd21bf09247927c5
Author: Rene Labounek <rene.labounek@gmail.com>
Date: Tue Nov 23 13:44:52 2021 -0600
update
commit 5339869181c1bba10dc29a3cbbc1666f4039982f
Author: Rene Labounek <rene.labounek@gmail.com>
Date: Tue Nov 23 12:43:51 2021 -0600
update
commit e40f90cfd6e2fcf3cef82fd3a684a4337149511b
Author: Julien Cohen-Adad <jcohen@polymtl.ca>
Date: Fri Jun 25 20:59:18 2021 -0400
update
(base) [labounek@porto data-multi-subject]$
(base) [labounek@porto data-multi-subject]$ git log -n 5 upstream/git-annex
commit e40f90cfd6e2fcf3cef82fd3a684a4337149511b
Author: Julien Cohen-Adad <jcohen@polymtl.ca>
Date: Fri Jun 25 20:59:18 2021 -0400
update
commit 322c62ee978dcf199d3a7c1133f877341ccdb321
Author: Alexandru Foias <afoias@polymtl.ca>
Date: Fri Apr 16 11:54:35 2021 -0400
update
commit 7eb7b8ed51a66743310637ba316ddd099577875a
Author: Alexandru Foias <afoias@polymtl.ca>
Date: Fri Apr 16 11:53:33 2021 -0400
update
commit 81864bf6121eeee7ce565f23203eda0651ce8850
Author: Alexandru Foias <afoias@polymtl.ca>
Date: Fri Apr 16 11:53:33 2021 -0400
update
commit c88d1bb7e45cea1bab6701410f79415d757df3f4
Author: Alexandru Foias <afoias@polymtl.ca>
Date: Tue Apr 13 15:15:06 2021 -0400
update
(base) [labounek@porto data-multi-subject]$
Thanks for your patience. After trying to reproduce the problem on my computer, I think we can try to merge the information from the upstream/git-annex branch into the local git-annex branch with the following commands:
git checkout git-annex
git merge --no-edit --allow-unrelated-histories upstream/git-annex
We expect this to produce lots of output, ending with:
...
Automatic merge failed; fix conflicts and then commit the result.
Fix the conflicts like this:
git -c 'mergetool.custom.cmd=(echo "$MERGED" | grep -Eq [.]log$) && cat "$LOCAL" "$REMOTE" | sort -nu > "$MERGED"' \
-c mergetool.custom.trustExitCode \
-c mergetool.keepBackup=false \
mergetool --tool=custom
This will probably take several minutes, and produce lots of output like this:
Normal merge conflict for 'ffd/627/SHA256E-s2732168--10acedb4cccb4804dd8b34d4a3ca6fb846528e291a045cb4140eccb2d4e305eb.nii.gz.log':
{local}: modified file
{remote}: modified file
Normal merge conflict for 'ffd/736/SHA256E-s1897274--090e2c5f0d417e6e858297cbf92ba6e2ecd8c2b80e86a21d88f9f06bb053e332.nii.gz.log':
{local}: modified file
{remote}: modified file
Normal merge conflict for 'ffd/c3d/SHA256E-s1592114--0202f1d6b6a8e937730f21641398ee338fe071f3d1866837f189fd7beb9f31f3.nii.gz.log':
{local}: modified file
{remote}: modified file
Then finish the merge:
git -c core.editor=/bin/true merge --continue
This should produce a single line of output, like:
[git-annex 693c29b0e] Merge remote-tracking branch 'upstream/git-annex' into git-annex
Then switch back to the master branch, and check if git-annex now knows that files exist on amazon:
git checkout master
git annex whereis sub-amu01/anat/sub-amu01_T1w.nii.gz
Finally, save the git-annex state to your fork on github:
git push origin git-annex:git-annex
If it still doesn't work I would need to know exactly which version of git-annex you have, which is in the output of the following command:
git annex version
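The conflict-resolution step above boils down to a union merge of two versions of a .log file: concatenate both sides, sort numerically, and drop duplicates. The sketch below isolates that pipeline so it can be tried on plain files first; the function name union_merge and the file names in the usage comment are placeholders, not anything from the repository.

```shell
# Union-merge two line-based log files into a third, using the same
# pipeline as the custom mergetool command: cat | sort -nu.
#   $1 = local version, $2 = remote version, $3 = merged output
union_merge() {
  cat "$1" "$2" | sort -nu > "$3"
}

# Usage with placeholder file names:
#   union_merge local.log remote.log merged.log
```

One caveat worth keeping in mind: with -n, sort -u compares lines by their leading number, so two different lines that start with exactly the same number would be collapsed into one.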
Start the automatic merging process, which we expect to fail (we'll fix this with the following commands).
git checkout git-annex
git merge --no-edit --allow-unrelated-histories upstream/git-annex
We expect this to produce lots of output, ending with:
...
Automatic merge failed; fix conflicts and then commit the result.
My outputs started to differ from the expected ones within the first set of commands. My git does not know allow-unrelated-histories. Should I continue? Or do we need to fix this first?
(base) [labounek@porto data-multi-subject]$ git checkout git-annex
Switched to branch 'git-annex'
Your branch is ahead of 'upstream/git-annex' by 5 commits.
(use "git push" to publish your local commits)
(base) [labounek@porto data-multi-subject]$
(base) [labounek@porto data-multi-subject]$ git merge --no-edit --allow-unrelated-histories upstream/git-annex
error: unknown option `allow-unrelated-histories'
usage: git merge [options] [<commit>...]
or: git merge [options] <msg> HEAD <commit>
or: git merge --abort
-n do not show a diffstat at the end of the merge
--stat show a diffstat at the end of the merge
--summary (synonym to --stat)
--log[=<n>] add (at most <n>) entries from shortlog to merge commit message
--squash create a single commit instead of doing a merge
--commit perform a commit if the merge succeeds (default)
-e, --edit edit message before committing
--ff allow fast-forward (default)
--ff-only abort if fast-forward is not possible
--rerere-autoupdate update the index with reused conflict resolution if possible
--verify-signatures Verify that the named commit has a valid GPG signature
-s, --strategy <strategy>
merge strategy to use
-X, --strategy-option <option=value>
option for selected merge strategy
-m, --message <message>
merge commit message (for a non-fast-forward merge)
-v, --verbose be more verbose
-q, --quiet be more quiet
--abort abort the current in-progress merge
--progress force progress reporting
-S, --gpg-sign[=<key id>]
GPG sign commit
--overwrite-ignore update ignored files (default)
(base) [labounek@porto data-multi-subject]$
I assume it is a git version issue (https://stackoverflow.com/questions/41356766/unknown-option-allow-unrelated-histories). Unfortunately, I am not an admin on the server where the database is stored, so updating git can be tricky or even impossible for me. Is there any other possibility before I write an update request to the server admins?
(base) [labounek@porto data-multi-subject]$ git --version
git version 1.8.3.1
(base) [labounek@porto data-multi-subject]$
Yes, it looks like a git version issue. Thankfully, that option is only needed for recent versions of git to make them act like older versions of git, so you can safely remove just that option and try again.
(Thanks for including your exact git version, that will help with fixing things.)
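For scripts that have to run on both old and new git, the flag can be added only when the installed version understands it. This is a sketch under the assumption that --allow-unrelated-histories first appeared in git 2.9 (and merge --continue in 2.12); both function names are made up for illustration.

```shell
# Succeed (exit status 0) when version $1 is at least major $2, minor $3.
git_at_least() {
  major=$(printf '%s\n' "$1" | cut -d. -f1)
  minor=$(printf '%s\n' "$1" | cut -d. -f2)
  [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# Merge upstream/git-annex, passing --allow-unrelated-histories only
# when the installed git is new enough to know the option (assumed: 2.9+).
merge_upstream_annex() {
  version=$(git --version | awk '{print $3}')
  if git_at_least "$version" 2 9; then
    git merge --no-edit --allow-unrelated-histories upstream/git-annex
  else
    git merge --no-edit upstream/git-annex
  fi
}
```

With git version 1.8.3.1, as reported above, the check fails and the plain git merge --no-edit form is used.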
The first two commands report that everything is up to date. The third command does not know the flag --continue. Possibly a version issue again.
(base) [labounek@porto data-multi-subject]$ git merge --no-edit upstream/git-annex
Already up-to-date.
(base) [labounek@porto data-multi-subject]$
(base) [labounek@porto data-multi-subject]$ git -c 'mergetool.custom.cmd=(echo "$MERGED" | grep -Eq [.]log$) && cat "$LOCAL" "$REMOTE" | sort -nu > "$MERGED"' \
> -c mergetool.custom.trustExitCode \
> -c mergetool.keepBackup=false \
> mergetool --tool=custom
No files need merging
(base) [labounek@porto data-multi-subject]$
(base) [labounek@porto data-multi-subject]$ git -c core.editor=/bin/true merge --continue
error: unknown option `continue'
usage: git merge [options] [<commit>...]
or: git merge [options] <msg> HEAD <commit>
or: git merge --abort
-n do not show a diffstat at the end of the merge
--stat show a diffstat at the end of the merge
--summary (synonym to --stat)
--log[=<n>] add (at most <n>) entries from shortlog to merge commit message
--squash create a single commit instead of doing a merge
--commit perform a commit if the merge succeeds (default)
-e, --edit edit message before committing
--ff allow fast-forward (default)
--ff-only abort if fast-forward is not possible
--rerere-autoupdate update the index with reused conflict resolution if possible
--verify-signatures Verify that the named commit has a valid GPG signature
-s, --strategy <strategy>
merge strategy to use
-X, --strategy-option <option=value>
option for selected merge strategy
-m, --message <message>
merge commit message (for a non-fast-forward merge)
-v, --verbose be more verbose
-q, --quiet be more quiet
--abort abort the current in-progress merge
--progress force progress reporting
-S, --gpg-sign[=<key id>]
GPG sign commit
--overwrite-ignore update ignored files (default)
(base) [labounek@porto data-multi-subject]$
From the output of your commands, it looks like the git-annex branch on your computer already contains all the information from both origin/git-annex and upstream/git-annex, with no conflicts. So, the first command git merge correctly and successfully did nothing (but I was expecting a merge conflict); the second command also saw no conflicts and did nothing; and the third command git merge --continue didn't recognize the --continue flag (probably a git version issue, yes), but in this situation that command didn't need to do anything so that's ok.
Can you try the next commands? And whether or not it succeeds, if you can do the last command (for saving the git-annex state to your fork on github), at least I should be able to examine the state myself to figure out what's going on.
@mguaypaq, sorry for my delay. I was not at my computer much over the weekend. The file download failed again. Below are the outputs. Thanks, Rene
(base) [labounek@porto data-multi-subject]$ git checkout master
Checking out files: 100% (7613/7613), done.
Switched to branch 'master'
(base) [labounek@porto data-multi-subject]$
(base) [labounek@porto data-multi-subject]$ git annex whereis sub-amu01/anat/sub-amu01_T1w.nii.gz
whereis sub-amu01/anat/sub-amu01_T1w.nii.gz (0 copies) failed
whereis: 1 failed
(base) [labounek@porto data-multi-subject]$
(base) [labounek@porto data-multi-subject]$ git annex version
git-annex version: 8.20211012-geb95ed486
build flags: Assistant Webapp Pairing Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite S3 WebDAV
dependency versions: aws-0.22 bloomfilter-2.0.1.0 cryptonite-0.26 DAV-1.3.4 feed-1.3.0.1 ghc-8.8.4 http-client-0.6.4.1 persistent-sqlite-2.10.6.2 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.1.0
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs httpalso borg hook external
operating system: linux x86_64
supported repository versions: 8
upgrade supported from repository versions: 0 1 2 3 4 5 6 7
local repository version: 8
(base) [labounek@porto data-multi-subject]$
Ah, examining the previous output you gave me more carefully, I see that git fetch upstream git-annex did not do exactly what I expected, probably because of the older git version. I'm sorry. Can you try the following command:
git fetch upstream
After that, I would be curious if the following command now works (and the output):
git annex whereis sub-amu01/anat/sub-amu01_T1w.nii.gz
If it fails, the output of the following commands could help me figure out what's going on:
git status
git show master:sub-amu01/anat/sub-amu01_T1w.nii.gz
git show git-annex:8db/012/SHA256E-s15565632--20fa8ec26515317c0871c129300c3ca6a44a20b8c48275e07ba46a0b9d22210a.nii.gz.log
git show origin/git-annex:8db/012/SHA256E-s15565632--20fa8ec26515317c0871c129300c3ca6a44a20b8c48275e07ba46a0b9d22210a.nii.gz.log
git show upstream/git-annex:8db/012/SHA256E-s15565632--20fa8ec26515317c0871c129300c3ca6a44a20b8c48275e07ba46a0b9d22210a.nii.gz.log
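If fetching everything from upstream is not wanted, an explicit refspec is another way to make sure the remote-tracking branch itself gets updated rather than only FETCH_HEAD. This is a sketch, assuming the cause really is that older git versions leave upstream/git-annex untouched when a single branch is named on the command line; the helper name tracking_refspec is hypothetical.

```shell
# Build a refspec that force-updates the remote-tracking branch
# refs/remotes/<remote>/<branch> from the branch of the same name on the remote.
#   $1 = remote name, $2 = branch name
tracking_refspec() {
  printf '+refs/heads/%s:refs/remotes/%s/%s\n' "$2" "$1" "$2"
}

# Usage, run inside the repository:
#   git fetch upstream "$(tracking_refspec upstream git-annex)"
```

The leading + allows a non-fast-forward update of the tracking branch.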
This really helped. See the outputs below. Sorry for the hassle. But in summary, what was generally different for the forked database compared to the README.md method? Just starting with git fetch upstream before git pull && git annex get .? Or was there something else that needed to be done?
@mguaypaq, thank you very much for your time and help.
Rene
(base) [labounek@porto data-multi-subject]$ git fetch upstream
remote: Enumerating objects: 63657, done.
remote: Counting objects: 100% (13269/13269), done.
remote: Compressing objects: 100% (11272/11272), done.
remote: Total 63657 (delta 1165), reused 13254 (delta 1159), pack-reused 50388
Receiving objects: 100% (63657/63657), 9.57 MiB | 2.91 MiB/s, done.
Resolving deltas: 100% (20171/20171), completed with 271 local objects.
From https://github.com/spine-generic/data-multi-subject
* [new branch] af/fix_images -> upstream/af/fix_images
* [new branch] af/update_participants_tsv -> upstream/af/update_participants_tsv
* [new branch] af/update_sub-ucl06_GH_direct -> upstream/af/update_sub-ucl06_GH_direct
+ e40f90c...15efe2d git-annex -> upstream/git-annex (forced update)
* [new branch] master -> upstream/master
* [new branch] mt-entity -> upstream/mt-entity
* [new branch] ng/ci-cache -> upstream/ng/ci-cache
* [new branch] renelabounek-master -> upstream/renelabounek-master
* [new branch] rl/height-weight -> upstream/rl/height-weight
* [new branch] sb/121-move-derivatives -> upstream/sb/121-move-derivatives
* [new branch] synced/af/csfseg_json -> upstream/synced/af/csfseg_json
* [new branch] synced/jca/17-defacing -> upstream/synced/jca/17-defacing
* [new branch] synced/jca/21-travis -> upstream/synced/jca/21-travis
* [new branch] synced/jca/7-check-data-consistency -> upstream/synced/jca/7-check-data-consistency
* [new branch] synced/jca/derivatives-r20200830-mts -> upstream/synced/jca/derivatives-r20200830-mts
* [new branch] synced/jca/participants -> upstream/synced/jca/participants
* [new branch] synced/jca/readme -> upstream/synced/jca/readme
* [new branch] synced/jca/reupload-disc-labels -> upstream/synced/jca/reupload-disc-labels
* [new branch] synced/jca/update-derivatives -> upstream/synced/jca/update-derivatives
* [new tag] r20201001 -> r20201001
* [new tag] r20201130 -> r20201130
* [new tag] r20211125 -> r20211125
* [new tag] r20220125 -> r20220125
(base) [labounek@porto data-multi-subject]$
(base) [labounek@porto data-multi-subject]$ git annex whereis sub-amu01/anat/sub-amu01_T1w.nii.gz
(merging upstream/git-annex into git-annex...)
(recording state in git...)
whereis sub-amu01/anat/sub-amu01_T1w.nii.gz (3 copies)
5a5447a8-a9b8-49bc-8276-01a62632b502 -- [amazon]
e405e14e-33b2-4a35-b7a7-3eeec054f0d4 -- sebeda@joplin.neuro.polymtl.ca:/mnt/nvme/sebeda/data-multi-subject
fc75435d-eb11-4c5a-9b68-debf6e68df2a -- alex@MacBook.local:~/data/data-multi-subject
amazon: https://data-multi-subject---spine-generic---neuropoly.s3.ca-central-1.amazonaws.com/SHA256E-s15565632--20fa8ec26515317c0871c129300c3ca6a44a20b8c48275e07ba46a0b9d22210a.nii.gz
ok
(base) [labounek@porto data-multi-subject]$
(base) [labounek@porto data-multi-subject]$ git annex get sub-amu01/anat/sub-amu01_T1w.nii.gz
get sub-amu01/anat/sub-amu01_T1w.nii.gz (from amazon...)
ok
(recording state in git...)
(base) [labounek@porto data-multi-subject]$
Great! I'm so glad it works now :tada:
But in summary, what was generally different for the forked database when compared to the README.md method? Just to start with the git fetch upstream before git pull && git annex get .. Or was there something else needed to be done?
I think that's the main thing, yes, just doing git fetch upstream along with git pull before doing git annex get. The reason is:
- git pull gets the contents of small files and hashes of the contents of big files into your local master branch (on your computer) from the origin/master branch (on your github fork). Possibly, depending on your settings, it also gets this from the upstream/master branch (on the original github repo), but typically your fork's origin/master is kept up to date with upstream/master so it doesn't matter.
- git fetch upstream gets the locations of the contents of big files (for example, "the file contents that hashes to 12FA32 is available on amazon") into your local git-annex branch, from the upstream/git-annex branch. This was the missing part in your case. Github makes it easy to synchronize your fork's origin/master branch with upstream/master, but it doesn't do this by default for other branches, like origin/git-annex and upstream/git-annex. In fact, your fork's origin/git-annex branch is very much older than upstream/git-annex, so it doesn't have information about the location of newer file contents.
- git annex get looks at the hash of a file contents in master, uses this to look up a location in git-annex, and contacts this location (amazon, in this case) to get the actual contents.
Hopefully that explains this a bit!
A bit, yes... :-D I have the data updated, so we can close this issue. The only remaining question is whether my case was just too specific, or whether it would be worth adding the potential necessity of the git fetch upstream command to README.md.
Indeed! I'll open a separate issue for that. Thanks again for your patience!
I fetched upstream spine-generic:data-multi-subject into https://github.com/renelabounek/data-multi-subject. Then I called
and got the following errors telling me that the re-defaced T1w images were not downloaded through git annex: