Open kousu opened 3 years ago
Maybe sync is okay, but only if we confine it to the git-annex branch. There's an option to do this:
git config --global annex.synconlyannex true
Then uploading would become:
git push # sync .git/objects
git annex sync # sync .git/annex/objects + git-annex branch to track them
Another alternative, I think, maybe, is remote.<remote>.annex-speculate-present. This should skip the need for daily updates to the git-annex branch (https://github.com/neuropoly/data-management/issues/67#issuecomment-819051308).
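For reference, a minimal sketch of turning that option on for a single remote. The helper name speculate_present is made up for illustration, and "origin" in the usage note is only an example remote name. Whether this fully replaces syncing the git-annex branch is still the open question here, not something this snippet settles.

```shell
# Hypothetical helper: mark one remote as "may contain any file",
# so git-annex will try it even without location-tracking info.
speculate_present() {
    git config "remote.$1.annex-speculate-present" true
}
```

Run inside a repository as speculate_present origin; after that, git annex get should be willing to try origin even when the git-annex branch does not list it as holding the file.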
I discovered a reason we have to recommend git annex sync: you need to run it once to initialize each remote's annex uuid:
[kousu@requiem data-single-subject]$ git annex copy --to=praxis-gin
(scanning for unlocked files...)
Enter passphrase for key '/home/kousu/.ssh/id_ed25519.neuropoly':
Enter passphrase for key '/home/kousu/.ssh/id_ed25519.neuropoly':
git-annex: cannot determine uuid for praxis-gin (perhaps you need to run "git annex sync"?)
I'm not sure but I would guess this is only an issue with common ssh remotes; I know with Amazon remotes, the annex uuid is created as a specially named file in the S3 bucket.
Apparently I missed removing it entirely from #97, which is lucky, because https://github.com/neuropoly/data-management/blob/master/internal-server.md#new-repository currently recommends
$ git remote add origin git@data.neuro.polymtl.ca:datasets/my-new-repo
$ git annex sync --content origin
which means the annex uuid will get created safely.
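One way to avoid tripping over the missing uuid would be to check for it before copying. A rough sketch, assuming git-annex caches the uuid in the local config key remote.<name>.annex-uuid. Both helper names are hypothetical, and whether sync with --only-annex alone is enough to record the uuid has not been verified here.

```shell
# Hypothetical helper: succeeds if a uuid for the given remote is
# already recorded in the local git config.
remote_has_annex_uuid() {
    uuid=$(git config --get "remote.$1.annex-uuid") && [ -n "$uuid" ]
}

# Hypothetical wrapper: only run the one-time sync when the uuid is missing.
# Assumption: a sync restricted to the git-annex branch records the uuid.
ensure_annex_uuid() {
    if remote_has_annex_uuid "$1"; then
        echo "annex-uuid already set for $1"
    else
        git annex sync --only-annex "$1"
    fi
}
```

For example, ensure_annex_uuid praxis-gin could be run before git annex copy --to=praxis-gin.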
I don't know what to do about this. git-annex is impossible.
The explicit git push git-annex:git-annex doesn't work, because the remote git-annex (sometimes) makes its own merge commits, leading to
! [rejected] git-annex -> git-annex (non-fast-forward)
It's okay if you're working alone, but as soon as you're collaborating you need something to handle merging the branches.
So we must use some form of git annex sync. Just, probably not its default form.
git annex sync has a UI inconsistent with the rest of git. It is omnivorous by default: it syncs bidirectionally and with all remotes, it creates a plethora of synced/* branches as a workaround for a weakness in git, and it touches all branches, not just the working branch. This causes performance problems (e.g. https://github.com/neuropoly/data-management/issues/26) as well as confusion and bugs when trying to work with a pull-request workflow.
The basic problem, IMO, is that the git-annex metadata is kept in a separate branch shared by all branches, instead of being kept in, say, a hidden .annex subfolder as part of each branch, which means the usual git algorithm can't handle it.
Some alternatives:
git annex sync --only-annex --content; git push/git pull
But I'm unsure about pull. This doesn't work:
git pull origin $(git branch --show-current):$(git branch --show-current) git-annex:git-annex
It tries to merge master with git-annex, which is not good. Possibly this instead?
git fetch origin $(git branch --show-current):$(git branch --show-current) git-annex:git-annex
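A possible shape for the first alternative, as a sketch rather than a tested recipe. One caveat on the fetch idea: in a non-bare repository, git fetch refuses to update the currently checked-out branch unless --update-head-ok is given, so this sketch uses a plain pull for the working branch and leaves the git-annex branch to git annex sync --only-annex. The names annex_pull, annex_push, run and DRY_RUN are invented for illustration, and --ff-only is an added assumption, not something decided in this thread.

```shell
# Print commands instead of running them when DRY_RUN=1 (hypothetical switch).
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Download: update one branch with plain git, then let git-annex
# merge only its own branch and transfer file content.
annex_pull() {
    run git pull --ff-only "$1" "$2" &&
    run git annex sync --only-annex --content "$1"
}

# Upload: push one branch with plain git, then sync the git-annex
# branch and file content with the same remote.
annex_push() {
    run git push "$1" "$2" &&
    run git annex sync --only-annex --content "$1"
}
```

Usage would be something like annex_pull origin "$(git branch --show-current)". Note that git annex sync --only-annex is still bidirectional for the git-annex branch, so both helpers pull and push that branch; only the working branch is handled one way.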