Open mlell opened 1 year ago
Thanks for the report! I gave your reproducer (thanks much!) a quick try. I can't seem to reproduce it with a more recent datalad.
(handbook) adina@muninn in /tmp
❱ datalad create a
create(ok): /tmp/a (dataset)
(handbook) adina@muninn in /tmp
❱ cd a
(handbook) adina@muninn in /tmp/a on git:master
❱ datalad create-sibling-ria -s origin ria+file://$PWD/../ria
create-sibling-ria(error): /tmp/a (dataset) [No store found at '/tmp/a/../ria'. Forgot --new-store-ok ?]
(handbook) adina@muninn in /tmp/a on git:master
❱ datalad create-sibling-ria -s origin --new-store-ok ria+file://$PWD/../ria
[INFO ] create siblings 'origin' and 'origin-storage' ...
[INFO ] Fetching updates for Dataset(/tmp/a)
update(ok): . (dataset)
update(ok): . (dataset)
[INFO ] Configure additional publication dependency on "origin-storage"
configure-sibling(ok): . (sibling)
create-sibling-ria(ok): /tmp/a (dataset)
action summary:
configure-sibling (ok: 1)
create-sibling-ria (ok: 1)
update (ok: 1)
(handbook) adina@muninn in /tmp/a on git:master
❱ cd ..
(handbook) adina@muninn in /tmp
❱ datalad clone a b
[INFO ] Fetching updates for Dataset(/tmp/b)
update(ok): . (dataset)
update(ok): . (dataset)
configure-sibling(ok): . (sibling)
install(ok): /tmp/b (dataset)
action summary:
configure-sibling (ok: 1)
install (ok: 1)
update (ok: 1)
(handbook) adina@muninn in /tmp
❱ cd b
(handbook) adina@muninn in /tmp/b on git:master
❱ git remote -v
origin ../a (fetch)
origin ../a (push)
origin-2 /tmp/a/../ria/e21/e1696-4462-4c84-be22-eac41fbc6279 (fetch)
origin-2 /tmp/a/../ria/e21/e1696-4462-4c84-be22-eac41fbc6279 (push)
origin-storage
(handbook) adina@muninn in /tmp/b on git:master
❱ datalad wtf -S datalad -S git-annex
# WTF
## datalad
- version: 0.18.2+16.gaa7170e0a
## git-annex
- build flags:
- Assistant
- Webapp
- Pairing
- Inotify
- DBus
- DesktopNotify
- TorrentParser
- MagicMime
- Benchmark
- Feeds
- Testsuite
- S3
- WebDAV
- dependency versions:
- aws-0.22.1
- bloomfilter-2.0.1.0
- cryptonite-0.29
- DAV-1.3.4
- feed-1.3.2.1
- ghc-9.0.2
- http-client-0.7.13.1
- persistent-sqlite-2.13.1.0
- torrent-10000.1.1
- uuid-1.3.15
- yesod-1.6.2.1
- key/value backends:
- SHA256E
- SHA256
- SHA512E
- SHA512
- SHA224E
- SHA224
- SHA384E
- SHA384
- SHA3_256E
- SHA3_256
- SHA3_512E
- SHA3_512
- SHA3_224E
- SHA3_224
- SHA3_384E
- SHA3_384
- SKEIN256E
- SKEIN256
- SKEIN512E
- SKEIN512
- BLAKE2B256E
- BLAKE2B256
- BLAKE2B512E
- BLAKE2B512
- BLAKE2B160E
- BLAKE2B160
- BLAKE2B224E
- BLAKE2B224
- BLAKE2B384E
- BLAKE2B384
- BLAKE2BP512E
- BLAKE2BP512
- BLAKE2S256E
- BLAKE2S256
- BLAKE2S160E
- BLAKE2S160
- BLAKE2S224E
- BLAKE2S224
- BLAKE2SP256E
- BLAKE2SP256
- BLAKE2SP224E
- BLAKE2SP224
- SHA1E
- SHA1
- MD5E
- MD5
- WORM
- URL
- X*
- local repository version: 10
- operating system: linux x86_64
- remote types:
- git
- gcrypt
- p2p
- S3
- bup
- directory
- rsync
- web
- bittorrent
- webdav
- adb
- tahoe
- glacier
- ddar
- git-lfs
- httpalso
- borg
- hook
- external
- supported repository versions:
- 8
- 9
- 10
- upgrade supported from repository versions:
- 0
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- version: 10.20221003
Could you retry after updating the tools?
Thank you for this hint! I had just installed datalad but missed that its version was restricted by our server's ancient Python version. I have upgraded to Python 3.7.3 (the latest that I could easily compile myself on our cluster) and datalad 0.18.2.
Indeed, the problem is largely gone and the "origin" of repo A is renamed:
$ datalad siblings
.: here(+) [git]
.: origin-2(-) [(omitted)/../ria/5d0/493e1-44eb-42eb-8777-29b0ea6e5b43 (git)]
.: origin-storage(+) [ora]
.: origin(+) [../a (git)]
There remain only two things:
1. origin-storage would need to be called origin-2-storage as well.
2. The publication dependency on origin-storage is not inherited by origin-2, so datalad push --to origin-2 would not upload annexed data.
$ git -C a config --list | grep ^remote
remote.origin-storage.annex-externaltype=ora
remote.origin-storage.annex-uuid=43dcfcde-2c1a-4621-99ff-6a5464297255
remote.origin-storage.skipfetchall=true # <<< this is not inherited on cloning ===========
remote.origin-storage.annex-cost=100.0
remote.origin-storage.annex-availability=GloballyAvailable
remote.origin.annex-ignore=true # <<< this is not inherited on cloning ===========
remote.origin.url=/data/lell/test/datalad-chain/a/../ria/5d0/493e1-44eb-42eb-8777-29b0ea6e5b43
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
remote.origin.datalad-publish-depends=origin-storage # <<< this is not inherited on cloning ===========
$ git -C b config --list | grep ^remote
remote.origin.url=../a
remote.origin.fetch=+refs/heads/*:refs/remotes/origin/*
remote.origin.annex-uuid=b1a84354-0031-4c73-a608-ef9997a93234
remote.origin-storage.annex-externaltype=ora
remote.origin-storage.annex-uuid=43dcfcde-2c1a-4621-99ff-6a5464297255
remote.origin-storage.annex-cost=100.0
remote.origin-storage.annex-availability=GloballyAvailable
remote.origin-2.url=/data/lell/test/datalad-chain/a/../ria/5d0/493e1-44eb-42eb-8777-29b0ea6e5b43
remote.origin-2.fetch=+refs/heads/*:refs/remotes/origin-2/*
remote.origin-2.annex-ignore=false
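The comparison of the two listings above can be scripted. The following is only a minimal sketch using plain git; the function names remote_subkeys and missing_subkeys are made up for illustration and are not part of datalad or git-annex. It compares key names only, so a key that exists on both sides with a different value (such as annex-ignore, which is true in a and false in b) is not reported.

```shell
#!/bin/sh
# Print the config subkeys (e.g. "url", "annex-ignore") that a repository
# sets for one remote. Assumes the remote name has no regex metacharacters.
# usage: remote_subkeys REPO REMOTE
remote_subkeys() {
    git -C "$1" config --local --name-only --get-regexp "^remote\.$2\." |
        sed "s/^remote\.$2\.//" | sort -u
}

# Print subkeys set for SRC_REMOTE in SRC_REPO but absent for DST_REMOTE
# in DST_REPO.
# usage: missing_subkeys SRC_REPO SRC_REMOTE DST_REPO DST_REMOTE
missing_subkeys() {
    src_keys=$(mktemp) || return 1
    dst_keys=$(mktemp) || { rm -f "$src_keys"; return 1; }
    remote_subkeys "$1" "$2" > "$src_keys"
    remote_subkeys "$3" "$4" > "$dst_keys"
    # comm -23 keeps lines that occur only in the first (sorted) file
    comm -23 "$src_keys" "$dst_keys"
    rm -f "$src_keys" "$dst_keys"
}

# Example, run from the directory that holds both datasets:
#   missing_subkeys a origin b origin-2
#   missing_subkeys a origin-storage b origin-storage
```

Given the listings above, the first call should report datalad-publish-depends and the second skipfetchall.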
Test for publication dependency problem:
$ cd b
$ echo "test" > x
$ datalad save
$ datalad push --to origin-2
publish(ok): . (dataset) [refs/heads/master->origin-2:refs/heads/master [new branch]]
publish(ok): . (dataset) [refs/heads/git-annex->origin-2:refs/heads/git-annex [new branch]]
action summary:
copy (notneeded: 1)
publish (ok: 2)
$ git annex find --not --in origin-storage
remote origin-2:This repository is not initialized for use by git-annex, but /qg-10/data/AGR-QG/lell/test/datalad-chain/a/../ria/5d0/493e1-44eb-42eb-8777-29b0ea6e5b43/annex/objects/ exists, which indicates this repository was used by git-annex before, and may have lost its annex.uuid and annex.version configs. Either set back missing configs, or run git-annex init to initialize with a new uuid.
x
(The relevant part of the last command's output is the x at the very end, indicating that the file x was not uploaded to the RIA store annex. The warning before it might come from the config remote.origin-2.annex-ignore not being inherited from the a repo either.)
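A possible manual workaround, which I have not verified, would be to copy the two settings onto the renamed remote in the clone. The sketch below assumes that mirroring the values from a's "origin" is enough; the key names are taken from the config listings above, and the helper name restore_publish_config is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: set the two keys that the listings show as missing
# or different on the renamed remote of the clone.
#   datalad-publish-depends: push to STORAGE before pushing to REMOTE
#   annex-ignore: keep git-annex from using REMOTE itself as an annex
# usage: restore_publish_config REPO REMOTE STORAGE
restore_publish_config() {
    git -C "$1" config --local "remote.$2.datalad-publish-depends" "$3" &&
        git -C "$1" config --local "remote.$2.annex-ignore" true
}

# Example, run from the directory that holds the clone b:
#   restore_publish_config b origin-2 origin-storage
```

Whether datalad push --to origin-2 then uploads the annexed data would still need to be checked with git annex find --not --in origin-storage.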
For comparison: if we first push x from b to "origin" (-> a) and then from a to "origin" (-> ria), the ORA sibling of the RIA store is updated:
$ cd a
$ git config receive.denyCurrentBranch updateInstead
$ cd ../b
$ datalad push --to origin
$ cd ../a
$ datalad push --to origin
$ git annex find --not --in origin-storage
# -- no output, so x is uploaded --
What is the problem?
When a dataset is cloned from a RIA store and that clone is cloned again, the storage sibling of the RIA store is not named correctly and probably cannot be enabled... I think this is because datalad does not expect git annex to auto-rename remotes when the remote itself has a remote of the same name.
What steps will reproduce the problem?
DataLad information