Open mih opened 1 year ago
This had a follow-up in the office hour chat and in today's office hour. Out of multiple subdatasets, most were pushed to the RIA store without an issue, but two were not:
Push to 'ria-backup':
CommandError: 'git -c diff.ignoreSubmodules=none annex copy --batch -z --to ria-backup-storage --fast --json --json-error-messages --json-progress -c annex.dotfiles=true' failed with exitcode 1 under /media/(--redacted--) [info keys: stdout_json]
> to ria-backup-storage...
content changed while it was being sent
This could have failed because --fast is enabled. [733 times]
git-annex: copy: 733 failed
There was an earlier issue about the same error (https://github.com/datalad/datalad/issues/5613), which was solved by upgrading git-annex (to 10.20220128). Since the current issue was reported with an older version, we need to wait and see whether an update solves the problem.
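A quick way to tell whether an installation predates that release is to compare its version against 10.20220128. The following is a minimal sketch, not an official check: the function names `version_at_least` and `check_annex_version` are made up for illustration, and it assumes a `sort` that supports `-V` (GNU coreutils or a recent BSD).

```shell
#!/bin/sh
# Minimum git-annex version, taken from the thread (the upgrade that resolved #5613).
MIN_ANNEX_VERSION="10.20220128"

# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on version-aware sorting (sort -V): the smaller version sorts first.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: report whether a git-annex version string is new enough.
check_annex_version() {
    if version_at_least "$1" "$MIN_ANNEX_VERSION"; then
        echo "ok: git-annex $1"
    else
        echo "too old: git-annex $1 (want >= $MIN_ANNEX_VERSION)"
    fi
}

# Usage (requires git-annex on PATH):
#   check_annex_version "$(git annex version --raw)"
```

The version string is passed in as an argument so the comparison can be tried without git-annex installed; `git annex version --raw` prints only the version number.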
IMO there are two issues here:
The "content changed while it was being sent" issue, which has a thorough writeup in the form of https://github.com/datalad/datalad/issues/5613 and which ideally also solves this user's problem (UPDATE: user has confirmed that upgrading to the latest version of git-annex solved the problem).
Paraphrased steps followed by the user:
Create a RIA sibling (name: ria-backup, alias: ria-alias) on the external hard-drive (recursively, since the superdataset has nested subdatasets): datalad create-sibling-ria -s ria-backup --alias ria-alias --new-store-ok ria+file:///<path-to-location-on-external-hard-drive> -r
datalad push --to ria-backup -r
datalad get <relative-location-to-subdataset>
datalad push --to ria-backup
datalad update --merge
Most of the above is explained in https://handbook.datalad.org/en/latest/beyond_basics/101-147-riastores.html, but I think this compact use case can still stand on its own as a KBI.
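For a KBI, the first two steps (creating the sibling and the initial recursive push) could be wrapped in a small script. This is only a sketch under stated assumptions: `run` and `backup_to_ria` are hypothetical helper names, the `DRY_RUN` switch is an addition for previewing the commands, and the store URL placeholder is left exactly as the user gave it. The `datalad` invocations themselves are the ones quoted in the steps.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: back up a dataset hierarchy to a RIA store.
#   $1: store URL, e.g. ria+file:///<path-to-location-on-external-hard-drive>
# Must be called from the root of the superdataset.
backup_to_ria() {
    store_url="$1"
    # Create the sibling recursively, since the superdataset has nested subdatasets.
    run datalad create-sibling-ria -s ria-backup --alias ria-alias \
        --new-store-ok "$store_url" -r
    # Push the whole hierarchy to the new sibling.
    run datalad push --to ria-backup -r
}

# Usage (preview only, nothing is executed):
#   DRY_RUN=1; backup_to_ria "ria+file:///<path-to-location-on-external-hard-drive>"
```

The later steps (`datalad get`, the non-recursive `datalad push`, and `datalad update --merge`) are left out because the thread does not say in which dataset each was run.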
More traffic on this issue today: datalad status generated the error "Unknown commit identifier: master". Asked follow-up questions to the OP, no answer yet.
I have a Windows machine and will spend some time looking into RIA stores on NTFS.
User reported that this was no longer an issue for them (they no longer need that solution and won't be spending more time debugging it). So for the purpose of solving the user's problem, this issue is not needed anymore. But for the purpose of writing a KBI, this issue can remain open, pending a test on a Windows system and the KBI writeup.
Origin: DataLad office hour chat 2023-05-08
While performing a recursive get of a superdataset clone (multiple subdatasets) onto a crippledFS external harddrive, the user aborted the command and was left with modified dataset clones.
TODO (not necessarily to be performed in this order)
Capturing relevant pieces from my reply:
Instead of getting a nested hierarchy of a single version snapshot of your data, it would actually be a full backup (all data, all versions), and it would not suffer from the limitations of your hard-drive file system as much (unverified speculation).
The downside is that it won't look as pretty.
But this is our standard solution for collaboration (push/pull) using a location that is not ready for git-annex.
If you like papers more than online handbooks: https://doi.org/10.1038/s41597-022-01163-2
Roughly summarizing the difference between what you tried and what this different approach would mean:
This means you will work exclusively in your main dataset clone.
The resulting "RIA store" on the harddrive can be added to other existing clones as a remote, and they will be able to pull data from it. You would be able to continue to push data (new versions) onto the drive without having to replace/delete anything until you run out of space, at which point you can detect and clean up versions you no longer need.
RIA stores also support compressed archives, so your harddrive might last for quite a bit.
CAUTION: I am not aware of anyone having actually tried putting a RIA store on an external harddrive with a non-POSIX filesystem. I expect this to work, but there is no hard evidence for this claim.