dandi / backups2datalad

Mirror Dandisets as git-annex repositories
MIT License

Uncommitted 000977 backup in `/mnt/backup/dandi/dandisets` on drogon #44

Closed jwodder closed 5 months ago

jwodder commented 5 months ago

The /mnt/backup/dandi/dandisets superdataset on drogon currently contains a 000977/ submodule (000977 being an embargoed Dandiset) that is not tracked by the superdataset. Should something be done about this?

CC @yarikoptic
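
For spotting this kind of state in general, here is a minimal sketch (not part of backups2datalad; the function names `parse_gitlinks` and `find_untracked_repos` are hypothetical). It compares the subdirectories that look like Git repositories against the gitlink entries (mode `160000`) that `git ls-files --stage` reports for the superdataset. It assumes plain directory names such as Dandiset IDs, so it does not handle the path quoting that `git ls-files` applies to unusual filenames.

```python
from pathlib import Path


def parse_gitlinks(ls_files_stage_output: str) -> set[str]:
    """Return paths recorded as gitlinks (mode 160000) in `git ls-files --stage` output."""
    tracked = set()
    for line in ls_files_stage_output.splitlines():
        if not line.strip():
            continue
        # Each line has the form "<mode> <object> <stage>\t<path>".
        meta, _, path = line.partition("\t")
        if meta.split()[0] == "160000":
            tracked.add(path)
    return tracked


def find_untracked_repos(superdataset: Path, tracked: set[str]) -> list[str]:
    """List top-level subdirectories that contain a .git entry but are not tracked gitlinks."""
    return sorted(
        child.name
        for child in superdataset.iterdir()
        if child.is_dir() and (child / ".git").exists() and child.name not in tracked
    )
```

One way to use it would be to capture the output of `git ls-files --stage` run inside the superdataset (for example with `subprocess.run(..., capture_output=True, text=True)`), pass it to `parse_gitlinks`, and then call `find_untracked_repos` with the superdataset path. Any name it returns is a repository directory that the superdataset does not track.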

yarikoptic commented 5 months ago

Ideally, we would troubleshoot how that could have happened and prevent it from happening in the future.

The initial email, pointing to a sequence of failing attempts:

```shell
Date: Tue, 30 Apr 2024 18:02:51 -0400
From: Cron Daemon
To: dandi@drogon.datalad.org
Subject: Cron chronic flock -E 0 -e -n /home/dandi/.run/backup2datalad-cron-nonzarr.lock bash -c '/mnt/backup/dandi/dandisets/tools/backups2datalad-update-cron'

2024-04-30T18:02:48-0400 [WARNING ] backups2datalad: Failed [rc=1]: git -c receive.autogc=0 -c gc.auto=0 annex initremote dandi-dandisets-dropbox type=external externaltype=rclone chunk=1GB target=dandi-dandisets-dropbox prefix=dandi-dandisets/annexstore embedcreds=no uuid=727f466f-60c3-4778-90b2-b2332856c2f8 encryption=none [cwd=/mnt/backup/dandi/dandisets/000977]
Stdout:
initremote dandi-dandisets-dropbox failed
Stderr:
2024/04/30 18:02:36 ERROR : Attempt 1/3 failed with 1 errors and: invalid character '<' looking for beginning of value
2024/04/30 18:02:47 ERROR : Attempt 2/3 failed with 1 errors and: invalid character '<' looking for beginning of value
2024/04/30 18:02:48 ERROR : Attempt 3/3 failed with 1 errors and: invalid character '<' looking for beginning of value
2024/04/30 18:02:48 Failed to mkdir: invalid character '<' looking for beginning of value
git-annex: Failed to create directory on remote. Ensure that 'rclone config' has been run.
initremote: 1 failed
2024-04-30T18:02:48-0400 [ERROR ] backups2datalad: Job failed on input :
Traceback (most recent call last):
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/aioutil.py", line 177, in dowork
    outp = await func(inp)
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/datasetter.py", line 136, in update_dandiset
    ds = await self.init_dataset(
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/datasetter.py", line 108, in init_dataset
    await ds.ensure_installed(
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/adataset.py", line 97, in ensure_installed
    await self.call_annex(
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/adataset.py", line 237, in call_annex
    await aruncmd("git", *GIT_OPTIONS, "annex", *args, cwd=self.path, **kwargs)
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/aioutil.py", line 224, in aruncmd
    raise e
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/aioutil.py", line 206, in aruncmd
    r = await anyio.run_process(argstrs, **kwargs)
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/anyio/_core/_subprocesses.py", line 85, in run_process
    raise CalledProcessError(cast(int, process.returncode), command, output, errors)
subprocess.CalledProcessError: Command '['git', '-c', 'receive.autogc=0', '-c', 'gc.auto=0', 'annex', 'initremote', 'dandi-dandisets-dropbox', 'type=external', 'externaltype=rclone', 'chunk=1GB', 'target=dandi-dandisets-dropbox', 'prefix=dandi-dandisets/annexstore', 'embedcreds=no', 'uuid=727f466f-60c3-4778-90b2-b2332856c2f8', 'encryption=none']' returned non-zero exit status 1.
2024-04-30T18:02:50-0400 [ERROR ] backups2datalad: An error occurred:
Traceback (most recent call last):
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/__main__.py", line 119, in wrapped
    await f(datasetter, *args, **kwargs)
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/__main__.py", line 228, in update_from_backup
    await datasetter.update_from_backup(dandisets, exclude=exclude)
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/datasetter.py", line 94, in update_from_backup
    raise RuntimeError(
RuntimeError: Backups for 1 Dandiset failed
Logs saved to /mnt/backup/dandi/dandisets/.git/dandi/backups2datalad/2024.04.30.22.02.11Z.log
action summary:
  publish (notneeded: 2)
publish(ok): . (dataset) [refs/heads/draft->github:refs/heads/draft 669eadf..3769aee]
publish(ok): . (dataset) [refs/heads/git-annex->github:refs/heads/git-annex c3ae75e..c374f1e]
action summary:
  publish (ok: 2)
```

So some rclone command must have failed, and then HTML was channeled into git-annex or something like that :-/
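
The `invalid character '<' looking for beginning of value` wording is what a Go JSON decoder reports when the input starts with `<`, which fits the guess that an HTML page came back where JSON was expected. The following is only a Python illustration of that failure mode, not rclone's or git-annex's code; `parse_json_body` is a hypothetical helper.

```python
import json


def parse_json_body(body: str) -> object:
    """Parse a response body as JSON, flagging HTML-looking bodies explicitly."""
    try:
        return json.loads(body)
    except json.JSONDecodeError as exc:
        # A leading '<' suggests an HTML/XML error page rather than a JSON payload.
        if body.lstrip().startswith("<"):
            raise ValueError("expected JSON but the body looks like HTML/XML") from exc
        raise
```

Feeding it something like `"<html><body>Service Unavailable</body></html>"` raises the `ValueError`, while a real JSON payload parses normally.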

We already have https://github.com/dandisets/000977 but I think it would be ok to

jwodder commented 5 months ago

@yarikoptic I have renamed the GitHub repository to 000977.orig and moved the backup directory on drogon to /mnt/backup/dandi/000977.orig. The next run of the backup script should fix things.

jwodder commented 5 months ago

@yarikoptic The 000977 backup has now been recreated and is committed to the superdataset this time. May I delete the *.orig backup-backups?

jwodder commented 5 months ago

@yarikoptic Ping.

yarikoptic commented 5 months ago

Thank you, @jwodder! Yes, please remove them and close the issue with that.

jwodder commented 5 months ago

000977.orig backups deleted.