dandi / backups2datalad

Mirror Dandisets as git-annex repositories
MIT License

400 response while backing up a zarr #11

Closed yarikoptic closed 9 months ago

yarikoptic commented 9 months ago

A multi-week backup of 000108 finally finished with a crash:

> flock -E 0 -e -n /home/dandi/.run/backup2datalad-cron.lock bash -x '/mnt/backup/dandi/dandisets/tools/backups2datalad-update-cron-zarrs'
...
2023-12-16T21:51:14-0500 [ERROR   ] backups2datalad: Job failed on input <Dandiset 000108/draft>:
  + Exception Group Traceback (most recent call last):
  |   File "/mnt/backup/dandi/dandisets/tools/backups2datalad/aioutil.py", line 174, in dowork
  |   File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 138, in update_dandiset
  |   File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 183, in sync_dataset
  |   File "/mnt/backup/dandi/dandisets/tools/backups2datalad/syncer.py", line 35, in sync_assets
  |   File "/mnt/backup/dandi/dandisets/tools/backups2datalad/asyncer.py", line 517, in async_assets
  |   File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 664, in __aexit__
  |     raise BaseExceptionGroup(
  | exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
  +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    |   File "/mnt/backup/dandi/dandisets/tools/backups2datalad/zarr.py", line 588, in sync_zarr
    |   File "/mnt/backup/dandi/dandisets/tools/backups2datalad/manager.py", line 94, in set_zarr_description
    |   File "/mnt/backup/dandi/dandisets/tools/backups2datalad/manager.py", line 50, in _set_github_description
    |   File "/mnt/backup/dandi/dandisets/tools/backups2datalad/manager.py", line 121, in edit_repo
    |   File "/mnt/backup/dandi/dandisets/tools/backups2datalad/aioutil.py", line 130, in arequest
    |   File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/httpx/_models.py", line 758, in raise_for_status
    |     raise HTTPStatusError(message, request=request, response=self)
    | httpx.HTTPStatusError: Client error '400 Bad Request' for url 'https://api.github.com/repos/dandizarrs/cf33b1a9-3c3c-4676-8ff7-b14c2762de4d'
    | For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/400
    +------------------------------------

000108 was left in a dirty state: many new .json files are staged, and assets.json has been updated to include those .json files and the corresponding .zarr submodules, although none of the submodules has been added yet (not necessarily a bug, just describing the behaviour ATM):

dandi@drogon:/mnt/backup/dandi/dandisets/000108$ git status sub-U01hm15x/ses-20220907h15m24s27/micr/sub-U01hm15x_ses-20220907h15m24s27_sample-mEhm11206x15R5_YO_stain-YO_run-1_chunk-1_SPIM*
On branch draft
Your branch is up to date with 'github/draft'.

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        new file:   sub-U01hm15x/ses-20220907h15m24s27/micr/sub-U01hm15x_ses-20220907h15m24s27_sample-mEhm11206x15R5_YO_stain-YO_run-1_chunk-1_SPIM.json

dandi@drogon:/mnt/backup/dandi/dandisets/000108$ grep sub-U01hm15x/ses-20220907h15m24s27/micr/sub-U01hm15x_ses-20220907h15m24s27_sample-mEhm11206x15R5_YO_stain-YO_run-1_chunk-1_SPIM .gitmodules .git/config .dandi/assets.json
.dandi/assets.json:"sub-U01hm15x/ses-20220907h15m24s27/micr/sub-U01hm15x_ses-20220907h15m24s27_sample-mEhm11206x15R5_YO_stain-YO_run-1_chunk-1_SPIM.json",
.dandi/assets.json:"sub-U01hm15x/ses-20220907h15m24s27/micr/sub-U01hm15x_ses-20220907h15m24s27_sample-mEhm11206x15R5_YO_stain-YO_run-1_chunk-1_SPIM.json",
.dandi/assets.json:"sub-U01hm15x/ses-20220907h15m24s27/micr/sub-U01hm15x_ses-20220907h15m24s27_sample-mEhm11206x15R5_YO_stain-YO_run-1_chunk-1_SPIM.ome.zarr",
.dandi/assets.json:"sub-U01hm15x/ses-20220907h15m24s27/micr/sub-U01hm15x_ses-20220907h15m24s27_sample-mEhm11206x15R5_YO_stain-YO_run-1_chunk-1_SPIM.ome.zarr",
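
For checking this kind of dirty state across many paths at once, a small helper along these lines could list what is staged as new. This is only a sketch and not part of backups2datalad: the function name `staged_new_files` and the sample paths are made up for illustration, and it assumes the plain v1 output of `git status --porcelain` (two status characters, a space, then the path).

```python
def staged_new_files(porcelain_output):
    """Return paths that are staged as new files.

    Expects the text printed by `git status --porcelain` (v1 format).
    An "A" in the first status column means the path was added to the
    index. Paths that git quotes (special characters) are returned as-is,
    quotes included.
    """
    paths = []
    for line in porcelain_output.splitlines():
        # Layout is "XY PATH": X is the index status, Y the worktree status.
        if len(line) > 3 and line[0] == "A":
            paths.append(line[3:])
    return paths


# Hypothetical sample, not taken from the real 000108 repository.
sample = (
    "A  sub-01/micr/sub-01_SPIM.json\n"
    " M .dandi/assets.json\n"
    "?? notes.txt\n"
)
print(staged_new_files(sample))
```

On a real checkout the input could come from `subprocess.run(["git", "status", "--porcelain"], capture_output=True, text=True, cwd=repo_path).stdout`, where `repo_path` is whatever dataset directory is being inspected.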

I guess it would be valuable to figure out what caused that 400... maybe it is similar to the issue encountered in datalad-installer, due to some recent changes in GitHub behavior?
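
One way to get more to go on next time is to log the response details when the request fails, since the traceback only shows the status line and URL. The following is a rough sketch under assumptions, not the actual backups2datalad code: `describe_http_error` is a hypothetical helper, and it only relies on the exception exposing `.request` and `.response` the way `httpx.HTTPStatusError` does. The header names are the ones GitHub normally sends; the stand-in values at the bottom are invented.

```python
from types import SimpleNamespace


def describe_http_error(exc):
    """Summarize a failed HTTP request for a log message.

    Assumes `exc` has `.request` and `.response` attributes shaped like
    those on httpx.HTTPStatusError. Only attributes are read, so any
    object with the same shape works.
    """
    response = exc.response
    headers = response.headers
    lines = [
        f"{exc.request.method} {exc.request.url} -> {response.status_code}",
        f"request id: {headers.get('x-github-request-id', 'n/a')}",
        f"rate limit remaining: {headers.get('x-ratelimit-remaining', 'n/a')}",
        # Error responses from GitHub usually carry a JSON "message" in the
        # body; truncate it in case something unexpectedly large comes back.
        f"body: {response.text[:500]}",
    ]
    return "\n".join(lines)


# Stand-in objects with invented values, just to show the output shape.
fake_exc = SimpleNamespace(
    request=SimpleNamespace(
        method="PATCH", url="https://api.github.com/repos/OWNER/REPO"
    ),
    response=SimpleNamespace(
        status_code=400,
        headers={"x-github-request-id": "EXAMPLE-ID"},
        text='{"message": "example error text"}',
    ),
)
print(describe_http_error(fake_exc))
```

With real httpx this would sit in an `except httpx.HTTPStatusError as exc:` block around the failing call, logging `describe_http_error(exc)` before re-raising.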

edit: note: 000108 still has not had its git status updated since July, because each update took too long and then crashed for one reason or another

dandi@drogon:/mnt/backup/dandi/dandisets$ git -C 000108 log | head -n 100
commit 2024ef04940f0c9a25b8ac9d87b9b39c2e66c4a0
Author: DANDI User <info@dandiarchive.org>
Date:   Thu Jul 27 15:06:39 2023 +0000

    [backups2datalad] Only some metadata updates

commit 1b669091c0ff9e2c7dcaebc5d35b4f0999985b2b
Author: DANDI User <info@dandiarchive.org>
Date:   Tue Jun 20 00:56:26 2023 +0000

    [backups2datalad] Only some metadata updates

jwodder commented 9 months ago

Possible causes I can think of, in no particular order:

> may be similarly to the issue encountered in datalad-installer due to some recent changes in github behavior?

No, that involved cross-origin redirects, which updates to repository metadata should not be responding with.

yarikoptic commented 9 months ago

ok, since this is AFAIK a first occurrence let's consider nothing to be done on our end and I will