dandi / dandisets

737 Dandisets, 812.2 TB total. DataLad super-dataset of all Dandisets from https://github.com/dandisets

update-from-backup --workers 5 crashed #208

Closed yarikoptic closed 2 years ago

yarikoptic commented 2 years ago

I haven't investigated whether this relates to the fresh merge of #205 (--workers), but after merging it I added --workers 5 and tried to run, and got a total meltdown:

(base) dandi@drogon:/mnt/backup/dandi/dandisets$ flock -E 0 -e -n /home/dandi/.run/backup2datalad-cron.lock bash -c '/mnt/backup/dandi/dandisets/tools/backups2datalad-update-cron'
>> python -m tools.backups2datalad -l WARNING --backup-root /mnt/backup/dandi --config tools/backups2datalad.cfg.yaml update-from-backup --workers 5 -e '000108$'
2022-06-16T19:11:27-0400 [WARNING ] dandi: A newer version (0.40.1) of dandi/dandi-cli is available. You are using 0.40.0
action summary:
  publish (notneeded: 2)
error: unknown option `format=%(objecttype):%(objectsize):%(path)'
usage: git ls-tree [<options>] <tree-ish> [<path>...]

    -d                    only show trees
    -r                    recurse into subtrees
    -t                    show trees when recursing
    -z                    terminate entries with NUL byte
    -l, --long            include object size
    --name-only           list only filenames
    --name-status         list only filenames
    --full-name           use full path names
    --full-tree           list entire tree; not just current directory (implies --full-name)
    --abbrev[=<n>]        use <n> digits to display object names

2022-06-16T19:11:40-0400 [WARNING ] backups2datalad: Command `git ls-tree -r '--format=%(objecttype):%(objectsize):%(path)' -z HEAD` [cwd=/mnt/backup/dandi/dandisets/000006] exited with return code 129
2022-06-16T19:11:40-0400 [ERROR   ] backups2datalad: Job failed on input <Dandiset 000006/draft>:
Traceback (most recent call last):
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/aioutil.py", line 204, in dowork
    outp = await func(inp)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 127, in update_dandiset
    return await self.set_dandiset_gh_metadata(dandiset, ds)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 238, in set_dandiset_gh_metadata
    stats, zarrstats = await self.get_dandiset_stats(ds)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 213, in get_dandiset_stats
    for filestat in await ds.get_file_stats():
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/adataset.py", line 178, in get_file_stats
    filedict[path] = replace(filedict[path], size=int(data["bytesize"]))
KeyError: 'sub-anm369962/sub-anm369962_ses-20170309.nwb'
action summary:
  publish (notneeded: 2)
error: unknown option `format=%(objecttype):%(objectsize):%(path)'
usage: git ls-tree [<options>] <tree-ish> [<path>...]

    -d                    only show trees
    -r                    recurse into subtrees
    -t                    show trees when recursing
    -z                    terminate entries with NUL byte
    -l, --long            include object size
    --name-only           list only filenames
    --name-status         list only filenames
    --full-name           use full path names
    --full-tree           list entire tree; not just current directory (implies --full-name)
    --abbrev[=<n>]        use <n> digits to display object names

2022-06-16T19:11:40-0400 [WARNING ] backups2datalad: Command `git ls-tree -r '--format=%(objecttype):%(objectsize):%(path)' -z HEAD` [cwd=/mnt/backup/dandi/dandisets/000007] exited with return code 129
2022-06-16T19:11:40-0400 [ERROR   ] backups2datalad: Job failed on input <Dandiset 000007/draft>:
Traceback (most recent call last):
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/aioutil.py", line 204, in dowork
    outp = await func(inp)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 127, in update_dandiset
    return await self.set_dandiset_gh_metadata(dandiset, ds)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 238, in set_dandiset_gh_metadata
    stats, zarrstats = await self.get_dandiset_stats(ds)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 213, in get_dandiset_stats
    for filestat in await ds.get_file_stats():
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/adataset.py", line 178, in get_file_stats
    filedict[path] = replace(filedict[path], size=int(data["bytesize"]))
KeyError: 'sub-BAYLORCD12/sub-BAYLORCD12_ses-20180125T191601.nwb'
action summary:
  publish (notneeded: 2)
action summary:
  publish (notneeded: 2)
action summary:
  publish (notneeded: 2)
error: unknown option `format=%(objecttype):%(objectsize):%(path)'
usage: git ls-tree [<options>] <tree-ish> [<path>...]

    -d                    only show trees
    -r                    recurse into subtrees
    -t                    show trees when recursing
    -z                    terminate entries with NUL byte
    -l, --long            include object size
    --name-only           list only filenames
    --name-status         list only filenames
    --full-name           use full path names
    --full-tree           list entire tree; not just current directory (implies --full-name)
    --abbrev[=<n>]        use <n> digits to display object names

error: unknown option `format=%(objecttype):%(objectsize):%(path)'
usage: git ls-tree [<options>] <tree-ish> [<path>...]

    -d                    only show trees
    -r                    recurse into subtrees
    -t                    show trees when recursing
    -z                    terminate entries with NUL byte
    -l, --long            include object size
    --name-only           list only filenames
    --name-status         list only filenames
    --full-name           use full path names
    --full-tree           list entire tree; not just current directory (implies --full-name)
    --abbrev[=<n>]        use <n> digits to display object names

error: unknown option `format=%(objecttype):%(objectsize):%(path)'
usage: git ls-tree [<options>] <tree-ish> [<path>...]

    -d                    only show trees
    -r                    recurse into subtrees
    -t                    show trees when recursing
    -z                    terminate entries with NUL byte
    -l, --long            include object size
    --name-only           list only filenames
    --name-status         list only filenames
    --full-name           use full path names
    --full-tree           list entire tree; not just current directory (implies --full-name)
    --abbrev[=<n>]        use <n> digits to display object names

2022-06-16T19:11:44-0400 [WARNING ] backups2datalad: Command `git ls-tree -r '--format=%(objecttype):%(objectsize):%(path)' -z HEAD` [cwd=/mnt/backup/dandi/dandisets/000004] exited with return code 129
2022-06-16T19:11:44-0400 [WARNING ] backups2datalad: Command `git ls-tree -r '--format=%(objecttype):%(objectsize):%(path)' -z HEAD` [cwd=/mnt/backup/dandi/dandisets/000003] exited with return code 129
2022-06-16T19:11:44-0400 [WARNING ] backups2datalad: Command `git ls-tree -r '--format=%(objecttype):%(objectsize):%(path)' -z HEAD` [cwd=/mnt/backup/dandi/dandisets/000005] exited with return code 129
2022-06-16T19:11:44-0400 [ERROR   ] backups2datalad: Job failed on input <Dandiset 000004/draft>:
Traceback (most recent call last):
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/aioutil.py", line 204, in dowork
    outp = await func(inp)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 127, in update_dandiset
    return await self.set_dandiset_gh_metadata(dandiset, ds)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 238, in set_dandiset_gh_metadata
    stats, zarrstats = await self.get_dandiset_stats(ds)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 213, in get_dandiset_stats
    for filestat in await ds.get_file_stats():
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/adataset.py", line 175, in get_file_stats
    async for line in p:
  File "/home/dandi/miniconda3/envs/dandisets/lib/python3.8/site-packages/anyio/abc/_streams.py", line 31, in __anext__
    return await self.receive()
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/aioutil.py", line 78, in receive
    raise RuntimeError(
RuntimeError: `git-annex find '--include=*' --json` [cwd=/mnt/backup/dandi/dandisets/000004] command suddenly exited with return code 0!
2022-06-16T19:11:44-0400 [ERROR   ] backups2datalad: Job failed on input <Dandiset 000003/draft>:
Traceback (most recent call last):
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/aioutil.py", line 204, in dowork
    outp = await func(inp)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 127, in update_dandiset
    return await self.set_dandiset_gh_metadata(dandiset, ds)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 238, in set_dandiset_gh_metadata
    stats, zarrstats = await self.get_dandiset_stats(ds)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 213, in get_dandiset_stats
    for filestat in await ds.get_file_stats():
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/adataset.py", line 178, in get_file_stats
    filedict[path] = replace(filedict[path], size=int(data["bytesize"]))
KeyError: 'sub-YutaMouse20/sub-YutaMouse20_ses-YutaMouse20-140321_behavior+ecephys.nwb'
2022-06-16T19:11:44-0400 [ERROR   ] backups2datalad: Job failed on input <Dandiset 000005/draft>:
Traceback (most recent call last):
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/aioutil.py", line 204, in dowork
    outp = await func(inp)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 127, in update_dandiset
    return await self.set_dandiset_gh_metadata(dandiset, ds)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 238, in set_dandiset_gh_metadata
    stats, zarrstats = await self.get_dandiset_stats(ds)
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/datasetter.py", line 213, in get_dandiset_stats
    for filestat in await ds.get_file_stats():
  File "/mnt/backup/dandi/dandisets/tools/backups2datalad/adataset.py", line 178, in get_file_stats
    filedict[path] = replace(filedict[path], size=int(data["bytesize"]))
KeyError: 'sub-anm184389/sub-anm184389_ses-20130207_behavior+ecephys.nwb'
^C

I have disabled the cron job. Please fix and test on drogon - there is a screen session

yarikoptic commented 2 years ago

This may also relate to a crash I got from a separate run for 000108 (screen's window 3) after it created some zarrs:

configure-sibling(ok): . (sibling)
action summary:
  configure-sibling (ok: 1)
  create_sibling_github (ok: 1)
fatal: Unable to create '/mnt/backup/dandi/dandisets/000108/.git/index.lock': File exists.

Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
2022-06-16T19:12:19-0400 [ERROR   ] backups2datalad: Dandiset 000108: sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-LEC_run-1_chunk-4_SPIM.json: download failed: git-annex: user error (xargs ["-0","git","--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","filter.annex.smudge=","-c","filter.annex.clean=","-c","filter.annex.process=","add","-f","--"] exited 123)
fatal: Unable to create '/mnt/backup/dandi/dandisets/000108/.git/index.lock': File exists.

Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
fatal: Unable to create '/mnt/backup/dandi/dandisets/000108/.git/index.lock': File exists.

Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.
2022-06-16T19:12:19-0400 [ERROR   ] backups2datalad: Dandiset 000108: sub-MITU01/ses-20220311h10m19s37/micr/sub-MITU01_ses-20220311h10m19s37_sample-22_stain-NN_run-1_chunk-7_SPIM.json: download failed: git-annex: user error (xargs ["-0","git","--git-dir=.git","--work-tree=.","--literal-pathspecs","-c","filter.annex.smudge=","-c","filter.annex.clean=","-c","filter.annex.process=","add","-f","--"] exited 123)
fatal: Unable to create '/mnt/backup/dandi/dandisets/000108/.git/index.lock': File exists.

Another git process seems to be running in this repository, e.g.
an editor opened by 'git commit'. Please make sure all processes
are terminated then try again. If it still fails, a git process
may have crashed in this repository earlier:
remove the file manually to continue.

jwodder commented 2 years ago

@yarikoptic Regarding the first error, it appears that the --format option to git-ls-tree was only added to Git very recently, in version 2.36.0, and the Git on drogon is not sufficiently up to date.
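
For illustration, here is a minimal sketch of how a tool could guard against this kind of mismatch. It is not the backups2datalad code: the names `parse_git_version`, `parse_ls_tree_long`, `TreeEntry`, and `list_tree` are hypothetical. It assumes the `-l`/`--long` option shown in the usage message above, which also reports object sizes and is available on Gits that lack `--format`.

```python
import re
import subprocess
from typing import Iterator, NamedTuple, Optional, Tuple

# First Git release that accepts `git ls-tree --format=...`
MIN_FORMAT_VERSION = (2, 36, 0)


class TreeEntry(NamedTuple):
    """Hypothetical record type for one `git ls-tree` entry."""

    objtype: str
    size: Optional[int]  # None when Git reports "-" (non-blob entries)
    path: str


def parse_git_version(text: str) -> Tuple[int, ...]:
    """Extract (major, minor, patch) from `git --version` output."""
    m = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if m is None:
        raise ValueError(f"cannot parse Git version from {text!r}")
    return tuple(int(g or 0) for g in m.groups())


def parse_ls_tree_long(output: bytes) -> Iterator[TreeEntry]:
    """Parse the output of `git ls-tree -r -l -z <tree-ish>`.

    Each NUL-terminated record looks like
    "<mode> <type> <object> <size>\\t<path>", where <size> is
    space-padded and is "-" for entries that are not blobs.
    """
    for record in output.split(b"\0"):
        if not record:
            continue
        meta, _, path = record.partition(b"\t")
        _mode, objtype, _sha, size = meta.split()
        yield TreeEntry(
            objtype.decode(),
            None if size == b"-" else int(size),
            path.decode("utf-8", "surrogateescape"),
        )


def list_tree(repo: str) -> Iterator[TreeEntry]:
    """List HEAD of `repo` using only options that older Gits accept."""
    r = subprocess.run(
        ["git", "ls-tree", "-r", "-l", "-z", "HEAD"],
        cwd=repo,
        capture_output=True,
        check=True,  # raise on a non-zero exit instead of parsing nothing
    )
    return parse_ls_tree_long(r.stdout)
```

A startup check could compare `parse_git_version(...)` on the output of `git --version` against `MIN_FORMAT_VERSION` and fail with a clear message, rather than letting every worker hit return code 129 separately.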

jwodder commented 2 years ago

@yarikoptic The second crash is definitely unrelated, but I can't figure out what would have caused it.

yarikoptic commented 2 years ago

let's consider this one addressed