dandi / backups2datalad

Mirror Dandisets as git-annex repositories
MIT License

Fresh update errored out #59

Closed (yarikoptic closed 2 weeks ago)

yarikoptic commented 2 months ago

Just got this from the cron job:

2024-08-12T10:02:13-0400 [WARNING ] dandi: A newer version (0.63.0) of dandi/dandi-cli is available. You are using 0.61.2                                                           
2024-08-12T10:02:54-0400 [ERROR   ] backups2datalad: Dandiset 001089: README: download failed: CHECKURL failed with no reason given                                                 
2024-08-12T10:02:55-0400 [ERROR   ] backups2datalad: Dandiset 001089: dataset_description.json: download failed: CHECKURL failed with no reason given                               
addurl: 2 failed                                                                                                                                                                    
2024-08-12T10:02:55-0400 [ERROR   ] backups2datalad: Job failed on input <Dandiset 001089/draft>:                                                                                   
Traceback (most recent call last):                                                                                                                                                  
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/aioutil.py", line 177, in dowork                                                       
    outp = await func(inp)                                                                                                                                                          
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/datasetter.py", line 192, in update_dandiset                                           
    changed = await self.sync_dataset(dandiset, ds, dmanager)                                                                                                                       
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/datasetter.py", line 244, in sync_dataset                                              
    await syncer.sync_assets()                                                                                                                                                      
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/syncer.py", line 87, in sync_assets                                                    
    report.check()                                                                                                                                                                  
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/asyncer.py", line 100, in check                                                        
    raise RuntimeError(                                                                                                                                                             
RuntimeError: Errors occurred while downloading: 2 assets failed to download                                                                                                        
2024-08-12T10:02:55-0400 [ERROR   ] backups2datalad: An error occurred:                                                                                                             
Traceback (most recent call last):                                                                                                                                                  
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/__main__.py", line 119, in wrapped                                                     
    await f(datasetter, *args, **kwargs)                                                                                                                                            
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/__main__.py", line 229, in update_from_backup                                          
    await datasetter.update_from_backup(dandisets, exclude=exclude)                                                                                                                 
  File "/home/dandi/miniconda3/envs/dandisets-2/lib/python3.10/site-packages/backups2datalad/datasetter.py", line 135, in update_from_backup                                        
    raise RuntimeError(                                                                                                                                                             
RuntimeError: Backups for 1 Dandiset failed                                                                                                                                         
Logs saved to /mnt/backup/dandi/dandisets/.git/dandi/backups2datalad/2024.08.12.14.02.12Z.log                                                                                       
action summary:                                                                                                                                                                     
  publish (notneeded: 2)

jwodder commented 2 months ago

@yarikoptic This is because https://github.com/dandi/dandi-archive/issues/1996 is still unresolved.

yarikoptic commented 2 months ago

Ah. So somewhere on the datalad side there is insufficient communication back to git-annex. Otherwise, instead of "CHECKURL failed with no reason given", we could get something like "CHECKURL failed because of needing authentication" or similar. Let's wait a day or two for a possible resolution on the dandi-archive side.
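To illustrate the kind of reporting meant here, below is a minimal sketch of how a special remote could attach a reason to its failure reply. It assumes the git-annex external special remote protocol accepts a message after CHECKURL-FAILURE, which the "no reason given" wording suggests. The helper name `checkurl_failure_reply` and the status-to-reason table are hypothetical and are not taken from datalad.

```python
from typing import Optional

# Hypothetical mapping from an HTTP status to a human-readable reason.
REASONS = {
    401: "authentication required",
    403: "access forbidden (possibly an embargoed asset)",
    404: "URL not found",
}


def checkurl_failure_reply(status: Optional[int]) -> str:
    """Build a CHECKURL-FAILURE reply line, with a reason when one is known."""
    if status is None:
        # Nothing is known about the failure, so no reason can be given.
        return "CHECKURL-FAILURE"
    reason = REASONS.get(status, f"HTTP status {status}")
    # The protocol is line-based, so keep the whole reply on one line.
    return f"CHECKURL-FAILURE {reason}".replace("\n", " ")


print(checkurl_failure_reply(403))
```

With a reply like this, the ERROR lines in the log above could name the cause instead of reporting that no reason was given.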

Just to make sure -- we have no easy way ATM to skip embargoed dandisets generally right? and in general not all need to be skipped really since it is only in the case of text files which we are trying to add directly to git (instead of git-annex) we do try to get their content and then fail this way. Right?

jwodder commented 2 months ago

@yarikoptic

> we have no easy way ATM to skip embargoed dandisets generally right?

Correct.

> and in general not all need to be skipped really since it is only in the case of text files which we are trying to add directly to git (instead of git-annex) we do try to get their content and then fail this way. Right?

Yes, though this problem only seems to occur for 001089, so we could just exclude that Dandiset for now.
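As a rough sketch of what excluding that Dandiset could look like: the cron wrapper already passes `-e '000(026|108|243)$'` to `update-from-backup`, so the pattern could be extended with one more alternative. The snippet assumes the pattern is applied as a regular-expression search against the Dandiset identifier. The function name `select_dandisets` is hypothetical and only shows the matching.

```python
import re

# Exclusion pattern from the cron invocation, extended with 001089.
# Assumption: the pattern is searched against each Dandiset identifier.
EXCLUDE = re.compile(r"(000(026|108|243)|001089)$")


def select_dandisets(identifiers, exclude=EXCLUDE):
    """Return the identifiers that do not match the exclusion pattern."""
    return [ident for ident in identifiers if not exclude.search(ident)]


print(select_dandisets(["000026", "000027", "001089", "001090"]))
```

Under that assumption, 001089 would be skipped alongside the three Dandisets that are already excluded, and removing the extra alternative would bring it back.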

yarikoptic commented 2 weeks ago

001089 did fine with --mode verify now, so closing:

(dandisets-2) dandi@drogon:/mnt/backup/dandi/dandisets$ duct flock -E 0 -e -n /home/dandi/.run/backup2datalad-cron-nonzarr.lock bash -c '/mnt/backup/dandi/dandisets/tools/backups2datalad-update-cron --mode verify 001089'
duct is executing flock -E 0 -e -n /home/dandi/.run/backup2datalad-cron-nonzarr.lock bash -c /mnt/backup/dandi/dandisets/tools/backups2datalad-update-cron --mode verify 001089...
Log files will be written to .duct/logs/2024.10.17T14.30.14-858882_
Logs saved to /mnt/backup/dandi/dandisets/.git/dandi/backups2datalad/2024.10.17.18.30.24Z.log
publish(ok): . (dataset) [refs/heads/draft->github:refs/heads/draft f035485..39d6098]
action summary:
  publish (notneeded: 1, ok: 1)
Already up to date.
publish(ok): . (dataset) [refs/heads/draft->github:refs/heads/draft d343483..00035ec]
action summary:
  publish (notneeded: 1, ok: 1)
Exit Code: 0
Command: flock -E 0 -e -n /home/dandi/.run/backup2datalad-cron-nonzarr.lock bash -c /mnt/backup/dandi/dandisets/tools/backups2datalad-update-cron --mode verify 001089
Log files location: .duct/logs/2024.10.17T14.30.14-858882_
Wall Clock Time: 43.454 sec sec
Memory Peak Usage (RSS): 46072 KiB
Memory Average Usage (RSS): 116366.952 KiB
Virtual Memory Peak Usage (VSZ): 61372 KiB
Virtual Memory Average Usage (VSZ): 205054411.810 KiB
Memory Peak Percentage: 0.0%
Memory Average Percentage: 0.067%
CPU Peak Usage: 133.3%
Average CPU Usage: 58.219%
Samples Collected: 42
Reports Written: 1

> '[' -s .git/tmp/stderr-2 ']'
> echo 'There was stderr from backup/datalad invocations:'
> cat .git/tmp/stderr-2
There was stderr from backup/datalad invocations:
> backups2datalad -l WARNING --backup-root /mnt/backup/dandi --config tools/backups2datalad.cfg.yaml update-from-backup --workers 5 -e '000(026|108|243)$' --mode verify 001089
> grep -v 'nothing to save, working tree clean'
> git pull --commit --no-edit
> datalad push -J 5