Closed: jwodder closed this 2 years ago
Merging #258 (966494a) into draft (da50a74) will decrease coverage by 0.11%. The diff coverage is 87.80%.
```diff
@@            Coverage Diff             @@
##            draft     #258      +/-   ##
==========================================
- Coverage   77.73%   77.61%    -0.12%
==========================================
  Files          14       14
  Lines        2021     2046       +25
  Branches      343      346        +3
==========================================
+ Hits         1571     1588       +17
- Misses        318      326        +8
  Partials      132      132
```
| Impacted Files | Coverage Δ | |
|---|---|---|
| tools/backups2datalad/aioutil.py | 82.35% <ø> (ø) | |
| tools/backups2datalad/syncer.py | 78.46% <40.00%> (-3.80%) | :arrow_down: |
| tools/backups2datalad/util.py | 78.51% <80.00%> (-1.19%) | :arrow_down: |
| tools/backups2datalad/datasetter.py | 77.35% <85.71%> (+0.40%) | :arrow_up: |
| tools/backups2datalad/adataset.py | 79.24% <100.00%> (-1.57%) | :arrow_down: |
| tools/backups2datalad/asyncer.py | 79.71% <100.00%> (+0.16%) | :arrow_up: |
| tools/backups2datalad/zarr.py | 78.42% <100.00%> (+0.56%) | :arrow_up: |
| ... and 2 more | | |
@jwodder I think this PR is largely "ready", but there are two remaining issues that are picked up by the tests:

`test_backup_command`
```
>       assert repo.get_commitish_hash("HEAD") == last_commit
E       AssertionError: assert 'ac1a2c07f6af...f22713713acb1' == '47d79ea5d5d1...998fc3b1f3303'
E         - 47d79ea5d5d119fa654cb51fd8e998fc3b1f3303
E         + ac1a2c07f6af64904c230f2001af22713713acb1

test_backups2datalad/test_commands.py:82: AssertionError
```
where it seems the repo gets one more commit, like:
```
(git-annex)lena:~/.tmp/pytest-of-yoh/pytest-189/test_backup_command0/ds/203955[draft]git-annex
$> git show
commit 900fe057d2a9085e19b9586c5817dc07387767ff (HEAD -> draft)
Author: DANDI User <info@dandiarchive.org>
Date:   Thu Sep 8 01:30:40 2022 +0000

    [backups2datalad] Only some metadata updates

diff --git a/.dandi/assets-state.json b/.dandi/assets-state.json
index 9b6b320..8532fc8 100644
--- a/.dandi/assets-state.json
+++ b/.dandi/assets-state.json
@@ -1,3 +1,3 @@
 {
-    "timestamp": "2022-09-08T01:30:30.222889+00:00"
+    "timestamp": "2022-09-08T01:30:40.189119+00:00"
 }
```
`test_backup_zarr_delete_zarr`

where it fails since it seems to garbage-collect that sample.zarr.

Any ideas on what could lead to these, or better yet, fixes?
edit: also please review my changes @jwodder
@yarikoptic For the first problem, this seems to be caused by the same phenomenon described in the comment above. Thus, when `state.timestamp < d.version.modified`, the `assert repo.get_commitish_hash("HEAD") == last_commit` test should instead check that the parent commit of `HEAD` equals `last_commit`.
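For illustration, a minimal sketch of what that relaxed assertion could look like, reusing `repo.get_commitish_hash()` from the failing test (the assumption that at most one extra metadata commit appears is mine, not something stated in the thread):

```python
# Sketch: tolerate exactly one extra metadata-update commit on top of the
# previously recorded commit. "HEAD^" is standard git revision syntax for
# the first parent of HEAD.
head = repo.get_commitish_hash("HEAD")
if head != last_commit:
    assert repo.get_commitish_hash("HEAD^") == last_commit
```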
I'm looking into the second problem.
@yarikoptic Preliminary review comment: You didn't enable pre-commit, so your code wasn't formatted with black.
@yarikoptic For `test_backup_zarr_delete_zarr`, the deleted Zarr is getting garbage-collected because the function for finding local assets doesn't pick up uninstalled subdatasets. As a result, the Zarr is not removed from the backup in response to the Zarr being deleted from the Dandiset on the server, so the asset garbage collection gets it instead. One option for handling this would be to add all subdataset paths as assets in `dataset_files()`.
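To sketch that option (the helper name is hypothetical, and reading `.gitmodules` directly is my assumption rather than the actual backups2datalad approach): subdataset paths can be listed even when the subdatasets are uninstalled, since they still have `.gitmodules` entries.

```python
import subprocess
from pathlib import Path

def subdataset_paths(dspath: Path) -> list[Path]:
    """List subdataset paths recorded in .gitmodules, installed or not."""
    # `git config -f .gitmodules` reads the file directly, so it works even
    # when the submodules/subdatasets are not present on disk.
    out = subprocess.run(
        ["git", "config", "-f", ".gitmodules",
         "--get-regexp", r"^submodule\..*\.path$"],
        cwd=dspath, capture_output=True, text=True, check=True,
    ).stdout
    # Each output line looks like: "submodule.<name>.path <relative/path>"
    return [dspath / line.split(None, 1)[1] for line in out.splitlines()]
```

`dataset_files()` could then yield these paths alongside the regular files, so that a Zarr deleted on the server is matched against its local subdataset path instead of being treated as an orphan.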
> @yarikoptic For the first problem, this seems to be caused by the same phenomenon described in the comment above:
thanks! confirming -- this test indeed fails even on the `draft` branch -- it failed 2 times out of 10 runs of `for s in {1..10}; do echo $s; .nox/test/bin/python -m pytest --no-cov -v --show-capture=no -k test_backup_command; done`. It failed 4 times on this branch, which is a higher "hit rate", but that is ok -- the point is that the behavior is indeed such that we can get the timestamp updated, and the test should have reflected that! I will fix it.
woohoo -- we are green! I have taken this PR out of draft, disabled the cron job, and will test it merged into `draft` on sample dandisets to see if there are any side effects or anything else. @jwodder please have a look at the latest commits or the overall diff to check whether all is good now.
Closes #256.
TODO:

- `AsyncDataset.get_stats()` needs to be updated to no longer traverse the subdatasets within the superdataset; instead, it needs to get the Zarr ID by parsing the URL from `.gitmodules` and then traverse the Zarr dataset with that ID in `zarr_root`. This will involve passing the `Manager` or `BackupConfig` to the method. (See the sketch after this list.)
- `SampleDandiset.check_backup()` and related methods in `conftest.py` need to check that the Zarr subdatasets are present but not cloned in the Dandiset datasets.
- `datalad foreach-dataset --jobs 10 -s --cmd-type eval -R 1 --chpwd pwd 'ds.uninstall(check=False, recursive=False)'`
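As a rough illustration of the first item, here is a sketch under the assumption that each Zarr subdataset URL ends in the Zarr ID; the function name and return shape are hypothetical, not the actual `AsyncDataset` API:

```python
import subprocess
from pathlib import Path
from urllib.parse import urlparse

def zarr_ids_from_gitmodules(dspath: Path) -> dict[str, str]:
    """Map each subdataset path in .gitmodules to the Zarr ID in its URL."""
    out = subprocess.run(
        ["git", "config", "-f", ".gitmodules", "--get-regexp", r"^submodule\."],
        cwd=dspath, capture_output=True, text=True, check=True,
    ).stdout
    entries: dict[str, dict[str, str]] = {}
    for line in out.splitlines():
        key, _, value = line.partition(" ")
        # Keys look like "submodule.<name>.path"; <name> may itself contain
        # dots, so split off only the last component.
        prefix, field = key.rsplit(".", 1)
        name = prefix[len("submodule."):]
        entries.setdefault(name, {})[field] = value
    return {
        ent["path"]: urlparse(ent["url"]).path.rstrip("/").rsplit("/", 1)[-1]
        for ent in entries.values()
        if "path" in ent and "url" in ent
    }
```

With the IDs in hand, `get_stats()` could then locate and traverse the corresponding datasets under `zarr_root` rather than the (possibly uninstalled) subdatasets themselves.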