yarikoptic closed this 7 months ago
@yarikoptic My investigations found the following Archive bugs which are probably to blame:
@yarikoptic The Archive bugs appear to be fixed now. Are you still having any problems deleting?
it seems to look consistent and matches the number now:
❯ dandi delete dandi://dandi/000876/code/venvs/
Delete 28259 assets on server from Dandiset 000876? [y/N]: N
2024-04-01 12:31:50,822 [ INFO] Logs saved in /home/yoh/.local/state/dandi-cli/log/20240401160956Z-640400.log
❯ dandi delete dandi://dandi/000876/code/venvs/
Delete 28259 assets on server from Dandiset 000876? [y/N]:
I am running it now.
Summary: 28259 Deleted
but I still see https://dandiarchive.org/dandiset/000876/draft/files?location=code%2Fvenvs&page=1 in the browser ... rerunning gives me
❯ dandi delete dandi://dandi/000876/code/venvs/
Delete 4145 assets on server from Dandiset 000876? [y/N]:
so it seems not to be working "fully". Running it again... but it seems that the issue is not really resolved, thus reopening
and after that --
❯ dandi delete dandi://dandi/000876/code/venvs/
Delete 636 assets on server from Dandiset 000876? [y/N]:
FTR the prior log was /home/yoh/.local/state/dandi-cli/log/20240401183902Z-667577.log
❯ grep 'Response: ' /home/yoh/.local/state/dandi-cli/log/20240401183902Z-667577.log | sed -e 's,.*:,,g' | sort | uniq -c
44 200
4145 204
so we got 4145 responses with status 204... and in the previous run:
❯ grep 'Response: ' /home/yoh/.local/state/dandi-cli/log/20240401163152Z-648330.log | sed -e 's,.*:,,g' | sort | uniq -c
285 200
28259 204
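For anyone who wants to repeat this tally without the grep/sed pipeline, here is a minimal Python sketch. It assumes each relevant log line contains the text "Response: " followed by a three-digit HTTP status code; that shape is inferred from the pipeline above, not taken from dandi-cli documentation, and count_response_codes is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumed log line shape: "... Response: 204" (inferred from the grep/sed pipeline).
RESPONSE_RE = re.compile(r"Response:\s*(\d{3})\b")


def count_response_codes(lines):
    """Count HTTP status codes found on 'Response: NNN' lines."""
    counts = Counter()
    for line in lines:
        match = RESPONSE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It accepts any iterable of lines, so an open log file can be passed directly, e.g. count_response_codes(open(path)).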
Maybe not all of them were actually removed on the server side? I thought I would find duplicate DELETE requests between the two runs, but no:
❯ grep 'DELETE https://api.dandiarchive.org/api/dandisets/000876/versions/draft/' /home/yoh/.local/state/dandi-cli/log/20240401163152Z-648330.log /home/yoh/.local/state/dandi-cli/log/20240401183902Z-667577.log | sed -e 's,.*DELETE,,g' | sort | uniq -c | sort -n | nl | tail -n 3
32402 1 https://api.dandiarchive.org/api/dandisets/000876/versions/draft/assets/fff6ae06-d8b1-4fb2-88c2-91231583fec5/
32403 1 https://api.dandiarchive.org/api/dandisets/000876/versions/draft/assets/fffc0e83-b5ec-43f3-9851-c8ce83f18376/
32404 1 https://api.dandiarchive.org/api/dandisets/000876/versions/draft/assets/ffff4674-212c-4e41-87ef-497e830fd0d0/
so it seems that we have unique asset ids between those two runs... odd. Needs more troubleshooting -- I am not running another delete, so you could try... Let me know if you think those log files would come in handy
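The duplicate check above can also be done as a set comparison, which reports directly how many asset URLs appear in only one run or in both. This is a rough sketch, assuming each request is logged on a line containing "DELETE " followed by the URL (as the grep pattern above suggests); deleted_urls and compare_runs are hypothetical names.

```python
import re

# Assumed: each delete request is logged as "... DELETE <url>".
DELETE_RE = re.compile(r"DELETE\s+(https?://\S+)")


def deleted_urls(lines):
    """Return the set of URLs that were the target of a DELETE request."""
    return {m.group(1) for line in lines if (m := DELETE_RE.search(line))}


def compare_runs(first_lines, second_lines):
    """Summarize how the DELETE targets of two runs overlap."""
    first = deleted_urls(first_lines)
    second = deleted_urls(second_lines)
    return {
        "only_first": len(first - second),
        "only_second": len(second - first),
        "both": len(first & second),
    }
```

A "both" count of 0 would match the observation that no asset id was deleted twice.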
@yarikoptic If you run the dandi delete command now but don't confirm the deletion, how many assets does the command say it'll delete? I'm getting 7089, which appears to be the correct amount.
yikes, it grew indeed:
❯ dandi delete dandi://dandi/000876/code/venvs/
Delete 7089 assets on server from Dandiset 000876? [y/N]:
@TheChymera is your upload process still running by some chance?
@jwodder might be worth checking dandi-api logs under /mnt/backup/dandi/heroku-logs/dandi-api to possibly gain more insight.
reminder: I will be traveling, please figure it out - I believe if @TheChymera adds you @jwodder as co-owner on the dandiset, you should be able to perform those actions too!
@yarikoptic yes, I started it again to see if I can reproduce the error, but it has finished now, and yes, I get the same error.
@yarikoptic I see logs of assets still being uploaded to 000876 on April 1 around 16:00 EDT, when you were running dandi delete, which seems to confirm that what happened was that @TheChymera was still uploading assets while you were deleting them.
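To illustrate why a concurrent upload would produce exactly this pattern (every delete succeeds, yet assets remain afterwards and none is deleted twice), here is a toy, deterministic simulation. It is only a sketch under the assumption that delete lists the assets once and then removes that snapshot; it is not how dandi-cli or the Archive is actually implemented, and all names are made up.

```python
def delete_with_concurrent_upload(existing, uploads_during_delete):
    """Delete a one-time snapshot of assets while new uploads keep arriving."""
    server = set(existing)
    snapshot = sorted(server)  # listing is taken once, before any deletion
    incoming = iter(uploads_during_delete)
    deleted = 0
    for asset in snapshot:
        server.discard(asset)  # each delete succeeds (think: a 204 response)
        deleted += 1
        new_asset = next(incoming, None)  # one upload lands between deletes
        if new_asset is not None:
            server.add(new_asset)
    server.update(incoming)  # uploads that arrive after the delete has finished
    return deleted, sorted(server)
```

With three existing assets and two uploads in flight, the summary would say 3 deleted while 2 brand-new assets are left for the next run to find.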
Uff... @TheChymera - can you then now remove those assets in one try?
I meant your code/venvs, not entire dandiset
ok, deleting right now. Seems to be working thus far.
and did it work out?
ok, it seems that now lists 0 assets. let's assume resolved.
Yep, it worked.
Needed to delete erroneously uploaded code/venvs. Got alarmed at first when the number of assets reported by delete was smaller than the number I get from dandi ls:
So we got 22138 offered to be removed out of 28262? Then I deleted a folder with 3 assets for a test:
and upon the subsequent full delete request I got a larger number of assets!
There were changes in the listing of assets on the dandi-archive side, but I would not expect delete to differ from ls, since afaik they likely use the same endpoint, and ls seems to reflect the number of deleted files correctly going down:
Interestingly, the logs for delete seem to have some "Caught exception" entries (without detail) whereas those for ls didn't:
It seems that pagination does not use sequential requests (or at least they do not necessarily arrive in the order sent), so I wonder if there is some race condition and some exception causes an early exit from the collation of paginated requests?
The differing number of assets it lists across invocations attests to that:
BTW that "Caught exception" is not always logged for delete -- another sign of some inconsistent execution, it seems to me.
I will not delete, and will exclude dataladification (does not work with .gitignore!) of this dandiset for now, so we have a working example of the problematic case to ease troubleshooting...
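To make the pagination hypothesis above concrete: if pages are fetched concurrently, a collation step is safe as long as it gathers results by page number and lets any per-page exception propagate instead of silently ending early. This is a minimal sketch of that idea, not the dandi-cli implementation; fetch_page is a hypothetical callable that returns the list of assets for a given page number.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_all_pages(fetch_page, page_count, max_workers=4):
    """Fetch pages 1..page_count concurrently and return one ordered list."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fetch_page, n) for n in range(1, page_count + 1)]
        # Reading futures in submission order keeps results ordered by page,
        # regardless of the order in which the responses actually arrived.
        # result() re-raises a page's exception rather than dropping the page.
        pages = [future.result() for future in futures]
    return [asset for page in pages for asset in page]
```

If the real collation instead stopped at the first failed page and returned what it had so far, the asset count would vary between invocations in the way described here.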