dandi / dandi-archive

DANDI API server and Web app
https://dandiarchive.org
13 stars 12 forks source link

Unembargo process results in error due to incomplete uploads #2021

Open kabilar opened 2 months ago

kabilar commented 2 months ago

As reported by a user on August 27 via email to help@dandiarchive.org, when umembargoing a Dandiset an error occurs which turns out to be a result of incomplete uploads. The error message displayed to the user upon pressing UNEMBARGO is:

Sorry, something went wrong on our side (the developers have been notified). Please try that operation again later.

@jjnesbitt found that the API was returning a 400 error when attempting to unembargo. To resolve, Jake manually removed the uploads after confirmation from the user that the uploads were not necessary. We should catch this case of incomplete/interrupted uploads, provide this feedback directly to the user, and provide instructions on how to resolve.

Thank you.

alessandratrapani commented 1 week ago

Hey there! I have 2 embargoed dandisets (001172 and 001170) where the GUI Unembargo button is inactive, but I don't have any active uploads. I don't know exactly where to check for incomplete/interrupted uploads. Can you please point me in the right direction?

jjnesbitt commented 1 week ago

Hi Alessandra, we don't currently have a way for end users to easily check/clear incomplete uploads on a dandiset (although we are trying to improve this). I've checked and can confirm I see the following in progress / incomplete uploads:

  1. Dandiset 001172: 8 incomplete uploads, between ~35gb and ~50gb in size.
  2. Dandiset 001170: 20 incomplete uploads, between ~20gb and ~60gb in size.

If you'd like, I could clear these uploads, in order for you to proceed. If there is some missing data, you could either hold off on un-embargo until you've finished uploading the data, or un-embargo right away, and finish uploading afterwards. There would be no functional difference.

Please let me know what you'd like to do.

alessandratrapani commented 1 week ago

Yes, please proceed with clearing those uploads. I will discuss the next steps with the authors. Do you have any suggestions on the uploading procedure to avoid incomplete uploads?

jjnesbitt commented 6 days ago

Yes, please proceed with clearing those uploads. I will discuss the next steps with the authors.

Okay, this has been done. I can confirm that both of those dandisets are now available for un-embargo.

Do you have any suggestions on the uploading procedure to avoid incomplete uploads?

At the moment, not particularly. My guess is that its due to the process running the CLI being killed during upload, which leaves those uploads orphaned. So my only advice would be to gracefully shutdown the CLI if an upload is in progress (if that's the case here). @yarikoptic any advice on this front?

bendichter commented 6 days ago

What do you mean by gracefully shutting down the CLI?

jjnesbitt commented 6 days ago

What do you mean by gracefully shutting down the CLI?

This is where my lack of knowledge about the CLI comes into play, but I'm assuming that when a user cancels a running upload, the CLI attempts to handle that in a way that doesn't leave orphaned uploads. So if that process was hard killed, it wouldn't have a chance to do so. However, afaik we don't have endpoints to delete uploads in any way, so the only way the CLI might do that is finish any uploads in progress.

bendichter commented 6 days ago

yes, though users will tend to run into this problem if they have slow or unreliable upload speeds and may not always be able to let an upload complete.

@jwodder, @yarikoptic do you think a reasonable solution to this might be to add a CLI function that clears any existing unfinished uploads?

jjnesbitt commented 5 days ago

That function would still need API support, which I think is something we should provide anyway for the web client.

bendichter commented 5 days ago

Sounds good to me

yarikoptic commented 5 days ago

... the only way the CLI might do that is finish any uploads in progress.

neat idea, filed

But overall it would not be a solution: we do need to improve API for the life cycle of uploads - both for regular assets or zarrs. Related existing issues for zarrs

if here an issue is about regular assets (I assume, didn't tripple check) -- IMHO we should

With that any UI/client can implement querying/canceling of uploads, and then "Unembargo" button could check/query for confirmation/clean things up to proceed to /unembargo/ functionality. A complimentary alternative is to make /unembargo/ to add a bool flag cleanup_uploads which would, if True, first automatically clean up all such incomplete uploads, while initial request error would provide information about such uploads. But I like that approach less.

jjnesbitt commented 5 days ago

This problem will largely be addressed by the upcoming garbage collection, but I agree we should add that functionality to the API. Namely, listing and cancelling of existing uploads.