datadryad / dryad-product-roadmap

Repository of issues for Dryad project boards
https://github.com/orgs/datadryad/projects

Dataset turns "published" unrealistically quickly for large submission #394

Closed: sfisher closed this issue 1 year ago

sfisher commented 5 years ago

I'm not sure whether we'll see this problem in Dryad, but it happens in the latest version of Dash now, and the code is probably similar, so large file uploads in Dryad may be affected too.

How to reproduce:

  1. Make a large file, roughly 9 GB: head -c $((1024*1000*9000)) < /dev/urandom > LargeTestFile.img

  2. Upload as a standard submission (not a manifest). This takes a long time to upload.

  3. Submit the dataset.

In Dash the status changes to 'published' within a few minutes, and at that point the UI unlocks for uploading another version. It shouldn't unlock until the submission has gone all the way through Merritt, because starting a new version while the ingest is still running will likely cause Merritt errors.
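To make the expected behavior concrete, here is a minimal sketch of the gating I have in mind. It is not the Dash code: the class, state names, and methods are all hypothetical, and it assumes the repository sends some kind of confirmation once the ingest has really finished.

```python
from dataclasses import dataclass
from enum import Enum


class IngestState(Enum):
    # Hypothetical states; the real Dash/Merritt status names may differ.
    UPLOADED = "uploaded"    # files are in the staging area, not yet submitted
    SUBMITTED = "submitted"  # handed off to the repository, ingest still running
    INGESTED = "ingested"    # repository has confirmed the ingest


@dataclass
class Dataset:
    # Hypothetical model, for illustration only.
    identifier: str
    state: IngestState = IngestState.UPLOADED

    def submit(self) -> None:
        """Hand the dataset off to the repository; ingest is now pending."""
        self.state = IngestState.SUBMITTED

    def confirm_ingest(self) -> None:
        """Record the repository's confirmation that the ingest completed."""
        if self.state is not IngestState.SUBMITTED:
            raise ValueError("ingest confirmed for a dataset that was not submitted")
        self.state = IngestState.INGESTED

    @property
    def display_status(self) -> str:
        # Only report 'published' after the repository has confirmed the ingest.
        return "published" if self.state is IngestState.INGESTED else "processing"

    @property
    def can_start_new_version(self) -> bool:
        # The UI stays locked until the ingest is confirmed.
        return self.state is IngestState.INGESTED
```

With this shape, submitting a dataset leaves it in 'processing' with the UI locked, and only the confirmation step flips it to 'published' and allows a new version.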

I got an email that my item was ingested into Merritt a couple of hours later even though it was supposedly already done!

WTF is going on here? Or maybe the OAI-PMH feed is exposing the dataset before it really has gone through Merritt?

ryscher commented 1 year ago

Fixed long ago.