sbgrid / data-capture-module

When rsync upload begins, send a message to Dataverse. #21

Closed pdurbin closed 7 years ago

pdurbin commented 7 years ago

Last week when discussing https://trello.com/c/Nbte37k1/9-rsync-file-upload-%26-download-(4.8) with @mheppler @TaniaSchlatter @dlmurphy we agreed that we'd all like to see the Data Capture Module (DCM) POST some JSON to Dataverse when upload has begun. That is to say, the DCM will recognize that the user has started executing the rsync script and inform Dataverse of this fact. When Dataverse receives the "upload has begun" message (or "uploadHasBegun"?), Dataverse will take some actions, possibly sending a notification to the user, preventing the dataset from being deleted, etc. It would be awfully nice if the DCM could send the number of bytes Dataverse should expect, but this is not a hard requirement.

I believe the issue on the Dataverse side is https://github.com/IQSS/dataverse/issues/3348
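To make the proposal above concrete, here is a minimal sketch of what the DCM side could look like. Everything specific in it is an assumption for illustration: the field names (`status`, `datasetIdentifier`, `expectedBytes`), the helper names, and whatever URL Dataverse would accept the POST on are all hypothetical. Only the "uploadHasBegun" label comes from this thread, and even that was floated with a question mark.

```python
import json
import urllib.request
from typing import Optional


def build_upload_started_message(dataset_identifier: str,
                                 expected_bytes: Optional[int] = None) -> dict:
    """Build a hypothetical "upload has begun" payload for Dataverse."""
    message = {
        # "uploadHasBegun" is the name floated in this issue; not final.
        "status": "uploadHasBegun",
        # Assumed field name; the real payload may identify the dataset differently.
        "datasetIdentifier": dataset_identifier,
    }
    # The byte count is a nice-to-have, so include it only when it is known.
    if expected_bytes is not None:
        message["expectedBytes"] = expected_bytes
    return message


def post_message(url: str, message: dict, timeout: float = 10.0) -> int:
    """POST the message as JSON to `url` and return the HTTP status code.

    `url` is a placeholder: no Dataverse endpoint for this exists in the thread.
    """
    body = json.dumps(message).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Leaving `expectedBytes` out when the size is unknown keeps the byte count optional, which matches "not a hard requirement" above.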

pameyer commented 7 years ago

Non-trivial change; two possible initial approaches:

The first approach intersects with the security model (the transfer user's shell is not a login shell, but "transfer initialization" is equivalent to receiving a successful login for transfer). The second approach might require more infrastructure, but probably won't have these impacts.

pdurbin commented 7 years ago

Bummer that it's a non-trivial change. I'm not particularly interested in the rsync script talking to Dataverse. It's the rsync script's job to talk to the DCM and I think we should keep it that way. No rush on any of this. It would have been a nice to have for #3942 but we can live without it, I'd say.

pameyer commented 7 years ago

A few notes on initial investigations:

pameyer commented 7 years ago

Obsolete from discussions of https://github.com/IQSS/dataverse/issues/3348 today.

pdurbin commented 7 years ago

I still think the cron job would be better than nuthin'. 😄