IQSS / dataverse.harvard.edu

Custom code for dataverse.harvard.edu and an issue tracker for the IQSS Dataverse team's operational work, for better tracking on https://github.com/orgs/IQSS/projects/34

Ongoing DevOps/Prod. support tasks. Oct. 10 - 23 (?) #310

Closed landreev closed 3 weeks ago

landreev commented 4 weeks ago

There was a time-consuming support issue (RT #370734), with users who had published a file with some restricted data by mistake; they were not satisfied with the fact that the file stayed on our servers even once deaccessioned. Documenting the process of purging the file from storage in the "tips and tricks" document. As a side note, maybe we should consider adding a straightforward "destroy" call for a published file, since the scenario above may be fairly common and the need for it is common sense. It may be something superuser-only, but still readily available.

landreev commented 4 weeks ago

The other somewhat time-consuming prod. task this week was diagnosing the minor DataCite bug on Tue. But that was handled under the "Upgrade to 6.4" issue, for accounting purposes.

cmbz commented 4 weeks ago

@landreev I very much support a destroy dataset and/or destroy file API endpoint for super-users. There are other use cases for the functionality, too, such as time-limited licensed data.

landreev commented 4 weeks ago

@cmbz We have a way to destroy an entire dataset - but it's kind of a brute-force, if not nuclear, option. In this particular case, the researchers would have to sacrifice a dataset that's been around for a while (and, possibly, cited or referenced elsewhere by its DOI), simply because one of them uploaded and published a wrong file in a later version by mistake.

There is no easy way to destroy a deaccessioned version; and there is no easy way to destroy a published file - and yes, it really looks like it would be useful to have!
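
For reference, the existing whole-dataset destroy call mentioned above can be sketched roughly as follows. This is a minimal sketch, not a tested procedure: the endpoint path and the `X-Dataverse-key` header are as I recall them from the Dataverse Native API guide and should be checked against the installed version, and the server URL, DOI, and token are hypothetical placeholders. The helper `build_destroy_request` is an illustrative name; it only builds the request and does not send it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_destroy_request(server_url: str, persistent_id: str, api_token: str) -> Request:
    """Build (but do not send) a superuser DELETE request to destroy a dataset.

    Assumption: the endpoint is /api/datasets/:persistentId/destroy/ with the
    DOI passed as the persistentId query parameter; verify before use.
    """
    query = urlencode({"persistentId": persistent_id})
    url = f"{server_url.rstrip('/')}/api/datasets/:persistentId/destroy/?{query}"
    # The API token must belong to a superuser account.
    return Request(url, method="DELETE", headers={"X-Dataverse-key": api_token})


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    req = build_destroy_request(
        "https://demo.dataverse.org",
        "doi:10.5072/FK2/EXAMPLE",
        "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
    )
    print(req.get_method(), req.full_url)
    # Sending it would be: urllib.request.urlopen(req). This is irreversible.
```

Keeping the request construction separate from sending makes it easy to review the exact URL before running an irreversible operation. Note that this removes the whole dataset, which is exactly the drawback described above; a per-file equivalent does not exist yet.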

qqmyers commented 4 weeks ago

FWIW: The retention period functionality from DANS/Paul Boon is intended for this purpose. It explicitly did not deal with physical file deletion though, since there was concern about automating that/always physically deleting at the end of a retention period. So - mostly agreeing that a delete published file endpoint would be useful.

landreev commented 3 weeks ago

We also spent some time with Rei (LTS) walking us through the anti-bot rules on the Harvard ALB. In the process we got access to the APIs unblocked for a few large and medium-size European countries (access to the site was for all practical purposes limited to browser-only for a few country-wide regions).

Closing, will open a new one.