gdcc / pyDataverse

Python module for Dataverse Software (dataverse.org).
http://pydataverse.readthedocs.io/
MIT License

Add ability to delete a published dataset ("destroy") #21

Closed pdurbin closed 5 years ago

pdurbin commented 5 years ago

http://guides.dataverse.org/en/4.15/api/native-api.html#delete-published-dataset describes a "destroy" API that allows superusers to delete datasets even after they are published. I can think of a couple use cases for this.

Here are the curl command examples from the API Guide link above:

Destroy by Persistent ID (PID):

curl -H "X-Dataverse-key:$API_TOKEN" -X DELETE http://$SERVER/api/datasets/:persistentId/destroy/?persistentId=doi:10.5072/FK2/AAA000

Destroy by dataset ID:

curl -H "X-Dataverse-key:$API_TOKEN" -X DELETE http://$SERVER/api/datasets/999/destroy
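For reference, the two curl calls above could be sketched in plain Python roughly like this. It is only an illustration using the standard library, not pyDataverse code: the helper name `build_destroy_request` and its parameters are made up for this example, and it only builds the request rather than sending it.

```python
from urllib.parse import quote
import urllib.request


def build_destroy_request(base_url, api_token, identifier, is_pid=True):
    """Build (but do not send) a DELETE request for the "destroy" endpoint.

    Hypothetical helper, not part of pyDataverse. The URL layout and the
    X-Dataverse-key header are taken from the curl examples above.
    """
    base_url = base_url.rstrip("/")
    if is_pid:
        # Destroy by persistent ID, e.g. doi:10.5072/FK2/AAA000
        pid = quote(str(identifier), safe=":/")
        url = f"{base_url}/api/datasets/:persistentId/destroy/?persistentId={pid}"
    else:
        # Destroy by numeric dataset ID, e.g. 999
        url = f"{base_url}/api/datasets/{identifier}/destroy"
    return urllib.request.Request(
        url, method="DELETE", headers={"X-Dataverse-key": api_token}
    )


# Build the request only; nothing is sent to a server here.
req = build_destroy_request(
    "http://localhost:8080", "my-superuser-token", "doi:10.5072/FK2/AAA000"
)
print(req.get_method(), req.full_url)
```

Actually sending it would mean passing the request to `urllib.request.urlopen(req)` with a superuser token, which irreversibly removes the dataset, so it is left out of the sketch on purpose.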

I'm happy to make a pull request if you'd like. Please let me know.

skasberger commented 5 years ago

This would be an easy first PR to pyDataverse, what do you think @pdurbin ?

Just take the upload_file() function in api.py, copy it and change the query_str and related stuff. :)

pdurbin commented 5 years ago

@skasberger sure, I'll try to make a pull request. Please go ahead and assign this issue to me if you like.

Instead of "upload file" I planned on using "delete dataset" as a starting point:

https://github.com/AUSSDA/pyDataverse/blob/v0.2.1/src/pyDataverse/api.py#L744

pdurbin commented 5 years ago

@skasberger huh. I'm surprised by the Dataverse API behavior. I tried using the pyDataverse "delete dataset" method with a superuser API token on a published dataset and to my surprise the dataset was destroyed!

Here's the code in question: https://github.com/IQSS/dataverse/blob/v4.15/src/main/java/edu/harvard/iq/dataverse/api/Datasets.java#L296

I guess this means that I don't really need a "destroy dataset" method in pyDataverse. I can just use the regular "delete dataset" method as a superuser.

@sekmiller seems to agree with me that the behavior I'm seeing is probably a bug in the Dataverse API. When I use "delete dataset" as a superuser on a published dataset, we think the API should return an error like "Are you sure you want to destroy a published dataset? If so, use the dedicated 'destroy' API endpoint for this!"
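To make the behavior we have in mind concrete, here is a rough Python sketch of the decision. It is not the Dataverse implementation (that is the Java code linked above). The names `Outcome` and `delete_dataset_outcome` are invented for illustration, and the non-superuser branch is an assumption on my part.

```python
from enum import Enum


class Outcome(Enum):
    DELETED = "dataset deleted"
    REFUSED = "refused: use the dedicated 'destroy' API endpoint"
    FORBIDDEN = "forbidden"


def delete_dataset_outcome(is_published: bool, is_superuser: bool) -> Outcome:
    """Proposed result of calling "delete dataset" (hypothetical sketch).

    Assumes the caller already has permission to delete unpublished datasets.
    """
    if not is_published:
        # Unpublished datasets can be deleted as usual.
        return Outcome.DELETED
    if is_superuser:
        # Proposed change: do not silently destroy a published dataset;
        # point the superuser at the explicit "destroy" endpoint instead.
        return Outcome.REFUSED
    # Assumption: non-superusers cannot remove published datasets at all.
    return Outcome.FORBIDDEN


print(delete_dataset_outcome(is_published=True, is_superuser=True).value)
```

The point is just that "delete" would never remove a published dataset, so destroying one stays an explicit, separate call.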

The bottom line is that I don't plan to make a pull request for this issue after all. Not anytime soon anyway. Again, I don't need "destroy dataset", surprisingly. 😄

skasberger commented 5 years ago

So, I can close the issue, right? Will there be any updates on the API side? Who will fix this in the near future?

pdurbin commented 5 years ago

I'll just close this issue. Again, I don't need any changes in pyDataverse. Most people will probably just try "delete dataset" rather than hunting around for "destroy dataset" like I did.

I'm reluctant to change any APIs in Dataverse itself. Maybe this is because I watched "Spec-ulation" by Rich Hickey: https://www.youtube.com/watch?v=oyLBGkS5ICk . Transcript: https://github.com/matthiasn/talk-transcripts/blob/master/Hickey_Rich/Spec_ulation.md

"Now what happens if you just put foo2 next to foo? You can still tell people. They can say, "I am in Bermuda this week, but next week I will try foo2. That sounds awesome." But right now, my web service is going to keep working, because it calls foo, and you did not take it away from me while I was on vacation."