dandi / dandi-archive

DANDI API server and Web app
https://dandiarchive.org

Add a mechanism to "delete" published dandisets #1123

Open mvandenburgh opened 2 years ago

mvandenburgh commented 2 years ago

This issue will require some heavy planning, including guidelines for policies/procedures around deleting published data (i.e., how dandiset owners should flag this situation for admin attention, whether they should be able to hide (and for how long) the dandiset from the public on their own authority, questions about scrubbing data permanently from the S3 bucket, etc.), and UI features to enable any/all of this.

mvandenburgh commented 2 years ago

@satra when deleting a published dandiset, do we have to actually delete the data from our database/S3, or is simply hiding it from public view sufficient? If just hiding it is good enough, an immediate idea I have is to

yarikoptic commented 2 years ago

As the data is also accessible via datalad, or directly from S3 using the assets dumps we have

example

```shell
$> s3cmd -c ~/.s3cfg-dandi-backup ls s3://dandiarchive/dandisets/000027/0.210831.2033/
2021-08-31 20:34         2142  s3://dandiarchive/dandisets/000027/0.210831.2033/assets.jsonld
2021-08-31 20:34         2210  s3://dandiarchive/dandisets/000027/0.210831.2033/assets.yaml
2021-08-31 20:34          220  s3://dandiarchive/dandisets/000027/0.210831.2033/collection.jsonld
2021-08-31 20:34         2590  s3://dandiarchive/dandisets/000027/0.210831.2033/dandiset.jsonld
2021-08-31 20:34         2540  s3://dandiarchive/dandisets/000027/0.210831.2033/dandiset.yaml
```

and typically the requests come for data which must not be made public. We are doomed to really delete that data. That should be done with awareness that

bendichter commented 2 years ago

Yes, as a free data storage platform, we should also prepare for the case where a user uploads illicit material and tries to use DANDI to distribute it.