NBISweden / aida-data-hub

AIDA Data Hub Scrum team board

Support use case of removing data (e.g. because of withdrawn usage permission) #130

Open pontus opened 1 year ago

pontus commented 1 year ago

A dataset can be updated so that it no longer contains some (or any) of its data. We need to support this use case.

In dataset updates, the dataset will change in the archive, and the old version will no longer be accessible through the archive.

This could (but does not have to) mean having a process that regularly goes through the current versions of all datasets and removes any stored data that is no longer referenced.

Any such data removed must also be removed from backups or extra storage spaces.
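
As a rough illustration of the cleanup process described above, here is a minimal sketch in Python. Everything in it is hypothetical: the `Store` class, the `sweep` function, and the idea that data is tracked by object id are assumptions made for the example, not a description of how the archive actually works. It collects every object referenced by the current version of any dataset, then removes everything else from each storage location, backups and extra copies included.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Store:
    """Hypothetical storage location (primary archive, a backup, an extra copy)."""
    name: str
    objects: set[str] = field(default_factory=set)

    def delete(self, object_id: str) -> None:
        # A real store would also need to make the deletion irreversible.
        self.objects.discard(object_id)


def referenced_objects(current_versions: dict[str, set[str]]) -> set[str]:
    """Union of all object ids referenced by the current version of each dataset."""
    referenced: set[str] = set()
    for object_ids in current_versions.values():
        referenced |= object_ids
    return referenced


def sweep(current_versions: dict[str, set[str]], stores: list[Store]) -> dict[str, set[str]]:
    """Remove unreferenced objects from every store; report what was removed where."""
    keep = referenced_objects(current_versions)
    removed: dict[str, set[str]] = {}
    for store in stores:
        orphans = store.objects - keep  # new set, safe to iterate while deleting
        for object_id in orphans:
            store.delete(object_id)
        removed[store.name] = orphans
    return removed
```

Passing the backups in the same `stores` list as the primary archive is one way to make sure removed data does not linger anywhere; the returned report could serve as an audit trail of what was deleted.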

There are probably (at least) two cases that will need different handling:

  1. The contributor uploads a new version of a dataset with new scientific content. Preferably, this should be handled (or communicated to?) wp4 services in such a way that running service instances do not break.
  2. The contributor uploads a new version for GDPR reasons. These changes must take effect immediately, and anybody who accessed the old version must be notified of their legal requirement to fix the legal status of the data in their control.
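
The two cases above could be told apart with something like the following sketch. It is only an illustration: `UpdateReason`, `Action`, `plan_update`, and the notion of separate lists for service contacts and past accessors are all made-up names and assumptions, and whether services should also be notified in the GDPR case is a guess, not a decision.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class UpdateReason(Enum):
    SCIENTIFIC = "scientific"  # case 1: new scientific content
    GDPR = "gdpr"              # case 2: data must be withdrawn


@dataclass
class Action:
    remove_old_version_immediately: bool
    notify: list[str]
    message: str


def plan_update(reason: UpdateReason,
                service_contacts: list[str],
                past_accessors: list[str]) -> Action:
    """Decide how a new dataset version should be rolled out (hypothetical policy)."""
    if reason is UpdateReason.GDPR:
        # Takes effect immediately; everyone who accessed the old version is told.
        recipients = sorted(set(service_contacts) | set(past_accessors))
        return Action(True, recipients,
                      "Data withdrawn: you must fix the legal status of any copies you hold.")
    # Scientific update: warn services so running instances do not break.
    return Action(False, sorted(set(service_contacts)),
                  "A new dataset version is available; the old one will be retired.")
```

For example, `plan_update(UpdateReason.GDPR, ["svc@example.org"], ["alice@example.org"])` yields an action that removes the old version immediately and notifies both addresses.
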

yohell commented 1 year ago

I think this issue may be better suited for giantsloth. Fundamentally, it's up to wp3 to make deletion decisions, but downstream of that, the deletion should be handled by the archive and propagated to user-facing services and users. The MVP could be that we tell wp4: "Sometimes the files in the archive will change or get deleted (possibly for GDPR reasons), and your services must handle this. How do you want events like this to be handled? Should we tell affected users of your service that they must handle the change themselves immediately or have their instance shut off?"