SORMAS-Foundation / SORMAS-Project

SORMAS (Surveillance, Outbreak Response Management and Analysis System) is an early warning and management system to fight the spread of infectious diseases.
https://sormas.org
GNU General Public License v3.0

Implement deletion concept for case information SORMAS #2100

Open markusmann-vg opened 4 years ago

markusmann-vg commented 4 years ago

Situation Description

By German law we are responsible for deleting data we store in SORMAS.

Feature Description

For cases

What needs to be deleted?

What will not be deleted?

When is deletion to take place? At the latest on 1 January of the 10th year following the last relevant processing of the data set in question by the competent public health authority.
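A minimal sketch of how that deadline could be computed. It assumes the rule means "1 January of the calendar year that lies 10 years after the year of the last relevant processing" (so processing in 2020 gives 1 January 2030); that reading is an assumption and would need legal confirmation. The class and method names are hypothetical and not taken from the SORMAS code base.

```java
import java.time.LocalDate;

/** Hypothetical helper for illustration; not part of the SORMAS code base. */
final class DeletionDeadline {

    /** Retention period in years, taken from the rule quoted above. */
    static final int RETENTION_YEARS = 10;

    private DeletionDeadline() {
    }

    /**
     * Returns 1 January of the 10th calendar year after the year of the
     * last relevant processing (assumed reading of the rule).
     */
    static LocalDate latestDeletionDate(LocalDate lastRelevantProcessing) {
        return LocalDate.of(lastRelevantProcessing.getYear() + RETENTION_YEARS, 1, 1);
    }

    /** True once the deadline has been reached on the given day. */
    static boolean isDue(LocalDate lastRelevantProcessing, LocalDate today) {
        return !today.isBefore(latestDeletionDate(lastRelevantProcessing));
    }
}
```

A scheduled job could call `DeletionDeadline.isDue(lastProcessed, LocalDate.now())` for each case and anonymize the ones for which it returns true.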

Possible Alternatives

Additional Information

Deletion does not mean physical deletion: the data will be anonymized, and "Deleted" will be shown to the users.

Case information deletion depends on the latest case, contact and event participant of the person.
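The dependency on the latest linked entity could be sketched as follows. This is only an illustration under assumptions: each case, contact and event participant of a person is reduced to its last relevant processing date, and the 10-year rule is read as "1 January of the year 10 years after the latest of those dates". All names are hypothetical.

```java
import java.time.LocalDate;
import java.util.Collection;
import java.util.Collections;

/** Hypothetical check for illustration; not part of the SORMAS code base. */
final class PersonRetentionCheck {

    static final int RETENTION_YEARS = 10;

    private PersonRetentionCheck() {
    }

    /**
     * @param lastProcessingDates last relevant processing date of every case,
     *                            contact and event participant of one person
     * @param today               the day on which the check runs
     * @return true only if even the most recent entity has passed the deadline
     */
    static boolean isPersonDueForDeletion(Collection<LocalDate> lastProcessingDates, LocalDate today) {
        if (lastProcessingDates.isEmpty()) {
            // Assumption: a person without linked entities is handled elsewhere.
            return false;
        }
        LocalDate latest = Collections.max(lastProcessingDates);
        LocalDate deadline = LocalDate.of(latest.getYear() + RETENTION_YEARS, 1, 1);
        return !today.isBefore(deadline);
    }
}
```

With this reading, a person with one case from 2010 and one from 2018 stays untouched until 1 January 2028, because the younger case determines the deadline.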

markusmann-vg commented 4 years ago

If one of the relevant fields that needs to be deleted (such as health facility) is mandatory, its value has to be replaced by a placeholder ("other health facility").

Manually crop the zip code (38103 => 381..).
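The zip code cropping could look roughly like this. The sketch assumes that the first three characters are kept and every remaining character is replaced by a dot, as in the example above; how to treat codes of three characters or fewer is not specified in this ticket, so returning them unchanged is an assumption. The class name is hypothetical.

```java
/** Hypothetical helper for illustration; not part of the SORMAS code base. */
final class PostalCodeCropper {

    /** Number of leading characters that stay readable (from the example 38103 => 381..). */
    static final int KEPT_CHARS = 3;

    private PostalCodeCropper() {
    }

    /** Keeps the first three characters and masks the rest with dots. */
    static String crop(String postalCode) {
        if (postalCode == null || postalCode.length() <= KEPT_CHARS) {
            // Assumption: short or missing codes are returned unchanged.
            return postalCode;
        }
        StringBuilder cropped = new StringBuilder(postalCode.substring(0, KEPT_CHARS));
        for (int i = KEPT_CHARS; i < postalCode.length(); i++) {
            cropped.append('.');
        }
        return cropped.toString();
    }
}
```

For example, `PostalCodeCropper.crop("38103")` returns `381..`.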

MartinWahnschaffe commented 4 years ago

@max-hzi @fhauptmann Let's say we have a person who has two cases. One of those cases is older than 10 years, the other is younger. I guess that would mean that we still keep the first name, etc. of the person and also the health facility of both cases. Does deletion only take place when the person and all its cases are older than 10 years?

Maybe a question that needs to be forwarded to Gérard and/or Gaby.

max-hzi commented 4 years ago

I'd like to forward some decisions made by Gaby and Gérard:

max-hzi commented 4 years ago

Once again I'd like to document some info from Gérard:

The health departments should be able to set a few essential configurations of the deletion periods for personal identifying data.

StefanKock commented 3 years ago

bernardsilenou commented 3 years ago

@StefanKock My suggestions:

  • What happens with a person when it is linked to several cases/contacts?

If the other cases, contacts, or event participants are not supposed to be deleted, we keep the person in the DB but delete the person UUID, ID, or any other identifier linking the person to the cases that meet the criteria for pseudonymization or deletion in the case table.

This would mean the DB needs to be able to save a case, contact, or event participant without a reference to a person?

  • What about really deleting the whole entities (as requested for France)?

We should not do this if the law does not insist on it. Keeping the pseudonymized data is beneficial for future research. We may need a server parameter to configure this.

ChristopherRiedel commented 3 years ago

@bernardsilenou

  • What about really deleting the whole entities (as requested for France)?

We should not do this if the law does not insist on it. Keeping the pseudonymized data is beneficial for future research. We may need a server parameter to configure this.

Unfortunately, exactly this is required for France (after one year at the latest -> see #3499)

bernardsilenou commented 3 years ago

Since the deletion concept for France requires that we delete the entity itself after a year, we may need a configuration for this. I was informed today that it is urgent that we keep to the plan and implement this ticket and #3499 in sprint 99.

@StefanKock @markusmann-vg Is there any further refinement that should be done from the HZI side? @ftavin @carolinverset
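One possible shape for such a configuration, purely as a sketch: a server-side setting that selects between pseudonymization and real deletion and carries the retention period. The property keys `deletion.mode` and `deletion.retentionYears`, the class name, and the defaults (pseudonymize after 10 years) are assumptions made for illustration; none of them are existing SORMAS properties.

```java
import java.util.Locale;
import java.util.Properties;

/** Hypothetical configuration holder; keys and defaults are assumptions. */
final class DeletionConfig {

    /** Whether expired data is anonymized in place or removed entirely. */
    enum Mode { PSEUDONYMIZE, DELETE }

    final Mode mode;
    final int retentionYears;

    DeletionConfig(Mode mode, int retentionYears) {
        if (retentionYears < 1) {
            throw new IllegalArgumentException("retentionYears must be at least 1");
        }
        this.mode = mode;
        this.retentionYears = retentionYears;
    }

    /** Reads the two hypothetical keys, falling back to pseudonymization after 10 years. */
    static DeletionConfig fromProperties(Properties props) {
        String rawMode = props.getProperty("deletion.mode", "PSEUDONYMIZE");
        String rawYears = props.getProperty("deletion.retentionYears", "10");
        Mode mode = Mode.valueOf(rawMode.trim().toUpperCase(Locale.ROOT));
        return new DeletionConfig(mode, Integer.parseInt(rawYears.trim()));
    }
}
```

Under these assumptions, a deployment with the French requirement would set `deletion.mode=DELETE` and `deletion.retentionYears=1`, while a deployment that sets nothing would pseudonymize after 10 years.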

MartinWahnschaffe commented 3 years ago

Part of the information is still present in the audit logs and history tables. We need to decide how to deal with that as well.

The general deletion period for logs seems to be 2 months. I'm not sure, though, whether we can really delete the information that a case was edited more than 2 months ago, or just the details of what was edited.

We also need to define when to vacuum parts of the database to really get rid of deleted data.