SORMAS-Foundation / SORMAS-Project

SORMAS (Surveillance, Outbreak Response Management and Analysis System) is an early warning and management system to fight the spread of infectious diseases.
https://sormas.org
GNU General Public License v3.0

Automatic deletion of personal data #7736

Open StefanKock opened 2 years ago

StefanKock commented 2 years ago

Situation Description & Motivation

Data collected for cases, contacts and the other core entities has to be deleted automatically after a given time period.

Use cases

High-Level Explanation

We need to define and implement the automatic deletion of person-related data within SORMAS.

Timeline

Core functionality should be done by the end of Q1 2022.

Tasks

Concept Phase

Automatic deletion of entities

Note: There has been a change to the way this works. The first version of automatic deletion only soft-deleted data and then relied on a separate mechanism that permanently deletes soft-deleted data. Instead, automatic deletion should now permanently delete data directly, so that manually deleted data can be handled separately.
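A minimal sketch of the changed behaviour, not the actual SORMAS implementation: the automatic job selects entities whose retention period has expired for permanent deletion and skips soft-deleted ones, so that manually deleted data stays with its own process. The class `AutomaticDeletionSketch`, the `Entity` type, its fields and the `retentionDays` parameter are all hypothetical names chosen for illustration.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class AutomaticDeletionSketch {

    /** Hypothetical stand-in for a core entity such as a case or contact. */
    static final class Entity {
        final String uuid;
        final LocalDate referenceDate; // assumption: the date the retention period counts from
        final boolean softDeleted;     // true when a user deleted the entity manually

        Entity(String uuid, LocalDate referenceDate, boolean softDeleted) {
            this.uuid = uuid;
            this.referenceDate = referenceDate;
            this.softDeleted = softDeleted;
        }
    }

    /**
     * Returns the UUIDs the automatic job would delete permanently.
     * Soft-deleted entities are skipped so a separate process can handle them.
     */
    static List<String> dueForPermanentDeletion(List<Entity> entities, LocalDate today, int retentionDays) {
        LocalDate cutoff = today.minusDays(retentionDays);
        List<String> due = new ArrayList<>();
        for (Entity e : entities) {
            if (!e.softDeleted && e.referenceDate.isBefore(cutoff)) {
                due.add(e.uuid);
            }
        }
        return due;
    }
}
```

Returning the list of UUIDs instead of deleting in place keeps the selection rule testable on its own; the actual permanent deletion would be a separate step.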

Manual deletion of entities

Permanent deletion of entities

Automatic archiving

Out of scope

Field-wise deletion

Needed Refinements

Risks

Additional Information

marko-arn commented 2 years ago

I would like to point out that the handling should differ depending on the case classification. A "No Case" might be deleted earlier than a confirmed case.

For the travel entries, please also include the contacts that are only a travel entry and are neither a contact to a case nor converted into a case.

In any event, please also keep contacts that result in a case.
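The classification-dependent handling suggested in this comment could be sketched as a lookup from classification to retention period. This is only an illustration: the `Classification` values are a hypothetical subset, and the day counts are placeholders, not agreed periods.

```java
import java.time.LocalDate;
import java.util.EnumMap;
import java.util.Map;

final class ClassificationRetentionSketch {

    /** Hypothetical subset of case classifications; names are illustrative. */
    enum Classification { NO_CASE, SUSPECT, CONFIRMED }

    // Placeholder periods in days; real values would have to be agreed
    // with the data protection team.
    private static final Map<Classification, Integer> RETENTION_DAYS =
            new EnumMap<>(Classification.class);

    static {
        RETENTION_DAYS.put(Classification.NO_CASE, 30);
        RETENTION_DAYS.put(Classification.SUSPECT, 180);
        RETENTION_DAYS.put(Classification.CONFIRMED, 365);
    }

    /** Date from which a case with the given classification may be deleted. */
    static LocalDate deletionDate(Classification classification, LocalDate referenceDate) {
        return referenceDate.plusDays(RETENTION_DAYS.get(classification));
    }
}
```

With these placeholder values, a "No Case" becomes eligible for deletion well before a confirmed case with the same reference date.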

MartinWahnschaffe commented 2 years ago

@Marko-ilmkreis Thanks for the input on "no case".

Could you give some more details on the travel entries + contacts point, please? Are you saying that you create a contact for a person who has a travel entry, so the contact is created without an associated case and has the "returningTraveler" info set to true? The contact would then fall under the 14-day deletion period?

Would be good to have some details on this, so we can approach the data protection team.

marko-arn commented 2 years ago

@MartinWahnschaffe Yes. Since importing the DEA travel entries is not really possible before 1.67, we use contacts and set returningTraveler to true. These contacts must therefore be deleted after 14 days unless they are also a contact to a case, are also an ep, or have been converted into a case. From 1.67 on, we will hopefully use the normal travel entries.
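The rule described in this comment can be written as a single predicate. The following is a sketch under assumptions: the `Contact` type and its field names are hypothetical, "ep" is read as event participant, and the 14 days are counted from the last change date, which the thread does not specify.

```java
import java.time.LocalDate;

final class TravelContactDeletionSketch {

    static final int RETENTION_DAYS = 14; // period mentioned in this thread

    /** Hypothetical, flattened view of a contact; field names are illustrative. */
    static final class Contact {
        final boolean returningTraveler;
        final boolean hasSourceCase;      // the contact is linked to a case
        final boolean isEventParticipant; // assumption: "ep" means event participant
        final boolean resultedInCase;     // the contact was converted into a case
        final LocalDate lastChanged;      // assumption: the 14 days count from this date

        Contact(boolean returningTraveler, boolean hasSourceCase, boolean isEventParticipant,
                boolean resultedInCase, LocalDate lastChanged) {
            this.returningTraveler = returningTraveler;
            this.hasSourceCase = hasSourceCase;
            this.isEventParticipant = isEventParticipant;
            this.resultedInCase = resultedInCase;
            this.lastChanged = lastChanged;
        }
    }

    /** True for travel-only contacts that are at least RETENTION_DAYS old. */
    static boolean dueForDeletion(Contact c, LocalDate today) {
        boolean travelOnly = c.returningTraveler
                && !c.hasSourceCase
                && !c.isEventParticipant
                && !c.resultedInCase;
        return travelOnly && !c.lastChanged.isAfter(today.minusDays(RETENTION_DAYS));
    }
}
```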

MartinWahnschaffe commented 2 years ago

Ok. I guess we will need to provide a means to manually delete those contacts in bulk, based on filters in the contact directory. The logic behind it is a bit too specific to this workaround to extend the automatic deletion for it.

marko-arn commented 2 years ago

The filter for travel returns is still there, so what is missing is a filter for "not resulting in a case" and "not an ep", plus a filter by the date the entity was last edited.

Maybe (off-topic) it would be easier to have a way to convert all "travel-entry-contacts" into travel entries and to remove the travel return button from contacts if the travel entries directory is enabled.

MartinWahnschaffe commented 2 years ago

I unfortunately didn't define a clear path on how to deal with campaigns. They are a core entity and can be manually deleted and as such they now also support automatic deletion, although this was not specified. In addition the permanent deletion of campaigns is not fully implemented, so it only works for campaigns that don't have campaign data yet.

There are now two ways forward:

  1. Explicitly exclude campaigns from automatic and permanent deletion.
  2. Extend the permanent deletion of campaigns to also include campaign data and extend the automatic deletion date checks to also include the dates of campaign data.

Solution 2 would be cleaner and more powerful, but comes with the drawback that deleting a whole campaign including all data may be a very big thing (comparable to deleting the aggregated reports of a whole country) that should not happen automatically.

Note: Campaigns are not used in Germany and thus aren't relevant for the related data security endeavor.

@Candice-Louw @Jan-Boehme Thoughts on this?

Candice-Louw commented 1 year ago

@SORMAS-JanBoehme - it may make sense to go with option 1: explicitly exclude campaigns from automatic and permanent deletion.

Reasoning: Campaigns only capture aggregate data, no personal/sensitive data (@MartinWahnschaffe - please correct this if I'm mistaken?) and, as mentioned, are not used in Germany. It therefore makes sense to exclude campaigns from the automatic and permanent deletion consideration on these grounds.

MartinWahnschaffe commented 1 year ago

@Candice-Louw Yes and no. The intention of campaigns is to capture aggregate data. Campaign forms have text fields, though, so we can't say for sure whether they are misused for collecting personal/sensitive data.

SORMAS-JanBoehme commented 1 year ago

@Candice-Louw @MartinWahnschaffe I also think that automatically deleting the whole campaign and relevant data for analytics and decision making just to be sure that no one has entered personal / sensitive data in any text fields is a bit much.

The middle ground, in my opinion, would be to exclude campaigns that have data associated with them from automatic deletion and to add the standard mouse-over info to text fields, informing users about the GDPR agreement and that they must not enter any sensitive data in these fields, as we already do in other forms.
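As a rough sketch of the first half of this proposal, the automatic deletion could filter out any campaign that already has data attached. The `Campaign` type, its `dataEntryCount` field and the method name are hypothetical and only illustrate the check.

```java
import java.util.ArrayList;
import java.util.List;

final class CampaignDeletionSketch {

    /** Hypothetical view of a campaign; only the fields the check needs. */
    static final class Campaign {
        final String name;
        final int dataEntryCount; // number of campaign data entries attached

        Campaign(String name, int dataEntryCount) {
            this.name = name;
            this.dataEntryCount = dataEntryCount;
        }
    }

    /** Keeps only campaigns without data; campaigns with data are never auto-deleted. */
    static List<String> eligibleForAutomaticDeletion(List<Campaign> campaigns) {
        List<String> eligible = new ArrayList<>();
        for (Campaign c : campaigns) {
            if (c.dataEntryCount == 0) {
                eligible.add(c.name);
            }
        }
        return eligible;
    }
}
```

Any date-based retention check would still apply on top of this filter; the sketch only covers the "has data" exclusion.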