bixiou / oecd_climate


Anonymised data / Data Protection Rules #6

Open TobiasKruse opened 3 years ago

TobiasKruse commented 3 years ago

Following up on the email regarding the data protection rules and deleting IDs. Thanks for deleting the ID column!

We need to explain to the OECD's Data Protection Officer how we will handle IDs in the full survey. He is concerned that IDs could be recovered through data breaches at Dynata, the OECD, or on personal servers, so we need to give him some information.

Once we launch the survey, the responses will automatically appear in our Qualtrics account, from where we then export them, correct? By default, the responses include the ID number at this stage. For the pilot, you deleted this column after downloading the data. One option could be to do the first download of the data, with the IDs, to a secure server. We then delete the ID column there and can use the remaining data on 'normal' servers as well. We also delete the data in the Qualtrics account as soon as we have downloaded it. Does this seem like a feasible approach?

An alternative would be to have the data uploaded to Qualtrics without the ID, if there is a way to do that. This seems more difficult because Dynata needs the information for their payment. But perhaps they do not need to give us access to that ID variable. Do you know if this is feasible? I can email Dynata to ask, but wanted to check with you first in case you have a suggestion.

bixiou commented 3 years ago

We have to save the IDs in Qualtrics for Dynata. But as soon as we download the data, we remove the ID column (I have coded this in an R script). And yes, we can delete the data from Qualtrics servers as soon as the survey is over. Why would we need secure servers? My personal computer is perfectly secure. Think about it: who could be interested in the responses we have from a random 2,000 people? Perhaps people with access to Dynata data. So, people from Dynata. But then, it would be much easier for them to email these people and ask them directly, in a survey, for the information they want. There is zero chance our data will be traced back to Dynata's data and used against privacy rules (someone would need to hack both our PCs and Dynata's for a ridiculously small amount of data, when you can buy online the voting preferences of hundreds of millions of people, from companies like Cambridge Analytica).
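
For illustration only, the "download, then drop the ID column" step could look like the Python sketch below. It is not the R script mentioned above. The function name `strip_id_column` and the column name `ID` are assumptions, and the actual Qualtrics export may label the identifier differently.

```python
import csv
import io


def strip_id_column(raw_csv: str, id_column: str = "ID") -> str:
    """Return the CSV text with the identifier column removed.

    `id_column` is a hypothetical name; adjust it to the real export header.
    """
    reader = csv.DictReader(io.StringIO(raw_csv))
    # Fail loudly if the column is missing, so a renamed ID is never kept by accident.
    if reader.fieldnames is None or id_column not in reader.fieldnames:
        raise KeyError(f"column {id_column!r} not found in export")

    kept = [name for name in reader.fieldnames if name != id_column]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # extrasaction="ignore" drops the ID value from each row.
        writer.writerow(row)
    return out.getvalue()
```

In this sketch, a missing ID column raises an error rather than passing silently, so a change in the export format would be noticed before the file is shared.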