Open Whathecode opened 9 months ago
More risky could be when DeviceRegistration would be used to connect to third-party services, such as Google Fit, to retrieve sensor data.
From this, I currently conclude that the device registration for such a device shouldn't contain the username of the account, and should instead rely on a UUID (so probably just DefaultDeviceRegistration), and handle the linking of that id to a setup enabling authentication in the application/infrastructure layer (outside of core). Or, it could store a token instead.
Either way, the general point is that care needs to be taken when designing new DeviceRegistration types to adhere to the data minimization principle, and to minimize the risk of direct identification of individuals for any data stored in the deployments subsystem.
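To make the UUID-based option concrete, here is a minimal sketch. It is written in Python for brevity and is not CARP core's actual API: `DeviceRegistration` is reduced to a single field, and `AccountLinkStore`, `register`, and `resolve` are hypothetical names for whatever the application/infrastructure layer would provide.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DeviceRegistration:
    """Hypothetical stand-in for what the deployments subsystem stores:
    an opaque id and nothing that identifies the account."""
    device_id: str


@dataclass
class AccountLinkStore:
    """Hypothetical store living in the application/infrastructure layer
    (outside of core). Only this store can map a device id to an account."""
    _links: dict = field(default_factory=dict)

    def register(self, username: str) -> DeviceRegistration:
        # Generate a random UUID; the username never enters the registration.
        device_id = str(uuid.uuid4())
        self._links[device_id] = username
        return DeviceRegistration(device_id)

    def resolve(self, registration: DeviceRegistration) -> str:
        # Re-identification is only possible with access to this store.
        return self._links[registration.device_id]


links = AccountLinkStore()
registration = links.register("jane@example.com")
```

With this split, the registration handed to the deployments subsystem carries only the UUID, while the mapping needed to authenticate against the third-party service stays with whoever hosts the application layer.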
But, @bardram, I believe the following conclusion is still spot on:
This may make the idea of hosting the studies and deployments subsystems by separate organizations without a legal binding contract to address these issues hard to achieve, unless some further design or infrastructure work is done. E.g., encrypting DeviceRegistration and ParticipantData.
It seems likely enough that some data stored in the deployment subsystem would be classified as PII, so stuff like GDPR kicks in. Without having the application layer fully handle encryption of said data, that would make the deployments subsystem in CARP core a data processor.
But, none of this is a concern for the current release, and this subsystem isn't deployed separately yet either way (other subsystems, like studies, obviously will always have PII), so we can consider this a theoretical exercise until the point when this becomes an actual requirement. Therefore, I'll remove resolving this from the next milestone.
Overall, the deployments subsystem does a good job of not including PII in its subsystem. The link to an actual account which allows for re-identification happens in the study subsystem. A potential aim could be to be able to claim that the deployments subsystem does not store PII, so that hosting can be outsourced to a third party with less legally binding requirements.
However, there is data stored in the deployment subsystem which can carry PII. Concretely:
- ParticipantData. By design, the goal is for this data not to be PII (when considering all potential users using the platform). But, the researcher may request PII data, or inadvertently, a combination of requested data may make re-identification possible.
- MACAddressDeviceRegistration is used to store connection information for BLE devices. This is useful, as researchers can pre-register devices when handing them out to lend to study participants. But, in the case where participants use their own devices, they are uniquely linked to a specific participant.

The latter issue (DeviceRegistration) may not be a big issue when considering the scope of people to consider during reidentification. It may be easy to get a MAC address of a specific individual you are targeting (simple BLE scanning), but it's not trivial to scan every potential person who may have data in the deployments subsystem. More risky could be when DeviceRegistration would be used to connect to third-party services, such as Google Fit, to retrieve sensor data.

This may make the idea of hosting the studies and deployments subsystems by separate organizations without a legal binding contract to address these issues hard to achieve, unless some further design or infrastructure work is done. E.g., encrypting DeviceRegistration and ParticipantData.