corona-warn-app / cwa-documentation

Project overview, general documentation, and white papers. CWA development ends on May 31, 2023. You can still warn other users until April 30, 2023. More information:
https://coronawarn.app/en/faq/#ramp_down
Apache License 2.0

Broadcasting a list of infected persons is not GDPR-compliant #102

Closed: Covid19Fighter closed this issue 4 years ago

Covid19Fighter commented 4 years ago

Describe the bug

The whole concept of this server is to store a list of infected COVID-19 persons and send it to everyone. I know that the list is supposed to be anonymous, but locally you can match the IDs to a person. Even a beginner software developer will be able to modify the mobile app (open source and an open API are a good thing, but they are transparent and you can modify them) and store GPS coordinates and a timestamp with each key, and since you don't have to upload the modified app to the store, Google will not be able to check this. When the list of infected COVID-19 patients (or their IDs or keys, even if you encrypt everything 100 times) is distributed from the server, the modified app will be able to find out where and when it detected the infected person. If such a modified app is then distributed, you can even create a whole database of keys.

This huge data privacy leak was already mentioned in the DP3T white paper: https://github.com/DP-3T/documents/blob/master/DP3T%20White%20Paper.pdf

"Infected individuals. The centralised and decentralised contact tracing systems share the inherent privacy limitation that they can be exploited by an eavesdropper to learn whether an individual user got infected and by a tech-savvy user to reveal which individuals in their contact list might be infected now. However, the centralised design does not allow proactive and retroactive linkage attacks by tech-savvy users to learn which contacts are infected because the server never reveals the EphID s of infected users."

The so-called "retroactive linkage attacks by tech-savvy users" are a huge problem!
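To make the linkage step concrete, here is a minimal, hedged sketch of how an attacker who has logged Rolling Proximity Identifiers together with time and place could match them against published Diagnosis Keys. The derivation follows the publicly documented GAEN cryptography specification (HKDF-SHA256 and AES-128); the `Sighting` record type and the lookup logic are illustrative assumptions, not anything taken from the CWA code base.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder
import javax.crypto.Cipher
import javax.crypto.Mac
import javax.crypto.spec.SecretKeySpec

// Hypothetical record a rogue logger might have stored for each observed BLE beacon.
data class Sighting(val rpi: List<Byte>, val timestampMs: Long, val lat: Double, val lon: Double)

private fun hmacSha256(key: ByteArray, data: ByteArray): ByteArray =
    Mac.getInstance("HmacSHA256").apply { init(SecretKeySpec(key, "HmacSHA256")) }.doFinal(data)

// RPIK = HKDF-SHA256(tek, salt = empty, info = "EN-RPIK", length = 16), per the GAEN crypto spec.
fun rollingProximityIdKey(tek: ByteArray): ByteArray {
    val prk = hmacSha256(ByteArray(32), tek)                                        // HKDF extract, zero salt
    return hmacSha256(prk, "EN-RPIK".toByteArray() + byteArrayOf(0x01)).copyOf(16)  // HKDF expand, first block
}

// RPI_i = AES-128(RPIK, "EN-RPI" || 0x00 * 6 || ENIntervalNumber_i as uint32 little-endian).
fun rollingProximityId(rpik: ByteArray, enIntervalNumber: Int): ByteArray {
    val padded = ByteArray(16)
    "EN-RPI".toByteArray().copyInto(padded, 0)
    ByteBuffer.wrap(padded, 12, 4).order(ByteOrder.LITTLE_ENDIAN).putInt(enIntervalNumber)
    return Cipher.getInstance("AES/ECB/NoPadding").run {
        init(Cipher.ENCRYPT_MODE, SecretKeySpec(rpik, "AES"))
        doFinal(padded)
    }
}

// For each published Diagnosis Key (TEK plus its start interval), re-derive the 144 RPIs of that
// day and look them up in the locally collected sightings, revealing the time and place of contact.
fun linkSightings(tek: ByteArray, startInterval: Int, sightings: List<Sighting>): List<Sighting> {
    val rpik = rollingProximityIdKey(tek)
    val dayRpis = (0 until 144).map { rollingProximityId(rpik, startInterval + it).toList() }.toSet()
    return sightings.filter { it.rpi in dayRpis }
}
```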

Furthermore, the whole thing is not GDPR-compliant and does not conform to the EU recommendations: https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32020H0518&from=EN

(16) With particular regard to the use of COVID-19 mobile warning and prevention applications, the following principles should be observed: (1) safeguards ensuring respect for fundamental rights and prevention of stigmatization, in particular applicable rules governing protection of personal data and confidentiality of communications; (4) effective cybersecurity requirements to protect the availability, authenticity, integrity, and confidentiality of data; (5) the expiration of measures taken and the deletion of personal data obtained through these measures when the pandemic is declared to be under control, at the latest; (6) uploading of proximity data in case of a confirmed infection and appropriate methods of warning persons who have been in close contact with the infected person, who shall remain anonymous; and (7) transparency requirements on the privacy settings to ensure trust in the applications.

Expected behaviour

No Broadcasting of infected Keys combined with Open Source and Open APIs.

Steps to reproduce the issue

Broadcasting of the Keys

Technical details

All OSs, all versions.

Possible Fix

Do not broadcast the Keys.

MikeJayDee commented 4 years ago

Your premise is incorrect. Only approved apps get access to the API; modified apps would not have access. And even if a modified app had access to the API, it would not have access to the keys stored on the phone.

This means you would need specialised tools (and not an Android phone or iPhone) to sniff the Bluetooth beacons and store them with time and location. This is indeed possible but would require a lot of effort for little gain (assuming that the number of infected persons stays as low as it is now). And it would clearly be illegal.

kbobrowski commented 4 years ago

It is known as a "nerd attack" and will probably be described in the upcoming security documentation. It is also described in https://eprint.iacr.org/2020/399.pdf

Possible solutions include sending encrypted Diagnosis Keys only to "official" apps, or encrypting the Bluetooth communication with a frequently rotating key (which would have to be synchronized between "official" apps). This would make the attack more difficult, but it can still be defeated on the user's side, since the adversary always has access to the assembly code. Even moving the entire app to closed source would not solve the issue (although it would probably decrease the number of people doing this).

@MikeJayDee you can re-implement the Google/Apple API based on the open specification; there is no security coming from the fact that only official apps will have access to it. Regarding Bluetooth beacons: any app on Android that has been granted "Location" access can scan BLE beacons, as sketched below.
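For illustration, a minimal sketch of such a scan, assuming the standard Android BLE APIs and the 16-bit Exposure Notification service UUID 0xFD6F documented in the published GAEN Bluetooth specification. The callback body is a placeholder; at runtime the app would need BLUETOOTH_SCAN (Android 12+) or Location permission (older versions).

```kotlin
import android.bluetooth.BluetoothManager
import android.bluetooth.le.ScanCallback
import android.bluetooth.le.ScanFilter
import android.bluetooth.le.ScanResult
import android.bluetooth.le.ScanSettings
import android.content.Context
import android.os.ParcelUuid

// 16-bit UUID 0xFD6F expanded to the Bluetooth base UUID (per the GAEN Bluetooth spec).
val EN_SERVICE_UUID: ParcelUuid = ParcelUuid.fromString("0000FD6F-0000-1000-8000-00805F9B34FB")

fun startBeaconScan(context: Context) {
    val scanner = (context.getSystemService(Context.BLUETOOTH_SERVICE) as BluetoothManager)
        .adapter.bluetoothLeScanner

    val filter = ScanFilter.Builder().setServiceUuid(EN_SERVICE_UUID).build()
    val settings = ScanSettings.Builder().setScanMode(ScanSettings.SCAN_MODE_LOW_LATENCY).build()

    scanner.startScan(listOf(filter), settings, object : ScanCallback() {
        override fun onScanResult(callbackType: Int, result: ScanResult) {
            // Service data = 16-byte Rolling Proximity Identifier + 4 bytes of encrypted metadata.
            val payload = result.scanRecord?.getServiceData(EN_SERVICE_UUID) ?: return
            if (payload.size < 16) return
            val rpi = payload.copyOfRange(0, 16)
            // A rogue app would persist `rpi` here together with the current time and its own GPS fix.
        }
    })
}
```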

Covid19Fighter commented 4 years ago

I have also opened it on the Google repository, and they are telling me that the hacker should comply with the ToS. I don't think this will do the job. Hackers do not tend to comply with ToS. https://github.com/google/exposure-notifications-server/issues/367

Covid19Fighter commented 4 years ago

And as I told Google, you have to follow the EU recommendations and must prevent such attacks.

Covid19Fighter commented 4 years ago

safeguards ensuring respect for fundamental rights and prevention of stigmatization

kbobrowski commented 4 years ago

Maybe Google/Apple will add some "salt": behavior not documented in the specification of the API, e.g. some key which they rotate through Google Play services / iOS updates and which would be used to encrypt the Bluetooth beacons. I guess we'll see when it is deployed. But as of now I don't see a solution to this problem.

ironjan commented 4 years ago

According to Google's/Apple's FAQ, point 10, location access is disabled when accessing the contact tracing API:

The criteria are detailed separately in agreements that developers enter into to use the API, and are organized around the principles of functionality and user privacy. There will be restrictions on the data that apps can collect when using the API, including not being able to request access to location services, and restrictions on how data can be used.

kbobrowski commented 4 years ago

@ironjan you can re-implement the entire app (actually you only need to re-implement the Google/Apple API, the rest is open source) and attach whatever you want to each received beacon. Then it would tell you when and where the user was in contact with an infected person. Of course this "enriched" app will need Location access.

ironjan commented 4 years ago

Addition: this restriction would not prevent a "multi-app" scenario, i.e. a developer having their own contact tracing app and a second location-tracking app on their phone. However, modifying the CWA app to attach the location is not possible.

MikeJayDee commented 4 years ago

@kbobrowski Indeed, you can do this, but not as described in the issue: the issue suggests the API can be used to abuse the system, and that is simply not possible.

kbobrowski commented 4 years ago

@MikeJayDee I'm not sure whether @Covid19Fighter meant to restrict the issue only to the problem of a modified app with the original API implementation; I cannot tell from the way the issue is formulated. You are of course right that a modified CWA will not have access to the implementation of the API inside Google Play services. But re-implementing the original API is just a small technicality. A modified app (with its own API implementation) will of course not be deployed through Google Play, since such efforts would surely be strictly moderated by Google, so the threat is that it could be distributed as an APK. Another way to distribute it is as a trojan horse bundled with some other app update (the app would need "Location" access, related discussion: #76). Anyway, I think the response from CWA regarding this attack vector will be described in the upcoming security documents.

ironjan commented 4 years ago

As far as I understood the FAQ, it's either contact tracing xor location access within one app.

However, if a developer is using two or more apps in combination, a matching could be possible. That is, a developer needs (a) an app to collect keys and their timestamps (a "modified CWA"), (b) an app to collect locations and their timestamps, and (c) a way to map location+time to a person. Since (c) is quite easy, the combination of all three could perhaps be a very local attack vector.
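As a minimal sketch of the (a)+(b) step, two independently collected, time-stamped logs can be joined purely by timestamp. The record types and the one-minute tolerance are illustrative assumptions, not part of any existing app.

```kotlin
import kotlin.math.abs

// Illustrative record types for the two hypothetical logs described above.
data class KeySighting(val rpi: List<Byte>, val timestampMs: Long)
data class LocationFix(val lat: Double, val lon: Double, val timestampMs: Long)

// Pair each observed key with the location fix closest in time, if one exists within the tolerance.
fun correlate(
    sightings: List<KeySighting>,
    fixes: List<LocationFix>,
    toleranceMs: Long = 60_000
): List<Pair<KeySighting, LocationFix>> =
    sightings.mapNotNull { s ->
        fixes.minByOrNull { abs(it.timestampMs - s.timestampMs) }
            ?.takeIf { abs(it.timestampMs - s.timestampMs) <= toleranceMs }
            ?.let { s to it }
    }
```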

Covid19Fighter commented 4 years ago

My opinion is that broadcasting any kind of critical medical information to millions of devices is complete madness. This is the opposite of data privacy. Google and the Swiss guys are saying decentralized = data privacy. This is wrong: you don't want to expose any kind of medical information to millions of devices. No one can keep track of how this information will be used. This is why the main concept of data privacy is the minimization of critical data. Broadcasting it to everyone is just the opposite.

The match should take place on a secured, auditable component that matches the list of infected IDs against the IDs of the mobile phones. This component should not persist the requests, only answer them. It is the same concept you use in telecommunications: you have call records, but you don't store them, or you delete them following the legislation. The mobile antennas (which are servers) also receive the data of the mobile phones, but they don't store it. Centralized with storage is not good, decentralized is not good; a secure, auditable, centralized component with no request storage is the correct technology here.
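A minimal sketch of the kind of stateless matching component described above, assuming a simple in-memory service; the class name and string identifiers are purely illustrative and not part of any CWA design.

```kotlin
// Holds only the set of identifiers reported as infected; client requests are answered
// and then discarded, never logged or persisted.
class MatchingService(private val infectedIdentifiers: Set<String>) {

    /** Returns only whether any submitted identifier matches; the input itself is not stored. */
    fun checkExposure(observedIdentifiers: Collection<String>): Boolean =
        observedIdentifiers.any { it in infectedIdentifiers }
}
```

Whether the operator of such a component can be trusted not to log the queries is exactly the centralized-versus-decentralized trade-off debated elsewhere in this thread.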

andre0707 commented 4 years ago

You cannot just build your own local app version with location tracking. See the documentation:

Before you can develop an app that uses ExposureNotification, you need the com.apple.developer.exposure-notification entitlement. For more information on this entitlement, see Exposure Notification APIs Addendum. To get permission to use this entitlement, see Exposure Notification Entitlement Request.

https://developer.apple.com/documentation/exposurenotification

MikeJayDee commented 4 years ago

When you use the API you cannot get the timestamps of the received rolling proximity identifiers. In fact, you cannot get any rolling proximity identifiers received from the API. You would have to collect the rolling proximity identifiers completely independently of the API or the app. Which is possible, but a different concern from the one you continue to raise.

sventuerpe commented 4 years ago
  1. @Covid19Fighter: I understand your attack scenario as follows: The attacking device running some software participates in the contact tracing protocol just like the genuine, unmodified app. However, the attacking device also collects and stores additional data pertinent to itself such as its own location, and relates these data to identifiers exchanged with other devices via BLE. At a later time, the attacking device receives exposure notifications and uses its local data collection to infer information about the infected person. Is this correct or do you have a different scenario in mind?

  2. Everyone: Are there safeguards against rogue app instances existing entirely outside the iOS or Android ecosystem? In other words, could one develop an independent CWA clone running on a Linux box or similar and participate in the device-device and device-server protocols unhindered and undetected? Mobile OS access control and permissions would be ineffective in this case, and the attacker would be the rogue app user.

Covid19Fighter commented 4 years ago

Yes, this is what I had in mind. You could even create it as an Android app; you don't have to put it on the store.

Covid19Fighter commented 4 years ago

I am also asking Google; they are very active on my question, but I am not getting an answer I can really verify: https://github.com/google/exposure-notifications-server/issues/367 For me, there are only two possibilities:

  1. This is a major attack vector for identifying infected persons locally.
  2. The app is only a kind of facade and everything is happening beneath the app, at the operating-system level, protected and not open source. In this case, the concept of an open-source app would only be a marketing thing, because the data is processed by Google and Apple software that you cannot review.

Covid19Fighter commented 4 years ago

I am not sure that sending a lot of critical medical information to millions of devices running on non-transparent software created by Google (https://www.nytimes.com/2019/01/21/technology/google-europe-gdpr-fine.html) is the kind of European GDPR compliance we are expecting here.

But, as you are the ones broadcasting the medical information from your server, you should be the ones concerned about it, I think.

5% of SAP and DTAG income is a lot of money.

mh- commented 4 years ago

@Covid19Fighter Do you realize that you can edit your own comment to add your new thoughts, using the "..." icon on the top right of the box, instead of posting a new one?

And yes, of course it's possible to re-implement this on a Raspberry Pi, for example, attach a GPS mouse, and walk around collecting RPIs from BLE. And I don't think it's possible to protect RPIs that are distributed to millions of devices from leaking, not even through Device Attestation.

But I don't think this is a real-life problem, because users who do not want this will not upload their RPIs anyway.

Covid19Fighter commented 4 years ago

@mh- Thank you for the re-edit hint. I have restructured.

The problem here is that the GDPR is really clear on the right to have data deleted, and this is impossible if you are broadcasting critical data to millions of devices. This is why you always try to keep copies of critical data to a minimum. Every hacker or data protection person knows this. I really don't get how this concept got this far without someone complaining in Germany. In France the government has complained about it.

mh- commented 4 years ago

@Covid19Fighter Do you believe that when you (Covid19Fighter) send a personal e-mail to millions of people, the GDPR gives you (Covid19Fighter) a right to demand that every single recipient must forget that e-mail?

My point is that you should not send that e-mail if you don't want it to be received. If you do not want your RPIs to be received, then you should not upload them.

Covid19Fighter commented 4 years ago

You are not a server sending millions of pieces of third persons' private data by e-mail. If you did this, yes, the GDPR would require it. This is why you have AVVs (data processing agreements) and you don't use e-mails for this kind of communication.

And yes, you are sending your data because you think this will happen in a secure way and no one will be able to know you were the one infected. If this is not the case, I am afraid people will not do so. If you told people "if you are infected and you tell us, we are going to send an e-mail telling everyone you are infected", people would not use the app.

mh- commented 4 years ago

Whether or not the privacy trade-off this system is offering is good enough (in my opinion: yes) is a different discussion, and I don't think we will ever agree on this. I just think you are making false assumptions about the GDPR and the Google/Apple Exposure Notification concept. The system will NOT be "broadcasting a list of infected persons".

sventuerpe commented 4 years ago

@Covid19Fighter Thanks for clarifying. I see a few factors and constraints on the attacker’s part that should be considered in risk assessment:

  1. Operation of a single rogue device affects at most those app users coming into physical proximity of that device at some point. The attack does not reveal information about other users.
  2. As a second necessary condition, one of those persons needs to initiate notification at a later time.
  3. Movement of the device is subject to the common constraints of spacetime and transportation as we know it. Travel incurs some cost in time and energy.
  4. One attacker may deploy multiple rogue devices, at some cost (hardware, installation, risk of detection) per device.
  5. Trade-off: Where lots of people pass by the device (e.g., at the entrance or platform of a busy train station), the chance is low for each encounter to actually register as a contact. Where people remain stationary long enough (e.g., a seating area on an aircraft or in a concert hall), a rather small number of people is affected.

What is the best-case profit scenario from an adversary’s perspective? How much can they gain from the attack (a) using a single device or (b) deploying an infrastructure of rogue devices?

As regards possible GDPR violations, I understand the Federal Commissioner for Data Protection and Freedom of Information is being kept in the loop and will have a chance to review the solution before release.

Covid19Fighter commented 4 years ago

@sventuerpe: Yes, I am sure the Commissioner will, and I hope he understands the problem here. This is why I am opening an issue; he can then take a look at it.

@mh-: The system is broadcasting a list of IDs of infected persons that may locally be used to identify the person. For me this is the same thing. IDs are IDs, and if you can join them with personal data you should handle them with care.

As I wrote in the other threads, I am not against an app. I am against an app that is being developed against common sense and basic rules, and based mostly on assumptions rather than scientific data. I think you can create a pretty good solution if you stick to physics and science and follow the usual security rules. I am here to try to change that with the people involved, but if they continue closing my tickets without any kind of quality check, I will write a proper article on this. This way the Commissioner and the people using the app will know what the real problems are. If I am not doing this yet, it is because I think the app is necessary and the CCC people did enough bad publicity.

And to the people who think we should keep these problems non-public so that the public does not stop using the app, I can only say: this is not only illegal, it will not work. Not because of me, but because if this app does not solve these problems, at some point people will know (too many persons, too many hackers), and then you will never get their trust back.

tkowark commented 4 years ago

Hi @Covid19Fighter,

thanks for bringing that point up, and also for the lively discussion. Regarding the GDPR, please bear in mind that this channel is mostly targeted at solving technical issues, and the developers involved are not the right people to give legally binding answers.

Can we kindly ask you to send your concrete concerns to privacy@sap.com? In the meantime, we will also prepare official statements and add them to the FAQs of the page. Until we have such a statement, we will also lock this issue against further discussion.

Mit freundlichen Grüßen/Best regards, TK Corona Warn-App Open Source Team

SebastianWolf-SAP commented 4 years ago

We have made available the privacy notice for the Corona-Warn-App that will be published with the App in German and English. This privacy notice explains what data is collected when you use the Corona-Warn-App, how that data is used, and your rights under data protection law. We will therefore close this issue.

Mit freundlichen Grüßen/Best regards, SW Corona Warn-App Open Source Team