DP-3T / documents

Decentralized Privacy-Preserving Proximity Tracing -- Documents

Question/suggestion: Add provability of at-risk status to security requirements #291

Open KasparEtter opened 4 years ago

KasparEtter commented 4 years ago

Context

I first encountered the "TOTP-based" contact tracing idea after the CodeVsCOVID19 hackathon in the Next Step app by Ubique and thought that this was a really elegant approach. However, I have thought for quite some time that this approach has a serious flaw, so I was curious whether this has been discussed anywhere around the DP-3T initiative in the meantime, but I couldn't find anything in this regard. My still limited understanding of the discussions around DP-3T is mostly based on the white paper (12 April 2020 version) and this response to an external analysis.

Concern

There is a lot of focus on the privacy aspects and on ensuring that only legitimate parties can feed positive COVID-19 cases into the system. Additionally, more and more people (including politicians) are becoming aware that if use of the app shall truly be optional, discrimination against non-users needs to be forbidden by law. (This aspect, including who is liable for the costs of quarantine, also of false positives, is discussed in #224; I haven't read the full issue, though.) So assuming that discrimination is forbidden, employers are forced to respect app-alerted quarantines, and the government pays for the arising costs (such as the loss of income): what prevents a user who doesn't want to go to work from free-riding with a false at-risk claim from a fake app and thereby gaming the system?

Analysis

In my opinion, a user shouldn't be able to claim to be at risk unless the user has indeed been in contact with someone who has been tested positive and whose diagnosis has been entered into the system through one of the legitimate channels. The white paper mentions something in this regard in the paragraph about functional requirements under 1.2:

Integrity: Contact events corresponding to at-risk parties are authentic, i.e., users cannot fake contact events.

However, as far as I can tell, this aspect is only discussed later in the paper from the side of the "infected party" (e.g., relaying EphIDs widely) but not from the side of the "at-risk party". In the "low-cost decentralized proximity tracing" design, anyone can easily modify and then install the open-source app so that it adds fake records with EphIDs derived from the published SKs (see the sketch below). (If my concern is taken seriously, an employer would want to see such records and not just (a screenshot of) the "you are at risk" warning page, which would be even easier to fake, of course. In other words, I assume here that this concern creates problems/doubts in practice and that the official app developers would want to add such a "show records" feature.) I don't understand Cuckoo filters yet, but the second design seems to be much superior in this regard, which should be one of its main selling points over the first design. My current understanding is that a cheating user cannot make up or find EphIDs that are included in the officially published Cuckoo filter unless they learned one. It is still unfortunate, though, that this learning could come from "malicious" sharing of such EphIDs among conspiring users rather than from actual contact (except for one of them).
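To make this concrete, here is a minimal sketch of the low-cost design's EphID derivation as I understand it from the white paper (SK ratcheted with SHA-256, HMAC-SHA256 as the PRF, AES in counter mode as the PRG; the exact constants are my assumption). The point is that anyone holding a published SK can run the same computation and store the resulting EphIDs as fake "contact records":

```python
# Sketch of the low-cost design's EphID derivation (my reading of the
# white paper). Requires the third-party 'cryptography' package.
import hashlib
import hmac

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

def next_day_key(sk: bytes) -> bytes:
    """Daily key ratchet: SK_t = H(SK_{t-1})."""
    return hashlib.sha256(sk).digest()

def ephids_for_day(sk_t: bytes, epochs_per_day: int = 96) -> list[bytes]:
    """All 16-byte EphIDs of a day, derivable by anyone who knows SK_t."""
    prg_key = hmac.new(sk_t, b"broadcast key", hashlib.sha256).digest()
    prg = Cipher(algorithms.AES(prg_key), modes.CTR(b"\x00" * 16)).encryptor()
    stream = prg.update(b"\x00" * (16 * epochs_per_day))
    return [stream[i * 16:(i + 1) * 16] for i in range(epochs_per_day)]

# A cheating app can take any SK published by the backend and store the
# derived EphIDs as if they had been received over Bluetooth:
fake_records = ephids_for_day(next_day_key(b"\x00" * 32))
```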

Mitigation

Besides promoting the second design rather than the first one, a better solution could be to replace hash derivations with digital signatures. The problem with this is probably the constraint mentioned at the bottom of page 9:

Given the completeness requirement, it is necessary that smartphones can observe and record as many EphIDs as possible. This precludes the use of connection-based communication between smartphones, as establishing connections limits the amount of exchanges of EphIDs. Instead, we rely on Bluetooth Low Energy beacons. These beacons’ payload is 16 bytes, which technically limits the size of our system’s EphIDs.

However, instead of using secret sharing to require the detection of several messages over a longer time period, one could probably use error-correcting codes to accommodate longer payloads across several beacons. (Please note that my understanding of Bluetooth is really limited; I'm just thinking aloud here. A toy sketch of the idea follows.)
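Under the assumption that a longer payload (say, a 33-byte compressed public key) is split across several 16-byte beacons, the simplest possible illustration is one XOR parity fragment, which lets a receiver reconstruct the payload even if any single beacon is missed. A real design would use a proper erasure code such as Reed-Solomon to tolerate more losses:

```python
# Toy erasure coding across 16-byte BLE beacons: k data fragments plus
# one XOR parity fragment; any k of the k+1 beacons reconstruct the
# payload. (Illustration only; Reed-Solomon would tolerate more losses.)
FRAGMENT = 16  # BLE beacon payload size in bytes

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(payload: bytes, k: int) -> list[bytes]:
    """Split the payload into k data fragments plus one parity fragment."""
    padded = payload.ljust(k * FRAGMENT, b"\x00")
    frags = [padded[i * FRAGMENT:(i + 1) * FRAGMENT] for i in range(k)]
    parity = frags[0]
    for frag in frags[1:]:
        parity = xor_bytes(parity, frag)
    return frags + [parity]

def decode(received: dict[int, bytes], k: int) -> bytes:
    """Reconstruct from any k of the k+1 fragments (index -> fragment)."""
    missing = [i for i in range(k + 1) if i not in received]
    assert len(missing) <= 1, "this toy code tolerates only one lost beacon"
    if missing and missing[0] < k:  # a data fragment was lost: XOR recovers it
        recovered = bytes(FRAGMENT)
        for frag in received.values():
            recovered = xor_bytes(recovered, frag)
        received[missing[0]] = recovered
    return b"".join(received[i] for i in range(k))

beacons = encode(b"a 33-byte compressed public key..", k=3)
heard = {i: b for i, b in enumerate(beacons) if i != 1}  # beacon 1 missed
assert decode(heard, k=3).rstrip(b"\x00") == b"a 33-byte compressed public key.."
```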

If there is interest, I can elaborate on my signature-based approach. In short, the idea would be to derive public keys from an initial "extended public key", then exchange signatures under keys derived at paths determined by the epoch, and only ever publish the (health-authority-signed) extended public key but never a private key. (See BIP32 for my terminology about key derivation; a rough sketch follows below.) Ideally, the message that is signed during an exchange would include the epoch public key of the recipient. This would then also solve:

Such an “online” relay attack is unavoidable in location tracing systems based on connectionless Bluetooth broadcasts.
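Coming back to the derivation idea, here is a rough sketch of the property that matters: anyone who knows only the master ("extended") public key can derive the epoch public keys and verify signatures, while only the holder of the master private key can sign. This is not real BIP32 (no chain codes, no hardened paths); it just illustrates non-hardened derivation with the third-party `ecdsa` package:

```python
# Simplified, BIP32-style non-hardened derivation on secp256k1 (sketch;
# real BIP32 adds chain codes, serialization and hardened derivation).
# Requires the third-party 'ecdsa' package.
import hashlib
import hmac

from ecdsa import SECP256k1, SigningKey, VerifyingKey

CURVE = SECP256k1
G, N = CURVE.generator, CURVE.order

def epoch_tweak(master_pub: VerifyingKey, epoch: int) -> int:
    """Public, deterministic scalar for a given epoch."""
    mac = hmac.new(master_pub.to_string(), epoch.to_bytes(4, "big"), hashlib.sha256)
    return int.from_bytes(mac.digest(), "big") % N

def epoch_private(master: SigningKey, epoch: int) -> SigningKey:
    """Holder's side: epoch private key = master + tweak (mod n)."""
    tweak = epoch_tweak(master.get_verifying_key(), epoch)
    d = (master.privkey.secret_multiplier + tweak) % N
    return SigningKey.from_secret_exponent(d, curve=CURVE)

def epoch_public(master_pub: VerifyingKey, epoch: int) -> VerifyingKey:
    """Anyone's side: epoch public key = master_pub + tweak*G; no secrets needed."""
    tweak = epoch_tweak(master_pub, epoch)
    return VerifyingKey.from_public_point(master_pub.pubkey.point + G * tweak, curve=CURVE)

# Demo: sign with an epoch key, verify against the derived epoch public key.
master = SigningKey.generate(curve=CURVE)
epoch, message = 1234, b"recipient's epoch public key etc."
signature = epoch_private(master, epoch).sign(message)
assert epoch_public(master.get_verifying_key(), epoch).verify(signature, message)
```

An infected user would then publish the authority-signed master public key instead of a secret key: contacts hold signatures that verify against derived epoch keys, but nobody can forge new ones.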

Last but not least, the employer can be convinced with a simple zero-knowledge proof that the user's app knows the private key to a public key that has been signed with a key derived from an extended public key which has been approved by the health authorities.
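For illustration, this zero-knowledge part could be as simple as a non-interactive Schnorr proof of knowledge of the discrete logarithm (i.e., the private key) of the epoch public key. This is a textbook construction, sketched here with the same `ecdsa` primitives and a Fiat-Shamir challenge bound to a nonce chosen by the verifier:

```python
# Non-interactive Schnorr proof of knowledge of x with P = x*G
# (Fiat-Shamir). Textbook construction; a sketch, not hardened code.
import hashlib
import secrets

from ecdsa import SECP256k1

CURVE = SECP256k1
G, N = CURVE.generator, CURVE.order

def challenge(R, P, context: bytes) -> int:
    digest = hashlib.sha256()
    for point in (R, P):
        digest.update(point.x().to_bytes(32, "big") + point.y().to_bytes(32, "big"))
    digest.update(context)  # e.g. a fresh nonce chosen by the employer
    return int.from_bytes(digest.digest(), "big") % N

def prove(x: int, context: bytes):
    """Prover: demonstrate knowledge of x with P = x*G without revealing x."""
    k = secrets.randbelow(N - 1) + 1
    R = G * k
    return R, (k + challenge(R, G * x, context) * x) % N

def verify(P, proof, context: bytes) -> bool:
    """Verifier: accept iff s*G == R + c*P."""
    R, s = proof
    return G * s == R + P * challenge(R, P, context)

x = secrets.randbelow(N - 1) + 1  # the user's epoch private key
assert verify(G * x, prove(x, b"employer nonce"), b"employer nonce")
```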

Limitations

The counterargument to my suggestion is that users could still share an (extended) private key that has been put at risk. This could only be solved by generating the private keys in a secure element, in combination with direct anonymous attestation. Alternatively, a user would have to register a derivation of their extended public key with their employer at the start (and generate the master key in such a way that it provably includes entropy from the employer; see the sketch below). (And after a quarantine, during which the derived public keys of the employee have become linkable, the two parties would repeat the same procedure to start anew.) Requiring this upfront and explicit mistrust likely makes this approach socially impractical.
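How the master key could provably include employer entropy is my own interpretation, but it could look roughly like this: the user commits to their own key share first, the employer then reveals a nonce, and the registered master public key is the commitment tweaked by a hash of that nonce, which the employer can verify without learning the user's share:

```python
# Sketch: a master key that provably incorporates employer-chosen entropy.
# Ordering matters: the user must commit to U = u*G BEFORE seeing the
# nonce e, otherwise u could be chosen to cancel the tweak out.
import hashlib
import secrets

from ecdsa import SECP256k1

CURVE = SECP256k1
G, N = CURVE.generator, CURVE.order

u = secrets.randbelow(N - 1) + 1  # user's secret share
U = G * u                         # commitment, sent to the employer first
e = secrets.token_bytes(32)       # employer's nonce, revealed afterwards

tweak = int.from_bytes(hashlib.sha256(e).digest(), "big") % N
master_private = (u + tweak) % N  # known only to the user
master_public = U + G * tweak     # recomputable and checkable by the employer

assert master_public == G * master_private
```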

Conclusion

Before we dive deeper into potential solutions and come up with better ones: does my concern make sense to you? I can very well imagine that, after some consideration, this problem has to be declared more of a theoretical than a practical concern and out of scope from a technical point of view. It would still be nice to mention this in the white paper (with the reasoning why this conclusion was reached), though. Maybe this concern can or should only be considered from a legal perspective, and what we have in place against falsification of documents and insurance fraud might already be sufficient to discourage this kind of cheating (as long as people are reminded of it). Please keep in mind that this might not only be a problem in a workplace environment but also in the case of (high-school) students who want to skip classes (where such a transgression might even be seen as a test of courage).

KasparEtter commented 4 years ago

Only after having written all of the above did I stumble upon issue #85. Feel free to close this issue as well, but please consider adding the out-of-scope decision to the white paper (and other documentation). My impression is that issue #85 covers various aspects, and I would have to think more about what exactly those are (also given that some comments seem to have been deleted there). While it only mentions cheating to get access to better health care and rationed tests, this issue adds cheating the employer or school to the list of motivations. And, at least judging from its title, it also focuses on the privacy aspect (not being linkable to the infected SK), whereas I suggested solving the provability problem (allowing a user to prove that they are in the high-risk state) in the first place.

How would you prevent two hackers (one infected, one not) from exchanging credentials (e.g., an infected hacker shares his IDs with all his friends so they can get medical supplies) without doing identity checks / revealing identities to the backend?

Yeah, as written above, this might be a difficult problem to solve satisfactorily, but that shouldn't be a reason not to consider or mention it at all. Cheating would at least require you to know someone who was tested positive recently. Even if many users conspire to cheat the system, the positive test would expire regularly (after two weeks?), and someone new would need to get infected in order to keep the scheme going.

timoll commented 4 years ago

Security isn't black and white. If there is a simple viable attack and a mitigation turns it into a difficult one, then the mitigation should not be refused just because the attack is still possible.

Proving at-risk status has to be an important feature of the system. At-risk users will have to quarantine for a certain amount of time, and legal protections will likely be in place for them. However, we want to prevent abuse of these legal protections, and a mitigation that makes such abuse harder should not be hand-waved away just because it is not perfect.

I already suggested in #266 to not share all data and reserve some for validation.

With an ECDH key exchange (#66), validation could be limited to 2-3 parties, as every at-risk person shares a unique secret with the infected person (see the sketch below).
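A sketch of that property with X25519 (using the third-party `cryptography` package; the framing as beacon payloads is my assumption): if each device broadcasts an ephemeral public key instead of a random EphID, the two devices in an encounter derive a shared secret that nobody else can compute, which can later serve as evidence of that specific encounter:

```python
# Per-contact shared secret via X25519 key exchange (sketch).
# Requires the third-party 'cryptography' package.
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey

# Each device broadcasts the public half of an ephemeral key pair.
infected = X25519PrivateKey.generate()
contact = X25519PrivateKey.generate()

# After an encounter, both sides can compute the same secret; an
# eavesdropper who only saw the two public keys cannot.
s1 = infected.exchange(contact.public_key())
s2 = contact.exchange(infected.public_key())
assert s1 == s2  # unique to this pair of devices
```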

lbarman commented 4 years ago

Hi @KasparEtter, thanks a lot for the very detailed input. I saw that you did catch up with what has been said on the other thread (#85). Indeed, this is slightly out of scope right now, especially since I don't see an easy solution; it adds many interactions and requirements that we'd have to define precisely, and our bandwidth is very limited. I think this sounds like future work :)

timoll commented 4 years ago

Indeed, this is slightly out of scope right now, especially since I don't see an easy solution; it adds many interactions and requirements that we'd have to define precisely, and our bandwidth is very limited.

Wouldn't it be very simple to just publish 48-64 bits of each EphID to everyone and have a server answer validation requests?

Sure, someone who can get access to a full EphID of someone who is at risk would still be able to falsely prove their at-risk status (less so with an ECDH key exchange). But that is still much better than the current protocol.
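A minimal sketch of that split (sizes and names are illustrative, not from the DP-3T spec): the backend publishes only a 6-byte prefix per at-risk EphID for local matching and confirms at-risk status only when presented with a full 16-byte EphID, which only a device that actually received the beacon should know:

```python
# Publishing truncated EphIDs for matching, validating full ones on request.
# Sizes and names are illustrative, not from the DP-3T spec.
PREFIX_LEN = 6  # 48 bits published to everyone

class ValidationServer:
    def __init__(self, infected_ephids: set[bytes]):
        self._full = infected_ephids  # full 16-byte EphIDs, never published

    def published_prefixes(self) -> set[bytes]:
        """What every app downloads for local matching."""
        return {ephid[:PREFIX_LEN] for ephid in self._full}

    def validate(self, ephid: bytes) -> bool:
        """At-risk proof: only holders of a full EphID can succeed."""
        return ephid in self._full

# An app matches its recorded beacons against the prefixes locally and
# only contacts the server when it needs to prove a match, e.g. to an employer.
server = ValidationServer({bytes(range(16))})
recorded = bytes(range(16))  # an EphID heard over Bluetooth
if recorded[:PREFIX_LEN] in server.published_prefixes():
    assert server.validate(recorded)
```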

keugens commented 4 years ago

@KasparEtter:

So assuming that discrimination is forbidden, employers are forced to respect app-alerted quarantines, and the government pays for the arising costs (such as the loss of income): what prevents a user who doesn't want to go to work from free-riding with a false at-risk claim from a fake app and thereby gaming the system?

From DP3T white paper:

A shortcoming in most decentralized proximity tracing systems based on the exchange of BT advertisements between devices is that a malicious party who is willing to modify their app or deploy their own software is able to record a proximity event despite only being in contact for a short amount of time or at a long distance. This violates the requirement that the system provide precise data, i.e. only report exposure events that represent actual physical proximity.

Does the app need to be faked at all? The probability of a being-at-risk notification increases with the number and duration of close contacts between the user's device and other users' devices. Not all users carry their smartphone on their body all the time; sometimes it is left in a bag or a jacket, near other bags and jackets, for example.

And a false being-at-risk report also seems possible without an app: the waitress in a restaurant may have bad handwriting and the date may get confused, for example.

I think no contact logging can be precise in the sense that all reported infection situations are certainly real. Mitigation: before going into quarantine, the user and the health service should have a talk. If the user has no idea about the situation reported by the app, the health service may have the final decision about further actions (quarantine/testing/...).

Dilemma: a privacy-preserving app wants to minimize any data that could uncover the anonymity of the infected person. But to decide whether a situation was real, the user and the health service would like to have more data.