
A Privacy Concern and Requirement on Verification Process #223

Open auschmidt opened 4 years ago

auschmidt commented 4 years ago

I realize this is somehow re-stirring the discussion of #147 initiated by @togermer. I don't think a resolution was achieved there. In #147 a proper, privacy-protecting approach was also described, which goes back to DP-3T development and exists as a PoC. The present contribution corroborates the issue, aims to add to the discussion, and presents further examples of approaches to resolution.

Threat Description

By the architecture and protocol design of the verification process, both the Verification Server (VS) and the Corona-Warn-App Server (CWAS) can identify the Corona-Warn-App (CWA) instance uniquely within the protocol, since the TAN is unique – which is of course necessary at some point to effect the verification. If VS and CWAS are under the authority of, or directly associated with, the mobile network operator (MNO), then the MNO can collude with CWAS or VS, respectively, to uniquely identify a user submitting a valid (TAN, Diagnosis Key (DK)) pair, i.e., they can identify infected persons. The ensuing privacy threat is described in the GDPR DPIA of CWA, attacks A3 and B4.

Privacy Rationale and Motivation

The threat is not minor in practical terms, since identification of positively tested users could be done by the MNO automatically, on a mass scale and in very short time, for instance on an administrative order. It cannot be assumed that infected users are aware of, or would consent to, their being identified by the MNO, nor can it be assumed that their identities are somehow common knowledge in the CWA system. Unlike the public health authorities, the MNO is neither legally nor otherwise entitled to obtain such data, not even by co-operating with these authorities in the CWA system. The scale at which personal data can proliferate also differs between a large number of regional health authorities holding them and a single MNO having central access to them. For the sake of the common and shared goal of broad acceptance of the CWA, I kindly ask the implementors to take this concern seriously.

The issue lives at the intersection of technical design and real-world impact. Some people may be more seriously concerned about it than others, but it is not clear that it would be of negligible concern to everyone. The privacy implications of CWA are discussed in many different places, such as the very insightful and recommendable Forum Privatheit, but that discussion is out of scope here. The explicit and sole intention of this contribution is to foster adoption of the CWA, as currently implemented.

Threat Mitigating Requirements

The threat can be mitigated to an (IMHO) acceptable level by following minimal-need-to-know design principles in the verification process and by modifying the separation of tasks between CWAS and VS slightly, so as to fulfill the following two requirements, which are raised here as a privacy requirement on the verification process:

A. The verification process shall be such that one of the server entities, CWAS or VS (preferably CWAS), is not able to easily identify the CWA instance involved in the protocol.

B. The other entity (preferably VS) shall operate under the authority of a separate trusted third party (TTP), not associated with the MNO.

Note that A is an enabler for B; only if both are implemented is the threat mitigated at the architecture/protocol level. The designers have taken the decision that "Deutsche Telekom is providing the network and mobile technology and will operate and run the backend for the app [...]". The backend includes CWAS and VS, and this may thus rule out implementing B. In that case the examples of alternative processes become moot, but the privacy concern persists.

However, operating VS and CWAS under the same authority, affiliated with one particular MNO, is just a design decision which can be discussed like any other. From @tklingbeil 's answer to #206 there clearly is awareness of, and aptness to apply, the principles of separation of duties and minimal-need-to-know. But with the above decision, the designers have not fully done so. Institutional controls, for instance included in a law governing specifically the CWA, may be another way to alleviate the concern. But then it remains a fact that this design decision necessitated that additional institutional control. This is exactly what security-by-design approaches try to avoid or at least minimize.

Mitigating Alternative Processes

To exhibit the fundamental possibility of mitigating the privacy concern by modified verification protocols and/or architectures, so as to satisfy requirement A above (and then, of course, to implement B), four examples are presented in the following. These examples are just examples and are not meant as suggestions or proposals for implementation. The first is a protocol variant that tries to stay as close as possible to the existing one and to minimize modifications (although, unfortunately, it requires changes at all involved entities: CWA, VS, and CWAS). The second introduces a further indirection for stronger task separation, such that VS never receives DKs. The third and fourth – a blunt architecture change, and a hint at a de-centralization option – are only superficially alluded to.

Note that, obviously, there are countably infinitely many other variants of architecture and protocol satisfying the same functional requirements, with different security properties and privacy implications. Before beginning, some initial considerations are in order. The functional requirement of the verification process is:

Precisely the DKs of all valid submissions shall be placed in the CWAS database.

Applying the minimal-need-to-know principle to this statement immediately entails that CWAS does not need to be able to associate the submitted DK with the unique CWA instance during the run of the verification procedure. Only VS requires this. As long as CWAS receives all and only the valid DKs from some entity, the desired function is fulfilled.

Example 1. Protocol Modification

The aim of the protocol modification is to disassociate the CWA instance from the run of the verification protocol, from the viewpoint of CWAS. This is achieved by making the protocol stateless between VS and CWAS and by camouflaging, for CWAS, all data which could be associated with the CWA instance. Even if that is achieved, CWAS is still in the position of an attacker who could apply traffic analysis (in this case, of its own protocol communication) to re-associate a DK received from VS with the particular CWA instance which submitted it. To obstruct this, it is assumed that CWA instances submit sufficiently many decoy messages to CWAS. The protocol is kept to the simplest bare minimum.

Entities and roles:

The protocol description starts at, and replaces, step 13 in Figures 3 and 4 of the architecture documentation:

  1. VS sends an ephemeral key e and a token t to CWA. VS rotates e, t after N, M protocol runs.
  2. CWA encrypts DK with e to Enc_e(DK).
  3. CWA sends Enc_e(DK) and t to CWAS.
  4. CWAS sends Enc_e(DK) and t to VS.
  5. VS validates t (whether it matches one of its issued t), finds the associated e, and
  6. waits for a random number of further messages from CWAS, possibly including decoy messages,
  7. decrypts DK, and
  8. sends DK to CWAS.

[Figure cwavprot: message flow of the modified protocol]

Some remarks on the properties of the protocol, and further variants which may improve security and other properties (a minimal code sketch of the message flow follows this list):
    • CWAS cannot directly associate a DK received from VS with the previously submitted Enc_e(DK). However, it could do so if the response from VS were received immediately. To impede this, step 6 is introduced. If there is a continuous stream of decoy submissions by CWAs, CWAS will then only be able to associate an answer from VS with any of the previously received submissions with low probability, which decreases as more submissions accumulate. Note that only a random waiting period by VS is necessary, not a fixed minimum, as this suffices to hide the valid submission in question in the stream of all submissions. Therefore the delay necessarily introduced by step 6 may also be acceptable.
    • Rotation of e, t after N, M operations of VS with CWAs may, if N, M >> 1, serve to make it difficult even for VS to associate a DK with the specific CWA which submitted it. However, this compromises the desired feature of double-spending protection (one-time token), which is restored if N or M equals 1. If both properties are desired, double-spending protection can still be achieved independently by introducing another TTP intermediary who adds a one-time token to the submission data and with whom VS checks the freshness of said token. Furthermore, rotation of e, t can be forced on VS by introducing another TTP which provides these data (possibilities for indirections are infinite).
    • DK may be end-to-end encrypted from CWA toward CWAS, so that VS also may not know them.
    • Although not explicitly formulated, the encryption and decryption with e are treated as symmetric herein. Asymmetric encryption can of course be used instead.
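
To make the step sequence above concrete, here is a minimal, purely illustrative sketch in Python. It is not a proposal and does not reflect the actual CWA code base; as assumptions of mine, Fernet from the `cryptography` package stands in for "encryption with e", in-memory dictionaries stand in for the VS token store and the CWAS database, and the random waiting and decoy traffic of step 6 are omitted.

```python
# Sketch only: illustrates the message flow of Example 1, not the CWA implementation.
import secrets
from cryptography.fernet import Fernet

class VS:
    def __init__(self):
        self.tokens = {}  # token t -> ephemeral key e

    def issue(self):
        """Step 1: hand an ephemeral key e and a one-time token t to the CWA."""
        e, t = Fernet.generate_key(), secrets.token_hex(16)
        self.tokens[t] = e
        return e, t

    def validate(self, t, enc_dk):
        """Steps 5 and 7: check t, find the associated e, decrypt DK.
        The random waiting period of step 6 is omitted in this sketch."""
        e = self.tokens.pop(t, None)   # consume the token: double-spending protection
        if e is None:
            return None                # unknown or already-spent token
        return Fernet(e).decrypt(enc_dk)

class CWAS:
    def __init__(self, vs):
        self.vs = vs
        self.db = []

    def submit(self, enc_dk, t):
        """Steps 4 and 8: forward the opaque pair to VS, store the verified DK."""
        dk = self.vs.validate(t, enc_dk)
        if dk is not None:
            self.db.append(dk)

# CWA side (steps 1-3): obtain e and t, encrypt the DK with e, submit to CWAS.
vs = VS()
cwas = CWAS(vs)
e, t = vs.issue()
cwas.submit(Fernet(e).encrypt(b"diagnosis-key"), t)
assert cwas.db == [b"diagnosis-key"]
```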

Example 2. DK Filter Intermediary

To allow for a further indirection which effects a stricter separation of tasks, such that VS never receives DKs, not even in encrypted form, an intermediary, herein called DK Filter Entity (DKFE), may be introduced. The sole task of DKFE is to provide CWAS with precisely all valid DKs. As a prerequisite, it is assumed that DKFE has a pre-established trust relationship with VS. As a concrete example, it is assumed that VS and DKFE share a cryptographic key derivation function, KDF. This may for instance be a common function such as PBKDF2, and the trust relationship may be represented by a shared password P as input. The password P and some other data element shall be the input to KDF, for instance by concatenating them or by using the other data element as the salt input to PBKDF2.

VS and DKFE also share a dictionary Dict of validation words w. Validation words are chosen at random by VS from that dictionary, used in each validation procedure only once, and then deprecated by removing them from the dictionary, which effects double-spending protection. VS and DKFE may need a procedure to refresh their shared dictionary when it approaches exhaustion.
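
As an aside, a tiny sketch of how VS and DKFE could derive matching keys from the shared password P and a token, using Python's standard-library PBKDF2. The helper name derive_key, the iteration count, and the token values are placeholders of mine, not part of any CWA design.

```python
# Sketch of the "one key, one purpose" derivation assumed above: the shared
# password P and a per-run token feed PBKDF2, with the token used as the salt.
import hashlib

def derive_key(P: bytes, token: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256 with the token as salt; the iteration count is arbitrary here.
    return hashlib.pbkdf2_hmac("sha256", P, token, 100_000)

P = b"shared secret between VS and DKFE"
et = derive_key(P, b"transport-token")   # transport key
ev = derive_key(P, b"validation-token")  # validation key
```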

Now the procedure may run as follows (a minimal sketch of the DKFE-side logic is given after the list).

  1. VS randomly creates two tokens, a transport token tt and a validation token vt. VS chooses w from Dict at random. VS creates two keys, a transport key et and a validation key ev, as et=KDF(P,tt), ev=KDF(P,vt) (one key, one purpose). VS encrypts w with ev to Enc_ev(w).
  2. VS sends ( tt,vt,Enc_ev(w), et ) to CWA.
  3. CWA encrypts DK with et to Enc_et(DK).
  4. CWA sends ( tt, vt, Enc_ev(w), Enc_et(DK) ) to CWAS.
  5. CWAS sends ( tt, vt, Enc_ev(w), Enc_et(DK) ) to DKFE.
  6. DKFE computes ev=KDF(P,vt), decrypts Enc_ev(w) with ev, and looks up w in Dict. If w is found, DKFE computes et=KDF(P,tt), decrypts DK with et, deprecates w from Dict,
  7. waits for a random number of further messages from CWAS, possibly including decoy messages,
  8. sends DK to CWAS, and
  9. sends vt to VS.
  10. VS, upon receiving vt from DKFE, finds the associated w and deprecates it from Dict.

[Figure cwaprot2: message flow of the DKFE variant]
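
Again purely for illustration, a minimal sketch of the DKFE-side logic of steps 5 to 10, under assumptions of mine that are not in the original: derive_key is the PBKDF2 helper from the previous sketch (repeated here so the block is self-contained), the encryption of w and DK is a simple XOR over the derived keys for brevity, and the random waiting and decoy traffic of step 7 are omitted.

```python
# Sketch only: DKFE validation logic for the Example 2 variant.
import hashlib

def derive_key(P: bytes, token: bytes) -> bytes:
    return hashlib.pbkdf2_hmac("sha256", P, token, 100_000)

def xor(data: bytes, key: bytes) -> bytes:
    # Toy stand-in for the encryption with et/ev; not a real cipher.
    return bytes(d ^ k for d, k in zip(data, key))

class DKFE:
    def __init__(self, P: bytes, dictionary: set):
        self.P = P
        self.dict = set(dictionary)      # shared Dict of validation words

    def handle(self, tt, vt, enc_w, enc_dk):
        """Step 6: recompute ev, recover w, check Dict, then recover DK."""
        ev = derive_key(self.P, vt)
        w = xor(enc_w, ev)
        if w not in self.dict:
            return None, None            # invalid or already-spent word
        self.dict.discard(w)             # deprecate w: double-spending protection
        et = derive_key(self.P, tt)
        dk = xor(enc_dk, et)
        return dk, vt                    # step 8: DK to CWAS; step 9: vt back to VS

# VS side (steps 1-2), compressed: pick w, derive keys, encrypt for the CWA.
P, w = b"shared secret", b"validation-word!"
tt, vt = b"transport-token", b"validation-token"
et, ev = derive_key(P, tt), derive_key(P, vt)
dkfe = DKFE(P, {w})

# CWA side (steps 3-4) and CWAS forwarding (step 5), compressed into one call.
dk, returned_vt = dkfe.handle(tt, vt, xor(w, ev), xor(b"diagnosis-key!!!", et))
assert dk == b"diagnosis-key!!!" and returned_vt == vt
```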

Example 3. VS as Data Gateway

In a different split of the separation of tasks, VS may act as a gateway that filters data for CWAS. This way, CWAS would have no contact with a submitting CWA during the verification. That is, CWA submits DK together with the RegToken to VS, who forwards it to CWAS. However, a different design decision has been taken, perhaps with the intent of not letting VS receive DKs at any stage, in any way or form (perhaps not even encrypted). This was raised as a question in issue #206, and the answer provided thereto shows that this strict task and data separation is what the designers had in mind. Alternatives respecting the latter strong requirement have been sketched above.

Example 4. Completely De-Centralized Verification

It is clear that the verification task can also be done by the CWA itself. In the simplest case, each CWA would receive from CWAS all submitted DKs, calculate a certain number of TEKs, and look for a match. Although this certainly entails computational burden on the CWA, database load at CWAS, and communication overhead, since all decoy and rogue submissions also need to be stored, transmitted, and checked by the CWAs, it is not clear how much impact that would have in practice. Also, in a de-centralized solution, there might be ways to reduce the overhead, for instance by the CWA submitting received TEKs together with the DK, to facilitate matching.
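
A rough sketch of what such a fully local check could look like. The derivation of rolling identifiers from a DK is replaced here by a plain SHA-256 construction purely for illustration (it is not the Exposure Notification derivation), and all names are hypothetical.

```python
# Sketch only: each app downloads all published DKs and matches locally.
import hashlib

def identifiers_for(dk: bytes, periods: int = 144) -> set:
    # Hypothetical stand-in for deriving rolling identifiers from a key.
    return {hashlib.sha256(dk + period.to_bytes(2, "big")).digest()
            for period in range(periods)}

def local_match(published_dks: list, observed_ids: set) -> bool:
    # The app checks every published DK against identifiers it recorded itself.
    return any(identifiers_for(dk) & observed_ids for dk in published_dks)

# Usage: an app that recorded one identifier derived from a published DK matches.
dk = b"some-published-diagnosis-key"
seen = {next(iter(identifiers_for(dk)))}
assert local_match([dk], seen)
```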

Final Remarks / Disclaimer

Let me make it clear herein – not that it should be necessary for evaluating the foregoing technical contribution – that this author is in favor of broad use of the CWA. In fact, I came across this issue while writing a statement to urge people to use the CWA, with data protection in mind. However, the state of the matter for the CWA user right now is: “If You press this button, the MNO will very easily be able to know You are infected.” To me, this is annoyingly detrimental to public acceptance of the CWA.

Let me also make it clear that I do not, in any way, insinuate any intent of the authors of CWA regarding this issue, quite the contrary. I think the design decisions in this case were simply due to a design pattern with task separation in mind and thus a direct mapping of the well-known TAN procedures to the functional task, and maybe by that, a slight oversight of the minimal-need-to-know principle.

The content of this contribution reflects solely the author’s views and not the views of any organization the author is affiliated with. The author has no particular interest, financial or otherwise, other than those explicitly stated herein. The technical procedures provided herein in no way constitute suggestions, proposals, or requests for implementation from the author. The technical procedures described in this contribution are, to the best knowledge of the author, not protected as intellectual property by anyone in any form.


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.



Internal Tracking ID: EXPOSUREAPP-1919

mh- commented 4 years ago

@auschmidt You state that "If VS [...] are under authority of or directly associated with the network operator (MNO), MNO and [...] VS can collude to [...] identify infected persons."

I think this is correct (even though I personally do not think that it will happen) - but I don't see how the proposed protocol modifications could prevent this.

From here we know that "Deutsche Telekom is providing the network and mobile technology and will operate and run the backend for the app [...]". (Backend being both VS and CWAS.) Given these prerequisites, do you think that a protocol modification could securely prevent user identification assuming a malicious collusion of e.g. Vodafone as MNO and Deutsche Telekom as operator of VS and CWAS?

IMHO institutional controls will have to prevent this.

auschmidt commented 4 years ago

@mh- Of course the example protocol will only alleviate the privacy concern if my requirement B (separate out VS and operate it elsewhere) is also implemented. Operating VS and CWAS under the same hood, affiliated with one particular MNO, is just a design decision, not a "prerequisite", and it can be discussed like any other. From @tklingbeil 's answer to #206 I can see that there is awareness of, and aptness to apply, the principles of separation of duties and minimal-need-to-know. But with the above decision, the designers have not fully done so.

Now we have to discuss what may or may not happen in the real world, because that is what this is about and that is where privacy lives. You may have good reasons to "not think that it will happen" from Your experience, but others may think differently from theirs; see for instance here. As clearly stated, I am interested in the highest possible adoption rate of the CWA; therefore I am raising the issue so that the concern can be alleviated by real-world measures. Institutional controls may or may not be strong enough; they could be part of a law, for instance, which would be good. But then it remains a fact that a design decision by the implementors necessitated that additional institutional control. This should be avoided in a security-by-design approach. As shown, there are alternatives.

Further, please note that the protocol examples provided do not constitute a "proposal", as You say; they are just examples of alternatives, as I clearly state in the final remarks.

fwagner5 commented 4 years ago

Just a small clarification: the ISP / MNO for Germany is Telekom Deutschland GmbH, while backend operations are done by T-Systems International GmbH - two separate legal entities. And yes, both companies are part of the Deutsche Telekom Group. But you do have another layer of segregation of duties there.

sventuerpe commented 4 years ago

@auschmidt If the MNO has no legal basis for the processing of the data identified in your scenario, what would motivate the MNO to break the law?

OberstK commented 4 years ago

@sventuerpe it does not need to be a breaking of laws. An order by authorities as enabled by GDPR legislation would be possible as well.

auschmidt commented 4 years ago

@sventuerpe Thank You for giving me the opportunity to expand on the issue. The statement – from the viewpoint of risk analysis – hidden in your rhetorical question is that the risk of the mentioned collusion attack is extremely low, or zero, because strict institutional controls are in place in the form of data protection law, which will be upheld under any circumstances and will not be overruled. In the Corona crisis, evidence to the contrary has emerged in many instances. Two examples relating to the statement above are the direct delivery of data of infected persons to the police and, more alluding to how standing law may be executively neglected, a telling statement by Austrian chancellor Kurz. Much more informed and detailed discussion can be found in Verfassungsblog, Forum Privatheit, digitalcourage, etc.

More generally, the view expressed in the statement above would not be valid in a professional security and risk assessment of information technology, because it is a blanket statement which neglects all circumstances. If this and many analogous statements could be taken at face value, we could drop work on all that security stuff and turn to more useful occupations. We also would not need fancy approaches such as security and privacy by design. I do not subscribe to that view and rather align with the [CCC open letter](https://www.ccc.de/system/uploads/300/original/Offener_Brief_Corona_App_BMG.pdf) and the CCC requirements. The – from my viewpoint – other extreme of the spectrum is the risk assessment in the [GDPR DPIA](https://www.fiff.de/dsfa-corona-file-en/at_download/file), which, coarsely speaking, considers the risk exposure of infected persons high for both the central and the de-central CWA approach.

As should have been clear, I take a more granular view. I think that the risk is residual and will hopefully not materialize in ‘normal’ circumstances. But ‘normal’ here gets us onto shaky ground. So all that @togermer in #147 and I are raising is that issue: the risk could have been mitigated to a (to me) fully acceptable residual level by a privacy-by-design approach. We both provide conceptual proof of that statement. Since the perceived risk is a strong determining factor for the acceptance of the app, this is a nuisance. My intent was to give wholehearted advice to people to install and use the app (honestly, I was checking in late just to take a look at the design while writing something up to that end).

The usefulness of the CWA has been hypothetically discussed and sharply criticized, for instance by saying that frequent false positives would certainly occur and lead to people becoming insensitive to warnings. Similar concerns had been raised against facial protection (false sense of safety). Both arguments patronize people as being rather sheep-like, and in the case of the face mask this view has been overcome. One is now pretty sure that face masks help at scale and that people are able to use them consciously. CWA may be less, equally, or more useful than the face mask at different stages of the epidemic, but of course it should be tried out. All the more so, since its action principle is totally different from that of the face mask. Therefore, simply speaking, CWA has a chance to be a factor independent of other protection measures. Since the epidemic, especially at low infection rates and R close to 1, is always near a tipping point, adding one more factor, even of relatively low efficacy, may be surprisingly effective.
In my view, CWA (future versions will doubtlessly be improved in many ways) can be very useful in this and coming epidemics. Back to the privacy issue, the honest risk assessment that I could give right now is, in plain language: install the app, use it normally, receive warnings and act on them responsibly. Your data is very well protected in this. If You are infected and face the question whether or not to submit a warning, then do it if everything is relatively calm and the authorities currently have no problem dealing with the epidemic (and will not have in the near, foreseeable future, e.g., one week ahead). But if the situation is close to panic, a lockdown imminent or already in place, and hopefully You still have the choice and are not forced to submit Your status, then think twice – You could still resort to that ultimate security measure: smartphone gratinée.

BrittaTSI commented 4 years ago

We will check this topic and consider if it makes sense to add it to our backlog for future releases. @SantiagoJay

Ein-Tim commented 2 years ago

Is there any update available on this issue, @tklingbeil / @BrittaTSI?