Issue opened by @kholtman (status: Open)
Hi @kholtman, Thank you for reaching out. I think some of your concerns about our position vs a centralised architecture are addressed in our FAQ (cf. P1, P2, and P7). Additionally, some of the examples you mention are answered in our "Security and Privacy Analysis" document. Does this answer your questions? Best,
@s-chtl Thanks, I have read these and have commented on them elsewhere in the comment sections, see references below, and these documents are not answering my reviewer questions.
To clarify why I am asking you for further clarifications of your proposal:
With respect to FAQ 1 and 2, see my comments and those of other independent reviewers in #169. I feel you need to do more to explain your reasoning here; my problems with your current explanation are stated in the comments I made on the GDPR and data minimisation in #244.
FAQ 7, as I have tried to explain above as best I can, reads to me as a definition. You are using this definition to state a preference for a design direction, but I do not understand the reasons for your preference, so I am asking you above to clarify the deeper drivers behind those reasons.
Over the last week I have seen that most of the independent peer reviewers you invited into this repository's comment sections have started to converge on a strong consensus that DP-3T is a better security protocol than PEPP-PT, even though concerns like those expressed in issues like #169 remain open. This worries me.
A partisan preference for any of these two projects, or the fact that the German government has picked one of them for a German-only app, should not be a factor in an open peer review of a proposal for a pan-EU protocol, to be used only later when cross-border travel restrictions can be relaxed as hoped.
In review discussion contributions like my PQR/XYZ comment in #208, and my description of the Dutch experience in #224 , I have been actively pushing back on the idea that an independent expert peer review of DP-3T, as a candidate EU-wide protocol proposal, can already be concluded now or should be abandoned at this point.
So I would like to continue my review, and as an independent peer, I am asking for further clarifications. As an open source project, you have the right to respond to such requests with a will-not-address decision and close the issue. But this would mean that my peer review work on your project proposal, done wearing an academic hat, would have to come to a stop.
To give a more technical comment on the level of maturity reached by the security risk and threat analysis done so far, both by the project and by its reviewers (including myself): more work is needed; we are only seeing the tip of the iceberg.
The protocol design and the reviewing process are still stuck too much in the analytical frame of mind of Schneier's Applied Cryptography. This creates a tunnel vision that does not yet bring the real issues into view.
Personal data is being extracted on a massive scale in today's world, but not because of man-in-the-middle attacks on SSL.
Smart phones are not the trusted and fully secure communication end platforms that are conveniently assumed to exist when one is analysing a secure wire protocol that wants to avoid having to trust the Man in the middle.
If Alice and Bob are supposed to be perfect strangers, what mechanism is creating the mutual trust that they are both using the exact same end-to-end protocol, with exactly the same application layer software on top? Any candidates for that mechanism? Maybe the best candidate for a trust creation mechanism here is the organ of society most associated with creating and enforcing trust: a democratically elected government? This internal contradiction has not been resolved yet.
How much help is Applied Cryptography as a framework when we need to also identify and discuss privacy risk vectors like 0-day attacks and phone rooting?
We are talking here about a set of questions that require an academic framework and mode of thought like that presented in Schneier's Liars and Outliers. A useful book offering an academic framework more suitable to analysing the role of Apple and Google in all of this is Platform Revolution by Parker et al.
All current pan-EU protocol candidate designs still have a huge amount of work ahead of them, for their teams and for independent reviewers. Trust creation is the main issue here -- we are talking about a voluntary adoption rate of 60% (as a first approximation -- more study needed), by an EU citizenry which has not forgotten EU history.
@s-chtl With respect to the analysis document you mention, the Apr 21 version of https://github.com/DP-3T/documents/blob/master/Security%20analysis/Privacy%20and%20Security%20Attacks%20on%20Digital%20Proximity%20Tracing%20Systems.pdf
I like the clear structuring of the analysis. But here is what I am missing: if I, as a reviewer taking a purely academic stance, want to reach the conclusion that I agree with you that a centralised architecture is too risky, and that a decentralised system architecture (also using your FAQ 1 and 2 choices) is the superior alternative, then I will have to assign an impact score to each risk mentioned in the document. Each impact score would be calculated as 'size of harm suffered if the attack succeeds' multiplied by 'probability that the attack succeeds'.
The project's strong statements, that a centralised approach is too risky to even consider, would then lead me to assume that some of the 'size of harm suffered if the attack succeeds' factors you are using must be extremely high, at least for risks that have a non-zero probability of occurring with PEPP-PT but zero chance of occurring with a decentralised architecture -- e.g. the huge loss of human dignity suffered by Alice in my third user harm story.
Yes, I am bending over backwards to maintain an academically objective stance here. This stance leads me to ask you, in the request above, to please identify the attack scenario and the 'size of harm suffered' value you are using -- I call this combination a centralised system user harm story.
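To make the scoring exercise concrete, here is a minimal sketch of what I mean by assigning impact scores. This is purely my own illustration: the scenario names, harm values and probabilities below are hypothetical placeholders, not numbers taken from any DP-3T or PEPP-PT document.

```python
# Minimal sketch of the impact-scoring exercise described above.
# All scenario names, harm values and probabilities are hypothetical
# placeholders; the point is the structure of the comparison, not the numbers.

from dataclasses import dataclass

@dataclass
class Risk:
    scenario: str           # short name of the attack / harm story
    harm: float             # 'size of harm suffered if the attack succeeds' (0-10 scale)
    p_centralised: float    # probability of success against a centralised design
    p_decentralised: float  # probability of success against a decentralised design

RISKS = [
    Risk("central server leaks social graph", 9.0, 0.05, 0.0),
    Risk("'improved app' re-identifies contacts (#169)", 7.0, 0.02, 0.02),
    Risk("state actor repurposes data for enforcement", 10.0, 0.01, 0.0),
]

def impact_score(harm: float, probability: float) -> float:
    """Impact score = size of harm x probability that the attack succeeds."""
    return harm * probability

for r in RISKS:
    print(f"{r.scenario:<48} "
          f"centralised: {impact_score(r.harm, r.p_centralised):5.2f}   "
          f"decentralised: {impact_score(r.harm, r.p_decentralised):5.2f}")
```

The conclusion the project is drawing only follows once both factors are made explicit for each risk; that is exactly the information I am asking for in the harm stories.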
Hi @kholtman, thank you for your clarifications and thank you for taking the time to review our protocol. It is indeed immensely valuable that people review it and raise questions.
I understand now better what you were asking for and we will be looking into it.
I added the tag "without further input" to help us triage the issue threads and keep track of things.
Thanks again for your comments and we will look into it!
@s-chtl Thanks for pinning, thanks for your kind words!
@s-chtl A further observation: when I first read the April 3 document https://github.com/DP-3T/documents/blob/master/DP3T%20-%20Data%20Protection%20and%20Security.pdf , my impression was: a good first start, aware of legal theory and of the technique of using GDPR case law, but it is applying legal tests that are far too weak for medical applications, so I will need to make some comments.
As an aid for the reader, here is the legal test defined in the April 3 project document:
To underscore the data protective nature of these measures, it is worth noting that the re-identification test set out by the CJEU in Breyer (C-582/14) as necessary to classify this as personal data would not be met. Firstly, establishing an effective side-database would likely require breaking the law by surveilling individuals without an effective lawful basis (e.g. illegitimately using covert cameras directed outward from the person, see Ryneš (C-212/13)). In Breyer, the Court noted that the test of means reasonably likely to be used to identify a natural person would not be met ‘if the identification of the data subject was prohibited by law’.
So for me, this test is too weak to be applied to medically sensitive data. (I used to work for a company that makes both medical and non-medical products, and my experience there is what calibrates my too-weak-a-test-for-medical opinion here.)
Reading this line in the April 3 document was one of the factors that triggered me to open #169 , where I introduced the 'improved app' attack model. I commented there on the need for a stronger legal test, maybe too obliquely:
Though these attacks are typically illegal under the GDPR and/or other laws, this in itself is not enough to conclude the attacks will not happen. What matters is the equation that compares the immediate benefits to the attacker(s) against their risk of getting caught and being brought to justice. This couples to the ability of EU governments to effectively enforce the GDPR in the worldwide app economy and advertising/tracking ecosystem, and this ability is unfortunately very low.
For the first two months after the initial launch of a national app, my risk estimate for attack scenarios 1 and 2 in #169 would be low: the bad actors mentioned will need lead time to develop their tools, so the probability of occurrence starts low and increases over time. Addressing #169 could therefore be postponed a few months if needed, but it does mean having the plan, and the ability, to roll out a protocol upgrade that raises the protection level against these #169 attacks, if and when they start to lead to a mass production of privacy violations that cannot be ignored.
To end on a note of cognitive dissonance again, it strikes me today, on re-reading the Breyer legal test defined in the April 3 DP-3T document, that if I look at my understanding of the PEPP-PT architecture through the lens of this Breyer legal test, the centralised server of PEPP-PT would pass the Breyer hurdle easily. So again, taking an academic stance, what am I missing? This brings me back to asking about harm stories.
From my point of view, @kholtman is addressing a very important point with this issue.
I am a hobby programmer with no experience in security protocols. I have read the whitepaper, the "Security and Privacy Analysis" and the FAQs and I came to the conclusion that DP-3T sounds like a good idea.
However, I am struggling to argue against centralized approaches like PEPP-PT in the public discussion. What's the worst that could happen if my country implements PEPP-PT?
Give me THE harm story, which convinces the broad public (not the scientific community) that implementing PEPP-PT is a really bad idea.
Thanks in advance for all your efforts, I am a big fan of this project :)
This is an answer to @bavcol
I think Apple and Google brought up this argument in their dispute with Germany, the UK and France: we (Apple and Google) cannot and will not make the decision whether a government is trustworthy. Therefore either every country is allowed to store the data centrally, or no country is allowed to. And it cannot be in your (the EU countries') interest that we supply suppressing governments with more tools (central storage) to surveil their people.
The same goes for DP-3T: there is no need to create a tool for suppressing governments.
@verschneidert Thanks for summarising your views on the Apple/Google dispute with Germany, UK and France: this did not play in the Dutch local media so I have limited information about it. For example, I'm not sure if these companies ever used the words 'centralised/decentralised', or if this was just the press describing their position.
Some thoughts on the Apple/Google stance
In the past I tried to avoid posting specifically about the views of these companies, but here goes....
When I try to dig beyond stories in the media and Twitter sentiment, what is interesting is that these two companies recently (at some point between April 22 and April 24, if I look at it via the Wayback Machine) added a clarifying FAQ to their website on the initiative: https://www.apple.com/covid19/contacttracing . There is some interesting food for thought in there. Quoting from the v1.1 version of the FAQ that is up there now:
The system [the API developed by the Apple and Google initiative] is only used for contact tracing by public health authorities apps
[Question 6] Will governments have access to the information facilitated by this technology?
[Answer 6] [...] Access to the technology will be granted only to public health authorities. Their apps must meet specific criteria around privacy, security, and data control. [...]
So these companies seem to want to make a distinction between 'public health authority use of the data' and 'government use of the data'. As public health authorities are run by governments (and by some NGOs, but I'll ignore that detail in the analysis that follows), apparently a distinction is being made between the 'public health authority part of the government' and the 'other parts of the government', which must not be able to use or access the technology or, presumably, the (social graph) data it produces, or the app will be booted from the app store. At least, this is how I am reading the message of the FAQ.
One can wonder how Apple or Google could ever check or enforce such an app store policy, and what the legal status would be of any attempt to do so, especially when applied to a country where the government has just declared a national emergency. But I will not go there. What I am interested in is the concept of 'non-health-authority part of the government'.
Does 'non-health-authority part of the government gets access to the data' make any sense as a user harm story?
As a user harm story, 'non-health-authority part of the government gets access to the data' sounds weak to me. As a test case, recently in the Netherlands, the experts from the public health authorities (the GGDs) have basically been directly in control of a lot of government functions, even though from a technical/legal point of view, they have only been giving advice to those who have actually been elected to wield power. For example, a few weeks back, local public health authorities were effectively controlling how friendly or unfriendly police enforcement of social distancing 'suggestions' would be. This was data-driven enforcement based on the local health authority estimates of the local number of infected people, local R0, etc. These health authorities were basically controlling whether the police (actually a kind of auxiliary police with limited policing powers, who patrol public spaces like parks and shopping areas) would be giving just friendly warnings to disperse, or quite punitive fines, e.g. to random groups of young people they encountered in parks. (It is generally expected in The Netherlands that there will be civil rights challenges to these fines in the courts.)
Given the above test case, I do not think that the distinction 'public health authorities' and 'the rest of government' is very helpful. More generally, I can see no way that Apple and Google (or DP-3T for that matter) can write effective usage criteria, or make effective technical designs, that would block the data exhaust from the app from being used as a 'tool of a suppressing government', while still allowing the data exhaust to be used as a 'tool of a non-suppressing government'.
This would be especially impossible if the app is integrated into the traditional WHO-promoted manual contact tracing process. Frankly, even if it is completely stand-alone and apart from that process (as implied in the latest long-read version of the project cartoon?), the mere step of getting tested by the public health authorities and then being given a passcode that can be used to trigger the next step in the app process will produce a data exhaust. There is no way such testing can be completely anonymous. And in fact I would very much hope that the public health authorities will try very hard to convince anybody who tests positive to participate in a manual contact tracing interview, to locate all the strangers they have met in the last N days who might not have been using any app.
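To make the 'data exhaust' point concrete: even in a fully decentralised design, the upload-authorisation step unavoidably produces records on the health authority side. Below is a minimal sketch of such a step, purely my own illustration with hypothetical names and fields; it is not the DP-3T specification or any existing backend.

```python
# Minimal sketch of an upload-authorisation (passcode) step, to show where a
# 'data exhaust' appears even in a decentralised design. All names and fields
# are hypothetical illustrations, not part of any published specification.

import secrets
import time


class HealthAuthorityBackend:
    def __init__(self):
        # Unavoidable bookkeeping: which codes were issued and redeemed, and when.
        # This log is the data exhaust: it ties a tested person to an upload event.
        self.issued = {}    # code -> (patient_record_id, issue_time)
        self.redeemed = {}  # code -> redemption_time

    def issue_passcode(self, patient_record_id: str) -> str:
        """Called after a positive test; the code is handed to the patient."""
        code = secrets.token_hex(4)
        self.issued[code] = (patient_record_id, time.time())
        return code

    def authorise_upload(self, code: str) -> bool:
        """Called when the patient's app asks to upload its keys."""
        if code in self.issued and code not in self.redeemed:
            self.redeemed[code] = time.time()
            return True
        return False


backend = HealthAuthorityBackend()
code = backend.issue_passcode("lab-report-123")  # hypothetical test-record identifier
print("upload authorised:", backend.authorise_upload(code))
```

Even if the `patient_record_id` were dropped, the timing correlation between issuing a code and seeing an upload already links the two events; this is the kind of exhaust that no database architecture choice can eliminate.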
Related views
A somewhat related view (in French) about data exhaust is here: https://risques-tracage.fr/
| Le traçage anonyme, dangereux oxymore ('Anonymous tracing, a dangerous oxymoron')
The following is an interesting analogy from a US source https://theintercept.com/2020/04/02/coronavirus-covid-19-surveillance-privacy/
| Data Collected for Covid-19 Should Be Walled Off, Like the U.S. Census
So the above US-context activist statement calls for a partially legal solution to the concerns raised by a tracking app, not a purely technical one. The problem with such legal solutions is that a government, acting under emergency powers while responding to a pandemic, may construct a case that it has a right and a duty to break some of its own (medical or other) privacy or confidentiality laws. To solve this problem, the only way out that I see is that government officials step up to give clear clarifying statements on national TV about what they will and will not be doing with the data, given the emergency situation. These statements should carry sufficient force that they can be taken seriously -- different countries will have different civil traditions about how such forceful statements could be made. (When such government statements are absent or too evasive, this massively complicates GDPR analysis -- the GDPR forces one to look at data exhaust too; in fact that is one of its main useful features. Related to this legal conundrum: see also my comments in #224 about how fundamental human rights were identified in the Dutch appathon as a concern that needed to be resolved.)
But I do kind of like 'leads to government suppression' as an example user harm story, and here are some others
That being said, what I am seeing now (13 May) in the public discourse, in the media across the EU, is that the level of sophistication in activist statements about the handling of the crisis has improved a lot (compared to, say, around April 20). I am seeing many more activist statements that focus on the entire lock-down and the lifting of lock-down restrictions, not just on the question of privacy, or database architecture, in a contact tracing app. Speaking with my 'systems requirements gathering project phase' technology expert hat on, this means that the end user stakeholders are getting more sophisticated in expressing what they really want. This is good news for open source projects that are developing contact tracing app infrastructure.
Sampling Dutch media, I see activist user harm concepts like 'too much suppressive government enforcement' (as mentioned above by @verschneidert ), but also 'the young are being made to pay the price for the protection of the old' (not sure what an app design could do about that), 'the parties in power are helping their own friends first' (again not sure about an app here), 'the government should start to get much more serious about moving to an outbreak management process that is more democratically transparent and accountable' (this seems like something an app design could help with), and 'we need to move away from blunt tools (like a country-wide lock-down) and towards more refined tools' (sounds like an app).
Conclusion
The real contribution, in my view, that DP-3T is making to preventing the 'government suppression' harm story, and also the above 'not democratically transparent and accountable' harm story, is that DP-3T is acting as a model of an open and transparent process.
Reading tweets and press coverage, I understand that the DP-3T project takes a strong position against centralised architectures for Bluetooth contact tracing app protocols. But I am having deep trouble understanding what the project means when it uses the word centralised. Surely this is not just an objection to a small difference in data flow?
It would help me if the project clarifies its stance by creating a user harm story. I just made up that term, so I will define it:
Definition: Centralised architecture user harm story: a story where some harm is done to centralised system user Alice, ending with the statement: the DP-3T project believes strongly that this should never happen.
Here are three harm story examples to illustrate what I am looking for.
Example 1: not the harm story I am looking for
For a very long time already, the default digital privacy activism harm story has been this one:
act 1: Alice is surfing the web
act 2: Personal data related to Alice is stored in a giant database server
opinion: We are privacy activist group Bob, and we strongly believe that this is evil and should never happen
My problem with this story is that when Bob tells this story to Carl, a non-tech-savvy member of the general voting-age public, Carl will end up thinking: so what? Data about me ends up on giant database servers every day, and nothing bad seems to have happened to Alice.
Moving on....
Example 2: inverting the DP-3T engagement cartoon to create a harm story
(Note for readers: cartoon is here: https://github.com/DP-3T/documents/tree/master/public_engagement/cartoon -- see bottom of page)
I like the DP-3T engagement cartoon much better than story 1 above because it tells not a harm but a gain story. At the end, both of the main characters have done something altruistic for the common good.
But when I invert this cartoon into a harm story, I think I am getting this:
act 1: Alice and Bob are being altruistic
act 2: Personal data related to Bob is stored in a giant central database server run by an EU government
opinion: We are the DP-3T project, and we strongly believe that this is evil and should never happen
Carl above would end up being relieved again: 'if this is the worst that can happen....' I am not sure yet what to think myself. Are you seeing a type of risk that I am still blind to?
Example 3: user harm story, loss of human rights
(I am deliberately writing this one to be technically infeasible on the Bluetooth side. What is about to happen would be completely against human rights and Dutch democratic values.)
act 1: Alice and Bob are being altruistic. They are sitting 1.4 meters apart on a park bench.
act 2: The apps in their phones register a distance of less than the government-recommended 1.5 meters!
act 3: Alice gets a phone call. It is the police, who see on their social distancing enforcement dashboard that she is only 1.4 meters away from a stranger. Doesn't she know that there are fines for reckless sitting? They will let her off with a warning this time.
opinion: The Dutch believe strongly that any use of contact tracing event data by the police should never happen.
Conclusion
Dear DP-3T project team, I am confused about what you mean when you say that centralised systems should be avoided.
Can you please clarify your concerns about centralised architectures, preferably using centralised system user harm stories?