ROBERT-proximity-tracing / documents

Protocol specification, white paper, high level documents, etc.

ROBERT is no better than DP-3T at protecting users against the "Deanonymizing Known Reported Users" attack #46

Open superboum opened 4 years ago

superboum commented 4 years ago

In many places in the ROBERT white paper, a comparison is made with an analysis of DP-3T:

Although it might seem attractive in term of privacy to adopt a fully decentralized solution, such approaches face important challenges in term of security and robustness against malicious users. [6]

and

Other, qualified as “decentralised”, schemes broadcast to each App an aggregate information containing the pseudonyms of all the infected users [1]. This information allow each App to decode the identifiers of infected users and verify if any of them are part of its contact list. Our scheme does not follow this principle because we believe that sending information about all infected users reveals too much information. In fact, it has been shown that this information can be easily used by malicious users to re-identify infected users at scale [6].

and

In our design, scores are computed on a trusted server and are used to notify users. While this offers more flexibility to adapt the scoring algorithms as needed and leads to more effective systems, it also increases the resilience of the systems against attackers that aim at identifying infected users: In order to be notified, an attacker must inject his own HELLO messages into a victim’s App in such a way that the risk scoring algorithm in the back-end selects him for notification. This makes such an attack more difficult as it requires an attacker to use invasive tools or put himself at risk, consequently reducing the scalability of such an attack. We consider especially the latter property to be a rather strong deterrent. In contrast, processing the risk of a contact on the phone upon reception of a notification inherently reduces the system to tracing, even for very brief encounters, between users and infected. This has major implications on privacy as using contextual metadata makes it trivial to identify infected users [6]. The system’s design would be based on the fact that all users which at any point ever saw an infected user’s HELLO can now use contextual metadata (such as a meeting date and time) to identify the infected users.


Reading the DP-3T analysis, it seems they always (and only) refer to the "5.2 Deanonymizing Known Reported Users" section:

If an adversary A encounters a user B, A can listen to the EphID_i broadcast then associate this EphID_i as belonging to B. If later B has its SK_t disclosed, A can deanonymize this key and learn that B was infected.

One example of this attack:

Occasional disclosure. When a user A has his app raising an alert, he may be stressed and behave randomly. He could be curious to inspect his phone to figure out why it is raising an alert. If he knows DP3T well enough, or if he finds a tool to do it for him, he would realize the alert is raised because of a series of EphID_i which were collected in the same coarse time window on a given date. A could assume that those EphID_i come from the same user and that their number indicates the duration of the encounter. It may be enough for A to remember about B and therefore deanonymize B.
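The linkage step of this attack can be sketched in a few lines. This is a toy model: the real DP-3T key schedule derives EphIDs with a PRG/PRF, replaced here by a plain hash, and all keys, names and contexts are made up for illustration.

```python
import hashlib

def derive_ephids(sk: bytes, n: int = 4) -> list:
    # Toy stand-in for DP-3T's key schedule: EphID_i = H(SK || i).
    return [hashlib.sha256(sk + i.to_bytes(4, "big")).digest()[:16]
            for i in range(n)]

# A passively logs every EphID it hears, tagged with context A remembers.
sk_b = b"secret-day-key-of-B"
observation_log = {eph: {"person": "B", "slot": i, "place": "office"}
                   for i, eph in enumerate(derive_ephids(sk_b))}

# Later, B is diagnosed and SK_B is published to every app.
published_keys = [sk_b]

# A re-derives the EphIDs from the published key and intersects with the log.
for sk in published_keys:
    for eph in derive_ephids(sk):
        if eph in observation_log:
            ctx = observation_log[eph]
            print(f"{ctx['person']} seen at {ctx['place']} (slot {ctx['slot']}) is infected")
```

The intersection is the whole attack: anything the attacker remembered alongside the EphID (time, place, face) now attaches to the published key.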


The same attack can easily be rewritten to fit the specificities of the ROBERT architecture:

Occasional disclosure. User A can modify her application to register a new identity with the back-end server each time she meets a new user and log when the identity was created/used. Now, the user has as many identities as persons seen and can independently probe the back-end server to know whether one of her profiles has been in contact with an infected person. As each profile is mapped to only one user and is associated with a timestamp, it may be enough for A to remember about B and therefore deanonymize B.

The same rationale can be applied for all the other attacks in this section (Paparazzi attack, Nerd attack and Militia attack).
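A minimal simulation of this multi-account attack, assuming a toy model of the ROBERT back-end (the class, method names and pseudonyms below are illustrative, not from the spec):

```python
import itertools

class RobertServerModel:
    """Toy model of the ROBERT back-end: one opaque account per app,
    the pseudonyms heard by that account, and a yes/no exposure answer."""
    def __init__(self):
        self.contacts = {}        # account_id -> set of pseudonyms heard
        self.infected = set()     # pseudonyms reported by diagnosed users
        self._ids = itertools.count()

    def register(self):
        aid = next(self._ids)
        self.contacts[aid] = set()
        return aid

    def upload(self, aid, pseudonym):
        self.contacts[aid].add(pseudonym)

    def exposure_status(self, aid):
        # Real scoring is much richer; one infected contact suffices here.
        return bool(self.contacts[aid] & self.infected)

server = RobertServerModel()

# Attack: one fresh (Sybil) account per person met, plus a local diary.
diary = {}
for person in ["B", "C", "D"]:
    aid = server.register()
    server.upload(aid, f"pseudonym-of-{person}")
    diary[aid] = person          # when/where/who was met

server.infected.add("pseudonym-of-C")    # C is later diagnosed

deanonymized = [diary[a] for a in diary if server.exposure_status(a)]
print(deanonymized)   # the at-risk bit of a single-contact account names C
```

Because each account saw exactly one person, the binary at-risk answer is exactly as identifying as DP-3T's published key list.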


Limiting profile registration (i.e., preventing Sybil attacks) would be needed to prevent an attacker from deanonymizing users. However, the mechanism presented is too weak to protect against even basic Sybil attacks:

A proof-of-work (PoW) system (like a CAPTCHA) is implemented in order to avoid automatic registrations by bots or denial of service attacks (the details of this PoW system are out of scope of this document).

Indeed, it is cheap and fast to hire micro-workers via platforms like Amazon Mechanical Turk to solve CAPTCHAs. But even more simply, it does not take long to solve ~10 CAPTCHAs per day (just try to browse Google websites behind Tor to be convinced). Furthermore, many CAPTCHAs today work by collecting lots of data on the user's behaviour and relying on the fact that the user is logged in to the CAPTCHA provider's services.
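A back-of-the-envelope estimate of what a CAPTCHA-only registration gate costs an attacker; both figures are assumptions for illustration, not measured prices:

```python
# Assumed order of magnitude for commercial CAPTCHA-solving, in USD.
price_per_captcha = 0.002
# One Sybil account per person the attacker wants to track.
accounts_needed = 10_000

total_cost = accounts_needed * price_per_captcha
print(f"~${total_cost:.2f} to open {accounts_needed} accounts")
```

Even if the assumed price is off by an order of magnitude, tracking thousands of people stays within a hobbyist budget, which is why the CAPTCHA cannot be the Sybil defense.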

I claim that the only option would be to require users to connect to the service via the FranceConnect portal. It would then totally de-anonymize users from the authorities' point of view, and break the threat model defined by ROBERT:

Anonymity of users from a central authority. The central authority should not be able to learn in- formation about the identities or locations of the participating users, whether diagnosed as COVID-positive or not.

To conclude, I don't understand why ROBERT would protect users' privacy better than DP-3T, as the same attacks (with slight variants) can be applied to both protocols.

vincent-grenoble commented 4 years ago

Hello,

This is a major question that deserves more than a small comment. We are working on a more detailed privacy risk analysis document... More to come when we're done (it's a matter of days, although I can't tell you how many because it also depends on non-maskable interrupts ;-)).

In any case, pay attention to the adversary model, since most of the conclusions we can draw depend on this model: e.g., are you ready to trust your neighbours more than the central authority, or the opposite? The two approaches rely on opposed adversary models. There is no single answer, as it depends on local considerations (my answer as a French citizen will not be the same as that of a person living in a non-democratic country). As far as I am concerned, I have no trust in users, and I do not think that any solution (and DP3T has this issue) that would enable any user to know the infection status of another person is reasonable.

I've been told that the EDPB just produced guidelines about those systems, I definitely need to have a look at them: https://edpb.europa.eu/our-work-tools/our-documents/guidelines/guidelines-042020-use-location-data-and-contact-tracing_fr

Cheers, Vincent

superboum commented 4 years ago

First, thanks for your answer and your consideration :-)

As far as I am concerned, I have no trust in users, and I do not think that any solution (and DP3T has this issue) that would enable any user to know the infection status of another person is reasonable.

I don't understand your point. If you only learn the pseudonyms of infected users via the public list of infected users, you learn nothing about their real identities.

However, if you have crossed paths with one of the infected users, the same deanonymization attack can be done on ROBERT, as I explained here:

User A can modify her application to register a new identity with the ROBERT back-end server each time she meets a new user and log when the identity was created/used. Now, the user has as many identities as persons seen and can independently probe the back-end server to know whether one of her profiles has been in contact with an infected person. As each profile is mapped to only one user and is associated with a timestamp, it may be enough for A to remember about B and therefore deanonymize B.

Could you be more precise on why you think the attack I mentioned on ROBERT is fundamentally different from the one on DP-3T?

superboum commented 4 years ago

This issue is about whether the authors' proposed solution, as written in the document, is better than DP-3T in terms of privacy for end users. There is no ideology in this: I am just pointing out inconsistencies between parts of the very same document. I claim that, considering ROBERT's current adversary model, ROBERT is no better for end users than DP-3T, as the same class of attacks can be conducted against end users.

Moreover, I claim that ROBERT can't become better for end users than DP-3T without changing ROBERT's current adversary model (in other words, without modifying the ROBERT spec and addressing this issue) by adding blind trust in the authorities (i.e., it is impossible to provide anonymity from other end users and anonymity from the state at the same time). That would heavily contradict the official communication and would require special care and transparency in future communications.

To put it in a nutshell: ROBERT is vulnerable to end-user attacks just like DP-3T. Fixing this issue may be possible but would lead to important changes in ROBERT's advertised attack model.


While you seem OK with changing ROBERT's attack model, especially by requiring blind trust in the state, I would also like to add that France has a poor record in terms of illegal surveillance. Some examples:

claustres commented 4 years ago

I am not a security expert and I might be wrong, but in my humble opinion the debate about whether it is more reliable to trust your neighbors than the central authority, or the other way around, is biased. The questions to ask should be: who is more likely to have the power to massively hack the system, and which one would result in a massive leak? It appears to me that 99.99% of the time my neighbors don't have the technical skill to do so, while the central authority has it. Moreover, hacking in a decentralized way could leak a couple of users, while hacking a central server could leak thousands of users.

Please tell me if I am wrong, as I would not like to add confusion to the debate, being a "newbie".

Julien-Blanc-tgcm commented 4 years ago

It should be noted that in the case of ROBERT, much of the information that could be learnt by the central authority may already be known by other means.

For example, a not-so-honest central authority can deanonymize users. But it could also already have this information from other means (for example, from the medical records of its citizens, if it is state-operated). This point must be taken into consideration during the analysis.

This is also related to #38: the threat analysis is currently lacking several scenarios (it is, for example, currently assuming that the central authority won't get compromised, but that assumption does not hold in the real world).

As for the “one contact“ attack, I believe there is just no general solution (it is a valid use case of the application). The only way forward is to mitigate it by making the cost of account creation higher. CAPTCHAs in this regard may not be enough (and they may pose their own privacy problems).

FrankGrimm commented 4 years ago

Funny argument, none of my neighbors have a wild history of inadvertent or malicious abuse of data under their control. More importantly, while the elderly lady one apartment over might be somewhat curious, I'm fairly certain she does not attempt to build a database of my social graph or that of random people in cities a few hundred miles away.

Greetings from another democratic country though.

superboum commented 4 years ago

Just to mention that DP-3T designers share my analysis on the "Deanonymizing Known Reported Users" attack:

[screenshot: excerpt from the DP-3T designers' response]

Source: https://github.com/DP-3T/documents/blob/master/Security%20analysis/Response%20to%20'Analysis%20of%20DP3T'.pdf

JeromeLacan commented 4 years ago

I don't see what is funny in Vincent's argument. As soon as "decentralized" applications are deployed, I'm sure that some people will propose to share the locations of the "infected" EphIDs. Simple data analysis should make it possible to connect the EphIDs of a given person and to rebuild his/her travels. Some websites will be able to say whether there was an infected person yesterday at 10 a.m. in your local marketplace. Of course, the objective will be "just to protect" the people :(

superboum commented 4 years ago

I don't see what is funny in Vincent's argument. As soon as "decentralized" applications are deployed, I'm sure that some people will propose to share the locations of the "infected" EphIDs. Simple data analysis should make it possible to connect the EphIDs of a given person and to rebuild his/her travels. Some websites will be able to say whether there was an infected person yesterday at 10 a.m. in your local marketplace. Of course, the objective will be "just to protect" the people :(

Please read the issue carefully, its point is to highlight the fact that the attack you describe can be conducted with ROBERT by creating fake identities and collecting one EphID per identity... To rephrase it, centralized apps do not protect you more than decentralized apps against the attack you describe.

Moreover, don't pose a false dilemma: we don't have to choose between ROBERT or DP-3T. Other systems or no system at all are also possible choices.

JeromeLacan commented 4 years ago

I don't want to choose between ROBERT and DP-3T, but I just wanted to highlight that "militia"-like attacks (such as the one described in the "5.2 Deanonymizing Known Reported Users" section) are easy with decentralized approaches. Of course, the server can do the same with centralized solutions, but you can't compare these two deanonymizations. Personally, in the current situation, I think that an uncontrolled deanonymisation in your neighborhood is much more serious than a deanonymisation by the central server (which can get the same information by other means...).

veale commented 4 years ago

Of course, the server can do the same with centralized solutions, but you can't compare these two deanonymizations. Personally, in the current situation, I think that an uncontrolled deanonymisation in your neighborhood is much more serious than a deanonymisation by the central server (which can get the same information by other means...).

A quick intervention for clarification: the DP-3T analysis of ROBERT is not stating that it is the server carrying out the attack, swapping the adversary. We show in our analysis of ROBERT (section 'Learning Infected Close Contacts') that in ROBERT, other users can learn the infection status of each other. The adversary in that section of the paper remains other users. We separately, in the same document, consider deanonymisation attacks by the server.

In our guidebook of attacks, we explain why this is a risk you have to accept with all digital contact tracing systems, including ROBERT and DP-3T alike (see Inherent Risk 1, p5). It would be misleading to characterise identification of the infection status of other users by other users (not by the server) as a problem that only decentralised systems such as DP-3T face.

vaudenay commented 4 years ago

@veale I think the document you mentioned is misleading. Its conclusion is severely biased (for this attack and some others).

superboum commented 4 years ago

@veale I think the document you mentioned is misleading. Its conclusion is severely biased (for this attack and some others).

Do you have some references that you could point to so I can better understand your assertion?

vaudenay commented 4 years ago

In DP-3T, the attack can be done on a wide scale and is undetectable. You identify N targets for free.

In ROBERT, thanks to accounting and authentication (the good part of centralization), the attack does not scale and could be mitigated by auditing. To identify N targets, the adversary needs N dummy accounts. (That is why accounts are destroyed when the at-risk flag is raised.)

beng-git commented 4 years ago

My understanding is that in ROBERT the accounts are not destroyed but "frozen" and can regain their ability to query once the individuals have been tested and proven negative, so technically you could identify several individuals with a single account, especially if, e.g., raising the at-risk flag gives you priority for a test.

aboutet commented 4 years ago

@beng-git Section 2.2 - exposure status request: " If this score is larger than a given threshold, the bit “1” (“at risk of exposure”) is sent back to the App and her account is deactivated, otherwise the bit “0” is sent back."

"Deactivated" means the user has to create another account.

beng-git commented 4 years ago

OK, thanks for the clarification, but section 7 / server processing / 3 says: "It means that App AppA cannot perform any new request until user UA is tested and her status updated." This leads one to think that the account can be reactivated if she is tested negative (rather than recreated). Am I missing something? Although I suppose this doesn't change much anyway.

superboum commented 4 years ago

In DP-3T, the attack can be done on a wide scale and is undetectable. You identify N targets for free.

In ROBERT, thanks to accounting and authentication (the good part of centralization), the attack does not scale and could be mitigated by auditing. To identify N targets, the adversary needs N dummy accounts. (That is why accounts are destroyed when the at-risk flag is raised.)

I have not seen this part described in the reference document. As I described in my first post, the only proposed "accounting and authentication" mechanism in ROBERT I have identified is a CAPTCHA. Did I miss other ones?

Can an authority perform accounting and authentication while preserving anonymity? How would it change the security models you proposed? Would the following assertion still be true?

Such an application is not a surveillance application: it is totally anonymous. To be even clearer: its design ensures that NOBODY, not even the State, has access to the list of people diagnosed positive or to the list of social interactions between people.

source: https://www.inria.fr/fr/contact-tracing-bruno-sportisse-pdg-dinria-donne-quelques-elements-pour-mieux-comprendre-les-enjeux

Finally, I fear that we are discussing an alternative and hypothetical version of ROBERT that does not exist yet when we speak of "accounting and authentication by the central server". Indeed, @vaudenay said that such a mechanism would prevent multiple account creation, but we saw that the ones presented are useless.

Can we acknowledge that, for now, ROBERT has no (effective) accounting and authentication mechanisms, and that introducing them would probably require relaxing the considered attack model?

vaudenay commented 4 years ago

I think we are here to contribute to ROBERT, no? We have just seen that there are already measures to protect against this attack (deactivating accounts after an alert and a captcha at creation). We could try to make suggestions to add more?

superboum commented 4 years ago

I think we are here to contribute to ROBERT, no?

Personally, I am here to assess the risks of ROBERT and to be sure that its claims can be trusted. I think that our brilliant old democracies deserve an enlightened and informed debate; in particular, our fellow citizens must be extensively informed of the liberties they will concede, to preserve confidence in our institutions and especially in our researchers.

We have just seen that there are already measures to protect against this attack (deactivating accounts after an alert and a captcha at creation).

These measures are just noise in our discussion. The CAPTCHA can be easily bypassed and would leak data to a third party (or would be really useless, as bots are better at solving basic CAPTCHAs than humans). I don't understand why deactivating accounts would protect against the "Deanonymizing Known Reported Users" attack.

To move forward, we must agree that ROBERT lacks a proper accounting and authentication mechanism to be effectively better than DP-3T against "Deanonymizing Known Reported Users" (I recall that this attack can be conducted by any user, so it's not a question of trusting the authorities or not).

So, from now on I consider this point acknowledged. Now, I would like to mention that I proposed a solution in my first post:

I claim that the only option would be to require users to connect to the service via the FranceConnect portal.

This is the only option because an accounting and authentication mechanism is just another "Sybil attack" problem, which is widely studied in the scientific literature. It is the main problem encountered by blockchain networks. I would like to emphasize that control based on the network addressing namespace (i.e., IP addresses) or on some phone identifier would not work, as these do not map to a user (some users own thousands of them, while others do not own even one).

Based on the survey Sybil Defense Techniques in Online Social Networks: A Survey (2017), and on the fact that, to be effective in ROBERT, the mechanism must prevent users from creating as few as 10 identities (so probabilistic solutions would not work), I claim that the only option in the following tree, extracted from the survey, is "manual verification":

[screenshot: taxonomy of Sybil defense techniques, extracted from the survey]

Finding a solution that does not require "manual verification" is a research subject on its own and should probably be presented in its own article, as it would be a contribution in its own right.

vaudenay commented 4 years ago

Thanks for the comprehensive survey of possible measures against sybil attacks. I agree the captcha is not sufficient. But it is already something. A token from the health authority could be another way. (Like the token to report after being diagnosed.)

I agree with the need to clarify security claims to the public. I can see that none of the proposed measures would defeat deanonymization attacks on DP-3T. The DP-3T analysis quickly says deanonymization is inherent to contact tracing, so DP-3T should not be blamed for having such an attack. IMHO this is misleading.

JeromeLacan commented 4 years ago

Just to be sure to evaluate the multi-account attack on ROBERT correctly: if one of the accounts is evaluated "at risk of exposure" by the server, the attacker can correctly deanonymize the infected ID only if it is the ONLY ID collected in the time slot associated with this account. Is that correct?

superboum commented 4 years ago

Providing users with a token would enable a health authority to deanonymize users' interactions with the central authority, given that both entities are the same or could (be forced to) collaborate.

It leads me to the following conjecture: it is impossible to enforce at the same time exposure anonymity and authority anonymity.

where:

where byzantine refers to the fact that actors can act dishonestly. Byzantine fault on Wikipedia.

Edit: While ROBERT's attack model refers to a subset of byzantine faults ("honest-but-curious", which is also criticized here: #38), most of the attacks mentioned can still be conducted under this model. So we could rephrase: authority anonymity refers to the impossibility for an honest-but-curious set of authorities to deanonymize users.

beng-git commented 4 years ago

A hypothesis of the ROBERT attack model is that the health authorities do not collude with the central authority. However, both may be (honest-but-)curious. Health and central authorities do not exhibit byzantine behaviour.

In my opinion, a byzantine hypothesis on the authorities is not realistic, since being detected or proven to be at fault would be an issue for authorities in the real world. However, I think that the covert adversary model should be used for the authorities.

superboum commented 4 years ago

@beng-git You published your message before I added the part about "honest-but-curious" to my previous message. I added that point because the ROBERT document mentions an "honest-but-curious" attacker. Considering this attack model and the token/authentication hypothesis, most of the mentioned attacks can still be conducted.

Could you point to some references about the covert adversary model please?

beng-git commented 4 years ago

Yonatan Aumann, Yehuda Lindell: Security Against Covert Adversaries: Efficient Protocols for Realistic Adversaries. TCC 2007: 137-156

I'd also add that my understanding is that all authorities are honest-but-curious and do not collude. See also #13

veale commented 4 years ago

Just to be sure to evaluate the multi-account attack on ROBERT correctly: if one of the accounts is evaluated "at risk of exposure" by the server, the attacker can correctly deanonymize the infected ID only if it is the ONLY ID collected in the time slot associated with this account. Is that correct?

Or if the attacker knows all other IDs to be from devices that did not report as infected (e.g. other devices they control).

(Regarding other points, if people are interested in the details of DP-3T's mitigations against this attack, which have not been covered in this thread, I recommend they look at the secret-sharing section in the White Paper, which effectively prevents the re-identification issue from functioning at scale or from a distance (e.g. antennae and drive-by collection are rendered impossible or impracticable), and at the Cuckoo filter extension.)
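The refinement above (attacker-controlled decoys) can be sketched as follows; all pseudonyms and the at-risk answer are made up for illustration:

```python
# One account heard several pseudonyms in the slot, but the attacker
# controls every device behind them except the target's.
heard_in_slot = {"pseudonym-of-B", "decoy-1", "decoy-2"}
attacker_owned = {"decoy-1", "decoy-2"}
account_at_risk = True    # binary answer from the back-end

# The attacker knows none of her own devices reported as infected, so the
# at-risk bit can only have been caused by the one remaining pseudonym.
candidates = heard_in_slot - attacker_owned
if account_at_risk and len(candidates) == 1:
    inferred = candidates.pop()
    print(f"infected: {inferred}")
```

So the "single ID per slot" condition is sufficient but not necessary: decoys let the attacker tolerate extra IDs as long as their reporting status is known.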

superboum commented 4 years ago

If we suppose an authentication/token provider, we have, in addition to the already existing Central Authority (CA), a new authority: the Authentication Authority (AA). Now let's consider 4 attack models: Trusted, Observer (similar to "honest-but-curious"), Covert (as proposed by @beng-git) and Byzantine, against 3 attacks from the point of view of the AA:

✅ : no attack identified ❓: an attack may be possible ❌ : an attack exists

| ((1)) | CA does collude | CA does not collude |
| --- | --- | --- |
| Trusted | | |
| Observer | | |
| Covert | | ❓ *1 |
| Byzantine | | ❓ *1 |

*1: the AA can impersonate the targeted individual while making the request to the CA. It might be possible to circumvent such attacks by adding public-key cryptography (the AA would then sign users' public keys, but users would authenticate to the CA with their private keys).

| ((2)) | CA does collude | CA does not collude |
| --- | --- | --- |
| Trusted | | |
| Observer | | |
| Covert | | ❓ *2 |
| Byzantine | | ❓ *2 |

*2: the AA can brute-force requests on the CA while impersonating users. When the infected user reports the people who were exposed, both identities will report as exposed at the same time. As this is only noticeable by the CA and not by civil society, depending on your definition of "covert", it can be considered a valid covert attack. It might be possible to circumvent such attacks by adding public-key cryptography.

| ((3)) | CA does collude | CA does not collude |
| --- | --- | --- |
| Trusted | | |
| Observer | | |
| Covert | | ❌ *3 |
| Byzantine | | ❌ *3 |

*3: the AA can inform the CA that its target identity has been exposed and trigger an exposure notification on the target's phone. Neither the CA nor civil society has any way to detect or prove the attack. It could be used as a way to assign people to house arrest (it has been done recently in France).

IMHO: patching ROBERT is not that easy, and even in the presence of non-colluding authorities, powerful attacks will be possible under a covert attack model.
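The public-key circumvention suggested in notes *1 and *2 could look like the sketch below. This is textbook RSA with tiny parameters: insecure and purely illustrative of the flow in which the AA certifies a key pair it never controls, so it cannot later answer the CA's challenge on the user's behalf.

```python
import hashlib

def make_key(p, q, e=65537):
    # Textbook RSA key from two primes -- a TOY, never use in practice.
    n, phi = p * q, (p - 1) * (q - 1)
    return e, pow(e, -1, phi), n        # (public exp, private exp, modulus)

def digest(msg: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

def sign(d, n, msg: bytes) -> int:
    return pow(digest(msg, n), d, n)

def verify(e, n, msg: bytes, sig: int) -> bool:
    return pow(sig, e, n) == digest(msg, n)

aa_e, aa_d, aa_n = make_key(1000003, 1000033)        # Authentication Authority
user_e, user_d, user_n = make_key(1000037, 1000039)  # one app instance

# Registration: the AA certifies the user's public key, once.
user_pub = f"{user_e}:{user_n}".encode()
certificate = sign(aa_d, aa_n, user_pub)

# Later, the CA authenticates the user with a fresh challenge; the AA never
# saw user_d, so it cannot produce this response itself.
challenge = b"nonce-42"
response = sign(user_d, user_n, challenge)

assert verify(aa_e, aa_n, user_pub, certificate)      # account approved by AA
assert verify(user_e, user_n, challenge, response)    # real key holder present
```

Note this only addresses impersonation by the AA (*1, *2); the false-notification attack (*3) goes through the CA's scoring path and is untouched by it.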

aboutet commented 4 years ago

If you met only one person and then receive an at-risk notification, you can infer that this user is declared infected. This risk is common to both approaches (centralized and decentralized).

Collusion between users, intersecting their contact lists, can also lead to the same result.

However, there is an important difference omitted in the DP-3T analysis, as @vaudenay said.

In other words, in DP-3T, publishing information about all infected users in the system allows any user to infer which of his contacts have been infected and to monitor areas (e.g., count the number of infected users in a building). The impact can be important, with stigmatization and harassment of infected users. This attack only applies if you assume that some of the users cannot be trusted.

In ROBERT, this risk is mitigated by the fact that the adversary has to create an account for each contact whose infection status he wants to infer. Mechanisms would need to be implemented to limit the scalability of this attack.

The fundamental difference between both approaches seems to be the considered attacker model.

veale commented 4 years ago

However, there is an important difference omitted in the DP-3T analysis, as @vaudenay said.

It is not omitted; it is on pages 14-15 of this document, alongside its mitigations.

In other words, in DP-3T, publishing information about all infected users in the system allows any user to infer which of his contacts have been infected and to monitor areas (e.g., count the number of infected users in a building). The impact can be important, with stigmatization and harassment of infected users. This attack only applies if you assume that some of the users cannot be trusted.

Monitoring areas in DP-3T with secret sharing is only possible if you observe a user for an extended period of time, such as 10 minutes; it is not possible from a distance, or at scale in a building, without a large network of sensors, due to the degradation of BLE below what is needed to reconstruct the identifier from secret sharing. Without this, you do not collect the identifier.

JeromeLacan commented 4 years ago

Secret sharing will significantly reduce the efficiency of the system if honest users need to receive the signal of an infected user for 10 minutes to collect enough shares to rebuild their EphID.
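For intuition about the mitigation being discussed: the DP-3T white paper proposes a k-out-of-n scheme, but a simpler n-of-n XOR split (parameters below are illustrative) already shows the core property, namely that a receiver who misses even one epoch's share learns nothing about the EphID.

```python
import secrets
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def split(secret: bytes, n: int) -> list:
    # n-of-n XOR sharing: one share broadcast per beacon epoch; any n-1
    # shares are statistically independent of the secret.
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    shares.append(reduce(xor, shares, secret))
    return shares

def combine(shares: list) -> bytes:
    return reduce(xor, shares)

ephid = b"ephid-of-an-infected-user"
shares = split(ephid, n=6)      # e.g. one share every ~100 s over ~10 min

assert combine(shares) == ephid        # a full-length contact recovers the EphID
assert combine(shares[:-1]) != ephid   # a drive-by missing one share gets noise
```

This is exactly the efficiency trade-off raised above: the same threshold that defeats drive-by collection also means genuinely brief honest contacts are never recorded.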

ainar commented 4 years ago

In other words, in DP-3T, publishing information about all infected users in the system allows any user to infer which of his contacts have been infected and to monitor areas (e.g., count the number of infected users in a building). The impact can be important, with stigmatization and harassment of infected users. This attack only applies if you assume that some of the users cannot be trusted.

In ROBERT, this risk is mitigated by the fact that the adversary has to create an account for each contact whose infection status he wants to infer. Mechanisms would need to be implemented to limit the scalability of this attack.

As an end user, the differences between an app built on a decentralized protocol and one built on a centralized protocol are invisible, aren't they? I mean, most users would not be able to differentiate an app built on DP-3T from an app built on ROBERT.

For both protocols, the main feature is the same, and both give the same result to the end user, i.e., being notified of a risk of having COVID-19, with this risk computed according to proximity with infected people.

I assume that most people will be aware of this main feature only. And there is no need for a complex attack to suspect someone else after receiving a notification. Sure, if you meet a lot of people, it could be difficult to know who could have infected you / who is newly infected. But actually, as the protocol is based on distance and rate, this greatly reduces the set of "suspicious" people; such filtering is easy to do.

This may be a more general discussion about any contact-tracing system, but with this main feature, I do not think that any change to the protocol would prevent "the stigmatization and harassment of infected users". Yes, it would have many false positives, but still, many people will want someone to blame.

pierreN commented 4 years ago

my answer as French citizen will not be the same as that of a person living in a non democratic country

It seems the discussion can be summed up by this sentence. IMHO ROBERT is way too optimistic about the current state of our democracies (and even assuming a state is "honest", just look at states' track records of securely administering a single server; to convince yourself, have fun looking at the websites that share the same certificate as interieur.gouv.fr).

I'm just a concerned citizen and in no way an expert, but in this day and age it is beyond me how one can advocate a centralized solution over a decentralized one. Especially after reading https://eprint.iacr.org/2020/399.pdf, all the arguments against DP3T seem to still hold against ROBERT.