ROBERT-proximity-tracing / documents

Protocol specification, white paper, high level documents, etc.

Clarify "Honest-but-curious" terminology #13

Open defeo opened 4 years ago

defeo commented 4 years ago

Section 1.4 states that

The authority running the system, in turn, is ”honest-but-curious”. Specifically, it will not deploy spying devices or will not modify the protocols and the messages.

Then, later:

The server does not learn the identifiers of the infected user’s App but only the EBIDs contained in its LocalProximityList (list of Ephemeral Bluetooth Identifiers she was in proximity with).

and

Given any two random identifiers of IDTable that are flagged as “exposed”, the server Srv can not tell whether they appeared in the same or in different LocalProximityList lists (the proximity links between identifiers are not kept and, therefore, no proximity graph can be built).

However, section 6.1 acknowledges that the opposite is true:

A LocalProximityList contains the EBIDs of the devices that the infected user has encountered in the last CT days. This information together with the timing information associated with each HELLO message could be used to build the de-identified social/proximity graph of the infected user. The aggregation of many such social/proximity graphs may lead, under some conditions, to the de-anonymization of its nodes, which results in the social graphs of the users. Would that be a concern, it is necessary to ”break” the link between any two EBIDs contained in the LocalProximityList to prevent the server from getting get this information.

and then goes on to list several possible ways to prevent the honest-but-curious server from linking EBIDs and constructing the proximity graph, including:

  1. mixnets,
  2. crossing fingers and hoping that NAT translation will unlink the data,
  3. using trusted servers,
  4. trusted hardware.

Solution 1 is vague and thus cannot be analyzed. Solutions 3 and 4 are based on trust (in the network operator, the hospital server, or the hardware), and therefore do not fit an honest-but-curious model. Furthermore, since the upload authorization procedure is not specified, it is impossible to tell whether the server can link EBIDs.
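To make the concern from section 6.1 concrete, here is a minimal sketch (assuming, as the document describes, that the server can see which EBIDs arrive together in one LocalProximityList; the EBID values and the `add_upload` helper are made up for illustration) of how an honest-but-curious server could accumulate a de-identified proximity graph from uploads:

```python
# Sketch of the linkage concern: each upload whose EBIDs are visible
# together yields a clique of co-occurrence edges among pseudonyms.
from itertools import combinations
from collections import defaultdict

def add_upload(graph, local_proximity_list):
    """Record a co-occurrence edge between every pair of EBIDs in one upload."""
    for a, b in combinations(sorted(set(local_proximity_list)), 2):
        graph[a].add(b)
        graph[b].add(a)

proximity_graph = defaultdict(set)

# Two uploads from two (unknown) infected users, with fabricated EBIDs:
add_upload(proximity_graph, ["ebid_17", "ebid_42", "ebid_99"])
add_upload(proximity_graph, ["ebid_42", "ebid_07"])

# ebid_42 is now linked to three other pseudonyms, even though the server
# never learned who uploaded either list.
print(sorted(proximity_graph["ebid_42"]))  # ['ebid_07', 'ebid_17', 'ebid_99']
```

Aggregating many such per-upload cliques is exactly the "social/proximity graph" construction that section 6.1 warns about; the mitigations listed above all aim to prevent the server from seeing the per-upload grouping in the first place.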

I strongly suggest you replace the "honest-but-curious" bit with "trusted" as long as these components are not fully specified.

Gu1nness commented 4 years ago

In my opinion, the definition of «honest-but-curious» is given directly in the document, and it is terrible:

[It] might use collected information for other purposes such as to re-identify users or to infer their contact graphs. We assume the back-end system is secure, and regularly audited and controlled by external trusted and neutral authorities (such as Data Protection Authorities and National Cybersecurity Agencies).

Clearly, this is a breach of privacy, and IMHO not compliant with the GDPR for the moment. But I will check with some lawyers I know who are specialists in this area.

defeo commented 4 years ago

I understand that sentence as "Its goal is to use collected information for...", and the goal of a proof in the honest-but-curious model would be to prove that those goals are unachievable. That is, at least, the standard cryptographic meaning of the term.

Admittedly, the documentation is unclear at best and self-contradictory at worst.

Gu1nness commented 4 years ago

Sure, I completely agree.

marcespie commented 4 years ago

The main problem with this is that this assumption is unnecessary

Google's protocols, for instance, do not assume an honest server.

There's also ample historical evidence that this is a bad idea. Breaches will happen.

Finally, and more importantly: this makes the server a prime target for attacks, because it actually holds non-anonymous data!

Even if we can trust the governmental instance, that problem alone is enough to discredit that assumption completely.

PRIVATICS-Inria commented 4 years ago

I strongly suggest you replace the "honest-but-curious" bit with "trusted" as long as these components are not fully specified.

Thanks @defeo for the suggestions, we will take them into account in a future version of our document.

superboum commented 4 years ago

I strongly suggest you replace the "honest-but-curious" bit with "trusted" as long as these components are not fully specified.

Thanks @defeo for the suggestions, we will take them into account in a future version of our document.

Will Inria, at the same time, publish an erratum about this misleading assertion, which is still publicly available on their website?

Such an application is not a surveillance application: it is totally anonymous. To be even clearer: its design ensures that NOBODY, not even the State, has access to the list of people diagnosed positive or to the list of social interactions between people. The only information notified to me is that, in the preceding days, my smartphone was in proximity to the smartphone of at least one person who has since tested positive and declared themselves in the application.

Because anonymity against the state will be enforced by trust and not design.

marcespie commented 4 years ago

Because anonymity against the state will be enforced by trust and not design.

This assumes fuck-ups won't happen. All of us who have designed actual software know how well this kind of plan fares in the face of adversity.

There should be some mitigations against errors. Specifically, even if the trust gets violated (maybe because of implementation errors and targeted attacks), it shouldn't be possible for a bad guy to get away with the full set of contact data.

Even if you buy into the "let's trust the State" dogma, I don't think a centralized approach can be a good compromise.

PRIVATICS-Inria commented 4 years ago

Thank you all for your feedback. Even if parts of the definition of “honest-but-curious” (HBC) could be improved, we still believe it is broadly correct. No, we do not trust the authority beyond what is stated in the first part of the second bullet of section 1.4. Yes, this HBC adversary model considers that the authority may want to do more than just process messages as described in the specifications. This is why we do our best to design a protocol that protects users’ privacy. This is our responsibility.

Regarding the relevance of the HBC model (see Issue #2): this is a key assumption for the ROBERT v1.0 design as you noticed. It is not our responsibility, as privacy researchers, to judge whether or not this assumption is valid.

This topic could be discussed for hours, clearly. However, when looking at the “avis CNIL sur le projet d’application mobile StopCovid” (the CNIL’s opinion on the StopCovid mobile application project), we have the feeling this is a reasonable assumption.

Gu1nness commented 4 years ago

Excuse me, but the CNIL clearly states: “Moreover, the CNIL acknowledges that it respects the concept of data protection by design, since the application uses pseudonyms and will not allow lists of infected people to be compiled.” This means the app should not allow people to be de-anonymized, which is, for now, directly contrary to the HBC model defined in v1.0 of ROBERT, imho.

marcespie commented 4 years ago

Regarding the relevance of the HBC model (see Issue #2): this is a key assumption for the ROBERT v1.0 design as you noticed. It is not our responsibility, as privacy researchers, to judge whether or not this assumption is valid.

Basically, this all but says that it is a political decision that you have no control over and that you have to work with.

I understand how you might be obligated not to say so explicitly.