ROBERT-proximity-tracing / documents

Protocol specification, white paper, high level documents, etc.

Non-traceability of nuisance alarms #31

Open huitseeker opened 4 years ago

huitseeker commented 4 years ago

In #7 we established that, under some conditions, a malicious user can provoke a "false alarm": a user is notified of an at-risk status without having had any genuine contact with a diagnosed patient.

The attack (adjusted to make timing valid and remove the need for bribes) seems to work as follows: the attacker places radio receivers in a target-rich environment (e.g. a grocery store) and captures HELLO messages from targets. Using a means of rapid WAN communication, they send those messages near-instantaneously to a radio beacon that replays them exactly in a contamination-rich environment (e.g. an elderly care home in which an outbreak has been detected, or a hospital). Eventually, a person will be diagnosed positive in the contamination-rich environment, and subsequently trigger the "at risk" status of the targets.

Note the attack is passive in the target environment, and active only in the contamination environment.
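A minimal sketch of this relay pipeline, to make the timing constraint concrete. The latency and tolerance numbers are assumptions, and the HELLO payloads are placeholders; nothing here is taken from the ROBERT specification itself.

```python
import time

# Toy model of the relay described above: a passive sniffer at the
# target-rich site captures HELLO payloads and forwards them over a
# WAN link to an active beacon at the contamination-rich site, which
# re-broadcasts them immediately.  All numbers are assumptions.

WAN_LATENCY_S = 0.05          # assumed mobile-network transit time
FRESHNESS_TOLERANCE_S = 3.0   # assumed "few seconds" of replay tolerance

def relay(captured_hellos):
    for hello in captured_hellos:
        captured_at = time.monotonic()
        time.sleep(WAN_LATENCY_S)              # transit over the WAN
        delay = time.monotonic() - captured_at
        accepted = delay < FRESHNESS_TOLERANCE_S
        print(f"replayed {hello!r} after {delay * 1000:.0f} ms -> "
              f"{'accepted' if accepted else 'rejected'} by a freshness check")

relay([b"HELLO-ebid-1", b"HELLO-ebid-2"])
```

The point is only that a near-instant relay stays comfortably inside any freshness window measured in seconds.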

Traceability

How could we detect this attack, in the honest-but-curious model?

User fatigue

A few false alarms are to be expected in the normal course of events. For example, medical professionals are at a high contamination risk. But some may practice such good mask-wearing, hygiene & distancing that they would not pass on the illness despite being first a carrier and then diagnosed. Such a case would create a "normal" false positive.

But should false alarms repeat too often, this attack can create user fatigue and lower the motivation to get tested swiftly.

Moreover, the "at-risk" user is then excluded from emitting ESRs (see #16) until they get tested. Should they delay their testing, they might miss what would otherwise have been a second (and genuine this time) risk signal.

Contact with an infected patient, in an illness with a high attack rate, might happen frequently to everybody, whereas contact with a number n>1 of infected patients would be a more acute (and presumably less noisy) signal. Yet by design the current protocol does not deliver that signal.
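As an illustration of why that matters: a notification rule keyed on the number of distinct diagnosed sources would be much harder to trigger with a single replayed victim. This is purely hypothetical — ROBERT only returns an aggregate risk bit, so no per-source count is available to users today.

```python
from collections import Counter

# Hypothetical exposure records: (diagnosed_source_id, duration_minutes).
# ROBERT does not expose per-source information; this sketch only shows
# why "n > 1 distinct sources" is a less noisy signal than "any source".

def at_risk(exposures, min_sources=2, min_minutes=15):
    per_source = Counter()
    for source, minutes in exposures:
        per_source[source] += minutes
    strong = [s for s, m in per_source.items() if m >= min_minutes]
    return len(strong) >= min_sources

# A nuisance alarm built from one replayed patient fails the rule...
print(at_risk([("patient-A", 20), ("patient-A", 30)]))   # False
# ...while contact with several diagnosed patients passes it.
print(at_risk([("patient-A", 20), ("patient-B", 25)]))   # True
```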

In a decentralized protocol

In several decentralized protocols, matching happens on the device, so an alerted user can tell which disclosed identifier (and hence which time window) triggered their notification. The users alerted by the same signal therefore form a recognizable community.

Should most of this community not test positive after acting on the same signal, that context helps attribute the alarms to the low risk represented by that particular signal.

Should there be an unexpected pattern to these nuisance alarms (e.g. they all trace back to the same grocery store), communities more easily become aware of the issue, and can be leveraged to investigate the context of these bad signals.
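A sketch of that community-level cross-check, under the assumption that a decentralized client can tell which disclosed key matched and that users volunteer their reports somewhere. None of these structures exist in ROBERT, where matching is server-side; the field names are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical alert reports volunteered by users: which disclosed key
# triggered their alert, where they think the contact happened, and
# whether they later tested positive.

reports = [
    {"matched_key": "key-42", "place": "grocery store", "later_positive": False},
    {"matched_key": "key-42", "place": "grocery store", "later_positive": False},
    {"matched_key": "key-42", "place": "grocery store", "later_positive": False},
    {"matched_key": "key-7",  "place": "office",        "later_positive": True},
]

by_key = defaultdict(list)
for r in reports:
    by_key[r["matched_key"]].append(r)

for key, group in by_key.items():
    positives = sum(r["later_positive"] for r in group)
    places = Counter(r["place"] for r in group)
    if positives == 0 and len(group) >= 3:
        print(f"{key}: {len(group)} alerts, 0 follow-up positives, "
              f"all clustered at {places.most_common(1)[0][0]!r} -> suspicious signal")
```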

PRIVATICS-Inria commented 4 years ago

Hello @huitseeker. Your attack seems valid, but it requires some equipment. The anti-replay protection of ROBERT is not sufficient against this attack: it tolerates a few seconds of delay. That is indeed one of the protocol's limits.
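For reference, a simplified version of the kind of freshness check being referred to, assuming a HELLO carries a coarse emission time compared against reception time with a few seconds of tolerance. It is a sketch, not the actual ROBERT check, but it shows why a near-instant WAN relay passes it.

```python
TOLERANCE_S = 3  # assumed "few seconds" of accepted clock/transport skew

def hello_is_fresh(emitted_at: float, received_at: float,
                   tolerance_s: float = TOLERANCE_S) -> bool:
    """Simplified freshness check: reject HELLOs whose apparent age
    exceeds the tolerance.  A WAN relay adding ~50-100 ms of delay
    passes; only slow, store-and-forward replays are caught."""
    return abs(received_at - emitted_at) <= tolerance_s

print(hello_is_fresh(emitted_at=1000.00, received_at=1000.08))  # relayed live: True
print(hello_is_fresh(emitted_at=1000.00, received_at=1045.00))  # stored, replayed later: False
```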

Regarding DP3T with Google/Apple API support, I don't think any user is in a position to analyse their proximity history: it is kept secret within the OS; that is the principle, if I understood correctly.

huitseeker commented 4 years ago

Your attack seems valid, but it requires some equipment.

The point of collection ("target-rich") requires a long-range Bluetooth sniffer and a cheap processing unit (e.g. a Raspberry Pi) with WAN egress (e.g. a SIM card). Cursory consultation of various consumer retail websites hints at a budget under 100 EUR, less if bought in bulk.

The point of emission ("contamination-rich") requires about the same equipment, for a total budget under 200 EUR. As indicated above, the equipment is reusable if retrieved upon the first alert to avoid detection.

This attack (originally reported by @vaudenay) can trigger false alerts for a whole school, a factory floor, military barracks, or a government building, with no active radio emission in the target-rich environment.

Regarding DP3T with Google/Apple API support, I don't think any user is in a position to analyse their proximity history: it is kept secret within the OS; that is the principle, if I understood correctly.

If you assume this limitation holds, you also negate the possibility of the "nerd attack" described by @vaudenay in §5.2 of https://eprint.iacr.org/2020/399.pdf.

More to the point, another benefit of decentralized protocols is that decentralized disclosure limits those replay attacks.

Say there is one website per city for the disclosure of contaminated identifiers, and that the attacker executes this replay attack across city borders. For instance (since in the decentralized case the capture and replay locations are inverted), the attacker replays identifiers captured in a contamination-rich environment in Paris to their target in a target-rich environment in Lyon. The contamination would be disclosed on the Paris publication endpoint. Users residing in Lyon would either not know to consult this website, or, if they are among the majority of users who did not travel to Paris during the contamination period, immediately detect this risk alert as spurious.
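A sketch of that disclosure boundary, with hypothetical per-city endpoints and a client that only consults the regions its user actually visited; the key names and region structure are invented for illustration.

```python
# Hypothetical per-city disclosure lists and a client-side filter.
# A key captured in Paris and replayed in Lyon is published on the
# Paris list, which a Lyon-only user never consults (or immediately
# recognises as irrelevant to them).

DISCLOSURES = {
    "paris": {"key-captured-in-paris"},
    "lyon":  {"key-from-a-lyon-patient"},
}

def alerts_for(observed_keys: set, visited_regions: set) -> set:
    disclosed = set()
    for region in visited_regions:
        disclosed |= DISCLOSURES.get(region, set())
    return observed_keys & disclosed

# The Lyon target of the cross-city replay saw the Paris key over the air,
# but never visited Paris, so no alert is raised.
print(alerts_for({"key-captured-in-paris"}, visited_regions={"lyon"}))    # set()
# A genuine Lyon contact still triggers an alert.
print(alerts_for({"key-from-a-lyon-patient"}, visited_regions={"lyon"}))  # match
```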

There are naturally other trade-offs to decentralized disclosure (e.g. size of the disclosure pool and relative anonymity therein), but the principle applies to any disclosure boundary.

In the centralized model with federation (where ESRs are routed to foreign servers, rather than redirected), there is no geographical limit to this replay attack, so all contamination-rich sites have to be scanned for fraudulent emissions.