Protect infected users against Bluetooth monitoring: ECDH key exchange

a8x9 commented 4 years ago

Protocol improvement proposal which protects infected users against an adversary recording Bluetooth beacons.

- Each user generates an EC key pair (secp128r1) at regular intervals
- Broadcasted Bluetooth iBeacon UUID contains the public key
  - x coordinate
  - discard the sign of y and only use half the points in order to fit in 128 bits
- When a user detects a proximity contact, they do an ECDH key exchange between their private key and the other user's beacon

User A
- private key: S_A
- public key (beacon): G * S_A

User B
- private key: S_B
- public key (beacon): G * S_B

When they come in proximity, each user does an ECDH key exchange and stores a hash of the resulting point on their phone:

User A: H(G S_B S_A) = V_AB
User B: H(G S_A S_B) = V_AB

They now have a shared value V_AB that they can reveal in case of infection. This value cannot be linked back to their broadcasted public key, so an adversary passively recording Bluetooth beacons cannot de-anonymize infected people.
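A minimal sketch of this exchange, with x-only beacons and a hashed shared point. It runs on a tiny textbook curve (y^2 = x^3 + 2x + 2 over F_17, generator (5, 1), order 19) rather than secp128r1, and it assumes SHA-256 for H. The curve parameters, the hash choice and all function names are illustrative assumptions, not part of the proposal.

```python
import hashlib
import secrets

# ASSUMPTION: toy curve for illustration only; the proposal uses secp128r1.
P, A, B = 17, 2, 2      # field prime and curve coefficients
G, N = (5, 1), 19       # generator and its (prime) order

def add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

def lift_x(x):
    """Recover a point from its x coordinate; either sign of y works."""
    rhs = (x ** 3 + A * x + B) % P
    y = next(y for y in range(P) if y * y % P == rhs)  # brute force, toy field only
    return (x, y)

def new_keypair():
    priv = secrets.randbelow(N - 1) + 1   # private key in 1..N-1
    return priv, mul(priv, G)[0]          # the beacon carries only x

def shared_value(priv, beacon):
    """H(x coordinate of priv * other point); SHA-256 is an assumed H."""
    x = mul(priv, lift_x(beacon))[0]
    return hashlib.sha256(x.to_bytes(16, "big")).hexdigest()

s_a, beacon_a = new_keypair()
s_b, beacon_b = new_keypair()
v_ab = shared_value(s_a, beacon_b)
assert v_ab == shared_value(s_b, beacon_a)   # both sides store the same V_AB
```

Dropping the sign of y is harmless here because negating a point negates the resulting shared point, which leaves its x coordinate unchanged.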

De-anonymization would require an active adversary constantly broadcasting their own key in order to create proximity contacts with nearby users. As a proximity contact is only registered after a certain exposure time, the attacker would need to constantly flood the target area with the same Bluetooth beacon which is likely to be detected.
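To make the exposure-time requirement concrete, here is a rough sketch of how a receiver might only register a contact once the same beacon has been seen for long enough. The 5-minute threshold, the class name and the method name are hypothetical; the proposal does not fix any of them.

```python
from dataclasses import dataclass, field

MIN_EXPOSURE_S = 5 * 60  # ASSUMPTION: hypothetical exposure threshold in seconds

@dataclass
class ContactTracker:
    """Tracks when each beacon was first seen during the current epoch."""
    first_seen: dict = field(default_factory=dict)

    def observe(self, beacon: bytes, timestamp: float) -> bool:
        """Record a sighting; return True once the exposure threshold is met.

        Simplification: gaps between sightings are ignored, only the time
        since the first sighting counts.
        """
        self.first_seen.setdefault(beacon, timestamp)
        return timestamp - self.first_seen[beacon] >= MIN_EXPOSURE_S

tracker = ContactTracker()
tracker.observe(b"beacon-x", 0)               # first sighting, not yet a contact
is_contact = tracker.observe(b"beacon-x", 300)  # same beacon five minutes later
```

Only once `observe` returns True would the device perform the ECDH step and store the resulting value, which is why an attacker has to keep broadcasting the same beacon for the whole exposure window.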

As the EC key only has ~63 bits of security, an adversary with significant computational power could also recover the private key. But this is not doable at a large scale, and randomly breaking a few beacons is unlikely to reveal a key used in a published infection.

Note: this does not require any active communication. Each user only broadcasts a single 128-bit UUID in any given time-frame, and listens for beacons broadcasted by other users.

Advantage compared to current protocol: users do not reveal their broadcasted beacons in case of infection.

Disadvantage compared to current protocol: significantly higher bandwidth requirements, similar to sharing all EphIDs. Cuckoo/bloom filters could be used to reduce data size but it will stay orders of magnitude higher than the current proposal.

alixMougenot commented 4 years ago

The PK is renewed upon communicating positivity (Page 9 as of today). Because the identity is renewed, only past contacts with the user can be identified. The user EphIDs are identifiable from the first day of excretion up to the day of declaration where the key is renewed.

a8x9 commented 4 years ago

I probably did not explain the goals and advantages of this proposal clearly enough.

The goals and advantages of this proposal are:

- Unlinkability between the information published in case of infection and the information broadcasted via Bluetooth. I.e., an adversary recording and storing all Bluetooth traffic cannot later determine who got infected and who did not.
- Only people broadcasting next to each other can later determine that they were in contact. E.g., suppose an adversary records your Bluetooth beacon and rebroadcasts it in a different place to a set of target users. If you later test positive and publish your list of shared ECDH values, these target users will not be wrongly alerted about having been in contact with you.
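The rebroadcast scenario can be sketched as follows. To keep the example short it uses classic Diffie-Hellman in a multiplicative group modulo a Mersenne prime as a stand-in for EC point multiplication, and SHA-256 as the hash. The group, the generator and all names are assumptions for illustration only.

```python
import hashlib
import secrets

# ASSUMPTION: toy stand-in group, not the curve from the proposal.
P = 2**127 - 1   # Mersenne prime
G = 3

def keypair():
    priv = secrets.randbelow(P - 2) + 1
    return priv, pow(G, priv, P)            # (private key, broadcast beacon)

def shared_value(priv, beacon):
    secret = pow(beacon, priv, P)
    return hashlib.sha256(secret.to_bytes(16, "big")).hexdigest()

alice_priv, alice_beacon = keypair()     # later tests positive
bob_priv, bob_beacon = keypair()         # really stood next to Alice
target_priv, target_beacon = keypair()   # only heard a replay of Alice's beacon

# Alice stores a value only for beacons she received herself, so the
# target's beacon never enters her list.
alice_published = {shared_value(alice_priv, bob_beacon)}

bob_local = {shared_value(bob_priv, alice_beacon)}
target_local = {shared_value(target_priv, alice_beacon)}  # from the replay

bob_alerted = bool(alice_published & bob_local)        # genuine contact matches
target_alerted = bool(alice_published & target_local)  # replayed contact does not
```

The target's stored value depends on the target's own private key, and Alice never computed the matching value because she never received the target's beacon.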

cascremers commented 4 years ago

Hi, now revisiting this question, this seems to be answered in the second paragraph of this FAQ entry.

a8x9 commented 4 years ago

I don't think P4 of the FAQ applies to this proposal.

This proposal does not require multiple packets (advertisement packet content stays the same during an epoch). And it does not require establishing connections (just using the advertisement packets).

So still N broadcasts and not N^2. And no problem with having to group together packets emitted by the same device, unlike the new secret sharing design, see #156.

galadran commented 4 years ago

Hi a8x9,

As you note and the FAQ answer mentions, using DH keys in this manner does ensure the attacker must actively transmit a Bluetooth message rather than passively listen.

However, this comes at a considerable cost in terms of the reliability of the underlying Bluetooth protocol (both devices must receive each other's broadcast, perform a DH operation and store the result in order for it to succeed) and a much greater cost in terms of network usage on the server (it increases by a factor K, where K is the number of users the average infected individual comes into contact with).

I feel any increase in security by forcing the attacker to be active is rather weak when compared against these trade-offs. Even a normal smartphone has a substantial range when using a high power setting (accessible without rooting) in our tests. Further, a malicious attacker broadcasting is indistinguishable from a normal user broadcasting, so the possibility of detection seems quite remote in practice.

a8x9 commented 4 years ago

Hi galadran, thanks a lot for your reply.

> However, this comes at a considerable cost in terms of the reliability of the underlying Bluetooth protocol (both devices must receive each others broadcast, ...

I guess it depends on your definition of "considerable cost". If you consider it unlikely that two devices spending several minutes 2 meters apart can see at least one broadcast from each other, then yes, this proposal won't work. But then you also have to assume that design 1 detects at most 50% of contacts, and design 2 probably doesn't detect any contact.

But I've done some tests, and even using ADVERTISE_TX_POWER_LOW on a phone wrapped in aluminum foil, in a flat with ~ 15 Bluetooth devices, I can still reliably receive the phone advertisements at ~ 4 meters using a Bluefruit LE sniffer (which has a worse antenna than any phone). So I highly doubt that my previous paragraph's assumption is realistic.

Design 2 from the whitepaper requires a higher Bluetooth signal reliability than this proposal.

> I feel any increase in security by forcing the attacker to be active is rather weak when compared against these trade-offs.

That's where we don't agree with each other. If we look at what kind of surveillance is already deployed, requiring an active vs a passive attack makes a huge difference.

Currently, Bluetooth sniffers are already deployed in shopping malls, train stations, and airports, among other places. It just so happens that all these places also heavily use cameras.

So here we are not speaking about a theoretical attacker possibly deanonymizing a few infected persons by broadcasting using a high power setting. We are speaking about infrastructure which already exists, is widely deployed, and can lead to infected people deanonymization on a massive scale.

So now, if we agree that the recording of Bluetooth traffic poses a significant threat to the anonymity of infected users, the only logical solution is to make the information published in case of infection unlinkable to the recorded traffic. An ECDH key exchange based protocol provides precisely this property.

galadran commented 4 years ago

Hi a8x9,

Can you clarify why Design 2 requires greater signal reliability? The Bluetooth layer is identical between Design 1 and Design 2. Or maybe you mean the Secret Sharing extension?

With regard to active and passive attacks. We have a different understanding of the deployed infrastructure and if I've misunderstood the situation, then it would be great to be corrected.

A lot of people have mentioned Bluetooth sniffers in public spaces with the apparent assumption that these devices are passively scooping up nearby transmissions. My understanding is that the opposite is true. They are typically broadcasting a signal which is picked up by the phone and recorded.

For examples, see the TON9108 and the i3. Consequently, given this infrastructure is already actively transmitting, we are already facing an active attacker.

Please correct me if I've misunderstood the situation and passive tracking solutions are actually more pervasive.

a8x9 commented 4 years ago

> Can you clarify why Design 2 requires greater signal reliability? The Bluetooth layer is identical between Design 1 and Design 2. Or maybe you mean the Secret Sharing extension?

Yes sorry for the confusion, I meant the Secret Sharing extension.

> With regard to active and passive attacks. We have a different understanding of the deployed infrastructure and if I've misunderstood the situation, then it would be great to be corrected.
>
> A lot of people have mentioned Bluetooth sniffers in public spaces with the apparent assumption that these devices are passively scooping up nearby transmissions. My understanding is that the opposite is true. They are typically broadcasting a signal which is picked up by the phone and recorded.
>
> For examples, see the TON9108 and the i3. Consequently, given this infrastructure is already actively transmitting, we are already facing an active attacker.

These devices broadcast a fixed beacon and assume you have an app installed listening for a predefined UUID. They do not listen for nearby devices, they do not perform any computation and they do not store anything. So I don't think they qualify as an active attacker.

What people are speaking about are these types of devices installed in stores, which have all-in-one cameras, BLE sniffing, WiFi sniffing, and yes, they can also act as a beacon. The bluloc system deployed in airports has also already been mentioned in #43.

Now, these two example devices can also act as a beacon, so they could potentially be modified to perform an ECDH key exchange and store the result. But that would result in a change of the information broadcasted by these devices, which can be detected. It would also presume that the manufacturer of these devices would reprogram them specifically to target this proposed protocol.

In a threat model, you need to take into account the probability of a specific attack, and the consequences of this attack being successful. I'll let you determine if it's more likely that data already being silently collected would be used without any risk of detection, or if an attacker building / modifying a device to specifically target this protocol, with the risk of being detected, is more likely. Regarding the consequences of a successful attack, the security of an ECDH based protocol is simply downgraded to the security of the currently proposed solutions.

galadran commented 4 years ago

Hi a8x9,

Thank you for pointing me to the specific model, what a nightmarish world we live in. I don't suppose you know if these devices are legal / deployed in the EU? It would be an interesting test of the GDPR.

If we are willing to assume the device is going to go to the effort of recording all BT transmissions + associated video footage in order to identify infected users, I think we can assume they will turn on their Bluetooth beacon functionality as well and that the change would require very little effort. They could even use a static public key on their devices and defer all computation to a backend somewhere, no software or hardware changes would be required.

I agree that a naive deployment would be detectable in theory to technically savvy people walking around the shops, but in practice this is unlikely to be the case. You could, for example, only transmit the DH beacon near the shop entrance where a staff member is likely to be stood. As the EphIDs are necessarily static for small periods of time, the passive beacons can then narrow the witnessed EphID down to a particular user as they move around in the store. Even if they were caught red-handed, they would immediately argue they only recorded this information for 'targeted cleaning' purposes or something similarly implausible but defensible.

tree-go commented 4 years ago

Agree with a8x9. By using ECDH, we at least have a chance to detect and filter known malicious devices/public keys. Infected users can review how many encounters (ECDH results) they are going to upload to the server, and have the option to filter some out or to say no. And every user has the right to check how many encounters they had in the restaurant, in the park, ....

Questions for a8x9:

- why only 128-ec not 256-ec? is it only to make public key in 1 adv packet or the security strength is good enough? and collision? (using 256bit is not a problem, but need 2 adv packets, right?)
- we can also have a base key/or daily key to generate all private keys, right? (for an infected user, if there is no privacy concern, only upload base key could still be an option.)

a8x9 commented 4 years ago

Thank you guys for your responses.

@galadran

> I don't suppose you know if these devices are legal / deployed in the EU? It would be an interesting test of the GDPR.

No sorry, I don't live in the EU, other people here probably know a lot more about what is currently being deployed.

@tree-go

> why only 128-ec not 256-ec? is it only to make public key in 1 adv packet or the security strength is good enough? and collision? (using 256bit is not a problem, but need 2 adv packets, right?)

Yes it was to make the public key fit into a single advertisement packet. The FAQ mentioned that reassembling different packets was creating too much complexity. But it seems the opinion of the DP^3T team on this subject evolved given the new Secret Sharing addition. So it might indeed be a good idea to consider increasing the key size to 256 bits and split it in multiple packets. With the currently available API on iOS, you probably need to split it in 3 packets instead of 2, see my response here for the explanation.

128-bit EC keys are "good enough" to resist brute force on a massive scale, but not good enough if you need to protect one specific key. Basically, a really well-funded adversary can probably break one key per day.

> we can also have a base key/or daily key to generate all private keys, right? (for an infected user, if there is no privacy concern, only upload base key could still be an option.)

That would really defeat the purpose of this scheme as you would be publishing the information necessary to reconstruct your broadcasted information.

But in case you have an alternative approach based on public key cryptography: yes you can derive all private keys from a seed. In fact, I think you can probably generate key pairs in such a way that you don't need to reveal your private keys to allow the reconstruction of your public keys.

Let's define:

G: curve generator
s: your initial private key (seed)
t: a large number by which you increment your private key on each epoch

Now on each epoch, you generate your key pairs the following way:

priv₀ = s
pub₀ = s * G
priv₁ = s + t
pub₁ = (s + t) * G = s * G + t * G
priv₂ = s + 2t
pub₂ = (s + 2t) * G = s * G + 2t * G
...

Then you publish your 1st public key (s * G) and the value t, which lets others reconstruct all your public keys without learning your private keys. I haven't fully thought this through, so it is very likely that there is a problem with this approach.
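A small sketch of this derivation, checking that the public keys rebuilt from s * G and t match the ones derived from the private keys. It uses a tiny textbook curve (y^2 = x^3 + 2x + 2 over F_17, generator (5, 1), order 19) instead of a real one; the curve, the values of s and t, and the function names are assumptions for illustration, and the sketch says nothing about whether the scheme is safe.

```python
# ASSUMPTION: toy curve for illustration only, not secp128r1.
P, A = 17, 2
G, N = (5, 1), 19

def add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result

def derive_private(s, t, i):
    """Private key for epoch i: s + i*t (mod group order)."""
    return (s + i * t) % N

def derive_public(pub0, t, i):
    """Public key for epoch i, rebuilt from the published pub0 and t only."""
    return add(pub0, mul(i * t % N, G))

s, t = 3, 5            # hypothetical seed and per-epoch increment
pub0 = mul(s, G)       # first public key, s * G

for i in range(4):
    assert derive_public(pub0, t, i) == mul(derive_private(s, t, i), G)
```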

timoll commented 4 years ago

This is an excellent proposal. An active attack is much riskier and higher cost than a passive attack.

It is nearly impossible to detect a passive attack, especially if it is deployed with an existing infrastructure.

An active attack has the risk of getting detected. I assume there will be legal consequences if that happens.

a8x9 commented 4 years ago

I think the threat exposed by @muehlhoff in #222 seriously amplifies the relevance of passive attackers that I tried to convey in this issue.

I'm very interested to know what the DP^3T team's position on this design is, given this new information.

a8x9 commented 4 years ago

After doing some calculations using the proposal made by @timoll in #218, the amount of transferred data can be significantly decreased with minimal leakage of the contact graph to the backend*.

This means that the only drawback this proposal had compared to design 2 is now solved.

* Edit: after helpful feedback from @timoll and @lbarman, the caveat I initially made about a malicious backend tracking users by IP, was added back to the analysis. Under these conditions, the backend can determine when IP₁ requesting data is likely to be a frequent contact of the infected user who uploaded via IP₂.

sslHello commented 4 years ago

Use Ephemeral Key Exchange with ECDHE instead of (static) ECDH

Please use open standard libraries to securely exchange data and to securely store data; see Awareness: OWASP Top10: A3:2017 Sensitive_Data_Exposure. Use safe curves with at least 250 bits: https://safecurves.cr.yp.to/, e.g. Curve25519/Ed448 (RFC7748) or Brainpool curves; see BSI TR-02102-1.

timoll commented 4 years ago

@sslHello The advantage of static ECDH is that you only need to broadcast a single key for multiple connections. This is a requirement for this system.

timoll commented 4 years ago

Is there a reason against only sharing 8 bytes to identify a contact?

64bit seems enough to avoid collisions.
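A back-of-the-envelope way to check this: if the truncated values behave like uniformly random bit strings, the expected number of accidental matches between the published set and one phone's local set is roughly their product divided by 2^bits. The set sizes below are made-up numbers, not estimates from this thread.

```python
def expected_false_matches(published: int, stored: int, bits: int) -> float:
    """Expected accidental matches between two sets of random `bits`-bit values."""
    return published * stored / 2**bits

# ASSUMPTION: hypothetical sizes, chosen only to show the order of magnitude.
PUBLISHED = 10**7   # values published by all infected users
STORED = 10**4      # values stored locally on one phone

for bits in (48, 64, 128):
    print(bits, expected_false_matches(PUBLISHED, STORED, bits))
```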

sslHello commented 4 years ago

Hi @timoll: Is only broadcasting planned, without any acknowledgement of anything? Is the calculated shared session key for the two app users the contact key to be stored locally?

128 bits could be so short that it becomes possible to even 'guess' the private key (with some contacts). For protocols like TLS, at least 250 bits is today's best practice. I am sorry, you need a real crypto guru who develops crypto algorithms to check this in more detail if your use case differs too much from general transport security.

Is there any DDOS protection to not overflow the local list with a big number of faked IDs? This to protect not getting false positive contacts later ...

timoll commented 4 years ago

The shared secret is stored locally and only shared should a user test positive. The keys are changed every 15 minutes and breaking every key of a user would give you the same information as the current design 2.

There has been some discussion of increasing the key size; the problem is that this would require multiple broadcasts, as a single broadcast seems to be limited to 128 bits.

leobago commented 4 years ago

I believe there is one more advantage to this approach: interoperability. Using ECDH key exchange could make the protocol interoperable with other centralized approaches, such as the recent one published by the PRIVATICS team from INRIA. If I am not mistaken, both protocols would exchange ECDH keys and upload the necessary ones when an infection has been confirmed. The only difference remaining is that one protocol (DESIRE) would do the risk assessment in the server while the other would do it in the app. This could be a great advantage for touristic regions.

lbarman commented 4 years ago

Update: the difficulties of DH key exchange are mentioned in our analysis of DESIRE, section "Radio Interface".

timoll commented 4 years ago

@lbarman Doesn't that only apply to 256-bit ECDH key exchange?

128bit isn't as secure but as @a8x9 pointed out, 128bit is still enough for this application and offers higher security than plain text (design 2)

sslHello commented 4 years ago

@lbarman with reference to your post about the analysis of DESIRE I've added #303 to find an existing standard procedure to generate the EphIDs.

peterboncz commented 4 years ago

Not sure if BLE purists would like it, but by removing the 4-byte UUID section from the GApple advertisement BLE packet (letting it consist only of Flags and ServiceData), one could upgrade to a 160-bit rolling identifier, yet stay inside a 31-byte packet. This would allow using secp160k1 instead of secp128r1. In my limited understanding, 64-bit encryption is brute-force attackable by a rich enemy in a day or so, but 80 bits should be quite hard (64K times harder?).
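The "64K times harder" figure follows from the usual rule of thumb that the best generic attacks on an n-bit curve cost on the order of 2^(n/2) operations. A quick check, where the function name is just for illustration:

```python
def security_bits(curve_bits: int) -> int:
    """Rule of thumb: generic attacks on an n-bit curve cost about 2**(n/2)."""
    return curve_bits // 2

bits_128 = security_bits(128)   # 64
bits_160 = security_bits(160)   # 80
work_ratio = 2 ** (bits_160 - bits_128)
print(bits_128, bits_160, work_ratio)
```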

peterboncz commented 4 years ago

In https://github.com/DP-3T/documents/issues/218 the issue of the increased download requirements is discussed. While some of these are problematic given a malicious backend, there are some pragmatic solutions that could help (e.g. region partitioning).

@a8x9 suggested in that issue to use cuckoo filters to represent the set of V_AB keys that are uploaded; suggesting also that this saves 3x download volume. An interesting idea, hence.

My question: isn't it the case that representing the keys in a cuckoo filter obfuscates their exact value, which makes brute force attacks even harder? Maybe upgrading to 160bit is then not even required anymore.
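For readers unfamiliar with cuckoo filters, here is a minimal sketch of one holding a set of uploaded values. It only stores a short fingerprint of each value, which is where the size saving comes from. The bucket count, the 2-byte fingerprint, SHA-256 as the hash and the sample values are all assumptions for illustration, not the parameters discussed in #218.

```python
import hashlib
import random

class CuckooFilter:
    """Minimal cuckoo filter storing short fingerprints of inserted items."""

    def __init__(self, num_buckets=1024, bucket_size=4, fp_bytes=2, max_kicks=500):
        # XOR-based alternate indexing needs a power-of-two bucket count.
        assert num_buckets & (num_buckets - 1) == 0
        self.buckets = [[] for _ in range(num_buckets)]
        self.mask = num_buckets - 1
        self.bucket_size = bucket_size
        self.fp_bytes = fp_bytes
        self.max_kicks = max_kicks

    def _fingerprint(self, item: bytes) -> bytes:
        return hashlib.sha256(b"fp" + item).digest()[:self.fp_bytes]

    def _index(self, data: bytes) -> int:
        digest = hashlib.sha256(b"ix" + data).digest()
        return int.from_bytes(digest[:8], "big") & self.mask

    def _alt_index(self, index: int, fp: bytes) -> int:
        # Applying this twice returns the original index.
        return index ^ self._index(fp)

    def insert(self, item: bytes) -> bool:
        fp = self._fingerprint(item)
        i1 = self._index(item)
        i2 = self._alt_index(i1, fp)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = random.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = random.randrange(self.bucket_size)
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # filter considered full; the last evicted fingerprint is lost

    def __contains__(self, item: bytes) -> bool:
        fp = self._fingerprint(item)
        i1 = self._index(item)
        return fp in self.buckets[i1] or fp in self.buckets[self._alt_index(i1, fp)]

# Hypothetical 16-byte values standing in for the uploaded V_AB set.
values = [hashlib.sha256(i.to_bytes(4, "big")).digest()[:16] for i in range(1000)]
cf = CuckooFilter()
inserted = [cf.insert(v) for v in values]
found = all(v in cf for v in values)
```

A lookup can return a false positive when an unrelated value happens to share a fingerprint and bucket, so the fingerprint length trades download size against false alerts.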

timoll commented 4 years ago

@peterboncz

It is not necessary to share the full 128 bits to identify a contact. 48 bits are enough to make a collision very unlikely. Together with cuckoo filters, this would give a reduction of a factor of 5.

I also find that an average of 10 five minute contacts per 15 minutes is rather pessimistic. For most of the day, this will be 0 - 1. I wouldn't be surprised if the average detected contacts turn out to be 1-4, even with 100% adoption.

There will also be a relationship between the number of daily new cases and the number of contacts. The more cases there are, the stronger social distancing measures will be, reducing the number of contacts.
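To see how much these assumptions matter, here is a rough estimate of the raw upload size per infected user (before any filter encoding). The 15-minute epochs and the contact counts come from this thread; the 14-day reporting window and the function name are assumptions made for the example.

```python
EPOCHS_PER_DAY = 24 * 60 // 15   # keys change every 15 minutes -> 96 epochs

def upload_bytes(days: int, contacts_per_epoch: int, bytes_per_value: int) -> int:
    """Raw size of the shared values one infected user would upload."""
    return days * EPOCHS_PER_DAY * contacts_per_epoch * bytes_per_value

DAYS = 14  # ASSUMPTION: hypothetical reporting window

pessimistic = upload_bytes(DAYS, contacts_per_epoch=10, bytes_per_value=16)  # full 128-bit values
optimistic = upload_bytes(DAYS, contacts_per_epoch=2, bytes_per_value=6)     # 48-bit values
print(pessimistic, optimistic)   # 215040 16128
```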

marcodermatt commented 4 years ago

Hi everybody,

> I don't suppose you know if these devices are legal / deployed in the EU? It would be an interesting test of the GDPR.

I don't know about the legality of it, but I found this device: https://www.xovis.com/en/xovis-insights/detail/pc2r/

These are deployed in multiple stores of Alnatura, a subsidiary of Migros, as well as Coop, two major retailers in Switzerland. These types of devices are being deployed specifically during Covid to count the number of people in the stores. I will try to get in touch with them and see what version exactly they use. Maybe this is useful information.

DP-3T / documents

Protect infected users against Bluetooth monitoring: ECDH key exchange #66