Protecting Communication Metadata, Anonymous Communication Networks, etc

hiromipaw commented 4 years ago

In issue #39 my suggestion regarding mixnets or directly passing some of the connections through the Tor network was just a part of what I was trying to convey.

I think delaying the activation of the app, or making this process asynchronous for the backend, could also be a possible mitigation. What I would try to avoid is a third party learning about the user's health status and using that data.

For example it could be possible for the app to activate itself offline and start sending communication to the backend at a later point.

Furthermore, while the Tor network in its current form could not scale to serve all these users, it would still be possible to protect only the first messages the app sends rather than the full traffic.

Originally posted by @hiromipaw in https://github.com/DP-3T/documents/issues/39#issuecomment-615834400

kingflurkel commented 4 years ago

? https://swarm.ethereum.org/

lbarman commented 4 years ago

Ah. Well I answered you over there. Self-quote:

hi @hiromipaw; thanks for your message.

> What I would try to avoid is a third party learning about the user's health status and using that data.

This is crucial, we agree!

> For example it could be possible for the app to activate itself offline and start sending communication to the backend at a later point.

Yes, why not. The "problem" with specifying this is that this [activation] part will likely be country-specific (each having a specific way of contacting/authenticating their health authorities, country X wanting a QR-code while country Y wants a phone call with a doctor). Without concrete details, it's hard to tell whether Tor/Mixnets/delaying/padding/chaff traffic would help or not. Otherwise we could say "send exactly these messages of these sizes in that order" to avoid leaking information.

But we are aware of the problem; one option would be to propose a single recommended way to do this activation/upload.

edit: hence our general recommendation that non-infected users also regularly upload dummy data (with an invalid authorization from the medical authority)
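
To make the "send exactly these messages of these sizes" idea concrete, here is a minimal sketch of fixed-size padding. The 1024-byte size, the 4-byte length prefix, and the function names are assumptions for illustration only, not part of the DP-3T design. Padding like this only hides the payload size if the message is also encrypted in transit.

```python
import os
import struct

MESSAGE_SIZE = 1024  # hypothetical fixed size, in bytes, for every message
HEADER_SIZE = 4      # big-endian length prefix


def pad_message(payload: bytes, size: int = MESSAGE_SIZE) -> bytes:
    """Return a message of exactly `size` bytes: length prefix, payload, random filler."""
    if len(payload) > size - HEADER_SIZE:
        raise ValueError("payload too large for the fixed message size")
    header = struct.pack(">I", len(payload))
    filler = os.urandom(size - HEADER_SIZE - len(payload))
    return header + payload + filler


def unpad_message(message: bytes) -> bytes:
    """Recover the original payload from a padded message."""
    (length,) = struct.unpack(">I", message[:HEADER_SIZE])
    return message[HEADER_SIZE:HEADER_SIZE + length]
```

With this, an activation request and a dummy request have the same length on the wire, so an observer cannot tell them apart by size alone.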

Thanks! Happy to discuss more if you have other inputs

Since we've been discussing this across many issues, let's make this one the reference. Current suggested solution:

* All users periodically upload dummy data (with invalid authorization) to the server
* Server only updates the public data once a day (it batches the received information into 1 update per day)
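
The two points above could be sketched roughly as follows. This is only an illustration under stated assumptions: the HMAC-based authorization, the `AUTHORITY_KEY`, and the class and function names are hypothetical stand-ins, since the real authorization step is country-specific. Sorting the batch before publishing is also an assumption, added so that arrival order is not revealed.

```python
import hashlib
import hmac
import os
from collections import defaultdict
from datetime import date

AUTHORITY_KEY = b"demo-key"  # hypothetical secret held by the health authority


def authorize(report: bytes) -> bytes:
    """Hypothetical stand-in for the health authority's authorization."""
    return hmac.new(AUTHORITY_KEY, report, hashlib.sha256).digest()


def dummy_upload(size: int = 32) -> tuple:
    """Dummy data with a random (hence invalid) authorization."""
    return os.urandom(size), os.urandom(32)


class BatchingServer:
    def __init__(self) -> None:
        self._pending = defaultdict(list)  # day -> reports received that day

    def receive(self, day: date, report: bytes, authorization: bytes) -> None:
        # Every upload is handled the same way; dummies are silently dropped.
        if hmac.compare_digest(authorize(report), authorization):
            self._pending[day].append(report)

    def publish(self, day: date) -> list:
        """Release the whole day as one batch, sorted to hide arrival order."""
        return sorted(self._pending.pop(day, []))
```

A client without a diagnosis would call `dummy_upload()` on the same schedule that real uploads follow, and the server would call `publish` once per day.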

edit: formatting

edit2: as mentioned by @ryanbnl, uploading through a trusted party (e.g., via the hospital's trusted WiFi) also works

ryanbnl commented 4 years ago

Or have the patient's application authorise the application of the health care worker (HCW) to make the report. The HCW already knows that the patient is infected. So long as the only information shared is anonymous, the only thing you can tell is which HCW made the report. That narrows the set of people the report could belong to, sure, but you would get better results by taking pictures of people being tested and running them through a face recognition algorithm.

ryanbnl commented 4 years ago

So not specifically via the WiFi of a trusted party, but via the actual app of a trusted party. Over WiFi you still expose the MAC address and other fingerprint information shared by a device (depending on how exactly you're communicating).

FishmanL commented 4 years ago

Yeah, batching data by HCW solves these problems too

ryanbnl commented 4 years ago

The workflow here (in the Netherlands) seems to be that people are reported as infected by telephone. So exposing the IP address is unavoidable.

To put this into context - compared to the privacy issues in other projects, this is like having a snowflake on your windscreen whilst the others are covered in snow.

hiromipaw commented 4 years ago

One thing that could be considered is this: the trusted-party or proxy upload can be configured as a module that each implementation can tune. In this scenario, different defaults can be considered. For example:

* The trusted proxy default. For example, the health ministry of a country can use infrastructure that is already used for other requests. In Spain, for instance, you can book appointments online with the local health care center in your neighborhood. Booking an appointment with your health practitioner would then generate a request similar to an upload.
* Data are batched, or all clients make the same requests regardless. This is more a matter of network metadata and how clients interact with the backend.

Both defaults could be implemented, or each country/region implementation could choose only one of them.
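
As a rough sketch of what such a tunable module might look like: the `UploadConfig` fields, the function name, and both URLs below are hypothetical placeholders, not an existing DP-3T API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class UploadConfig:
    # Both defaults enabled; a country/region implementation may disable either.
    use_trusted_proxy: bool = True
    uniform_requests: bool = True
    proxy_url: str = "https://proxy.example/upload"      # hypothetical
    backend_url: str = "https://backend.example/upload"  # hypothetical


def plan_request(report: Optional[bytes],
                 config: UploadConfig) -> Optional[Tuple[str, bytes]]:
    """Return (destination, body) for this period, or None if nothing is sent."""
    if report is None:
        if not config.uniform_requests:
            return None  # only clients with a report contact the backend
        body = b""       # placeholder body: the client sends a request regardless
    else:
        body = report
    destination = config.proxy_url if config.use_trusted_proxy else config.backend_url
    return destination, body
```

A deployment that only wants the proxy default would construct `UploadConfig(uniform_requests=False)`; one that only wants uniform client behaviour would construct `UploadConfig(use_trusted_proxy=False)`.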

hitd010000 commented 4 years ago

* Server only updates the public data once a day (it batches the received information into 1 update per day)

Additional advantage: the pull requests (downloads of the daily update) could be made at night. In many cases, cellphones are then connected via WLAN on a flat rate, so there are no additional costs for users. Energy consumption matters less too, because the cellphone is often connected to a power supply at night.
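
A minimal sketch of scheduling the nightly pull, assuming a 01:00 to 05:00 window (the window and the function name are hypothetical). The random offset is an added assumption so that not all clients contact the server at the same instant.

```python
import random
from datetime import datetime, time, timedelta

NIGHT_START = time(1, 0)  # hypothetical start of the pull window
NIGHT_END = time(5, 0)    # hypothetical end of the pull window


def next_pull_time(now: datetime, rng: random.Random) -> datetime:
    """Pick a random instant inside the next night window that has not started yet."""
    window_start = datetime.combine(now.date(), NIGHT_START)
    window_length = datetime.combine(now.date(), NIGHT_END) - window_start
    if now >= window_start:
        # Today's window has already started (or passed); use tomorrow's.
        window_start += timedelta(days=1)
    offset = timedelta(seconds=rng.uniform(0, window_length.total_seconds()))
    return window_start + offset
```

For example, `next_pull_time(datetime.now(), random.Random())` returns a time between 01:00 and 05:00 on the coming night.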

marado commented 4 years ago

Shouldn't this issue be tagged as a "privacy risk"?

Protecting Communication Metadata, Anonymous Communication Networks, etc #193