TCNCoalition / TCN

Specification and reference implementation of the TCN Protocol for decentralized, privacy-preserving contact tracing.
MIT License

Make a conceptual map of the Cambrian Explosion of contact tracing systems. #61

Open hdevalence opened 4 years ago

hdevalence commented 4 years ago

Now that there's a semi-stable 0.4 protocol for apps to use while they iterate, I think it's worth focusing on what a 0.5 protocol would look like (#56). A good first step would be to survey the Cambrian explosion of contact tracing protocols and map out the high-level building blocks of each proposal. Having mapped out all of the building blocks, we could try to understand each proposal as combining those building blocks in different ways, and try to synthesize a protocol that combines the best ideas from all of the existing protocols.

For instance, one basic building block could be broadcasting pseudorandom identifiers. Another could be the idea of deriving multiple pseudorandom identifiers from some secret, as is done in TCN 0.4 and the AG protocol. These use different mechanisms, but they have the same goal. Can we write this goal explicitly, and compare the mechanisms for each part? What are the benefits and costs? Another idea is the way that TCN has clients prove in zero knowledge that they generated the identifiers they report. Can we express this goal independently of the mechanism and describe its benefits to compare against its costs? Or, the latest DP3T proposal uses Bloom filters, and there are other ideas in the PACT-E and PACT-W proposals, etc.
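To make the "derive many identifiers from one secret" building block concrete, here is a minimal sketch of one possible mechanism, a plain hash ratchet. It is not the derivation specified by TCN 0.4 or by the AG protocol; the function name `derive_identifiers`, the labels `b"ratchet"` and `b"identifier"`, the use of SHA-256, and the 16-byte identifier length are all assumptions chosen for illustration.

```python
import hashlib
import os


def derive_identifiers(secret: bytes, count: int) -> list[bytes]:
    """Derive `count` 16-byte identifiers from one secret with a hash ratchet.

    Illustrative only: labels, hash function, and output length are assumptions.
    """
    identifiers = []
    state = secret
    for _ in range(count):
        # Advance the ratchet: each state depends only on the previous one.
        state = hashlib.sha256(b"ratchet" + state).digest()
        # Derive the broadcast identifier under a separate label, so that an
        # identifier is not itself the state that produced it.
        identifiers.append(hashlib.sha256(b"identifier" + state).digest()[:16])
    return identifiers


secret = os.urandom(32)
ids = derive_identifiers(secret, 4)
```

Whatever the mechanism, the goal this illustrates is the same: the broadcast values look unrelated to an observer, while anyone who later learns the secret can recompute all of them.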

After having mapped out these conceptual building blocks independently of the specific mechanisms, we can then express each proposal in terms of composition of a common set of blocks (though the mechanism for each of these blocks may be different for each proposal). And we can then try to create a hybrid of the proposals, first selecting one or more combinations of blocks, and then selecting the best mechanism for each one.

hdevalence commented 4 years ago

The first step to doing this would be to create a building_blocks.md with rough notes towards this goal. We can refine the notes later.

degregat commented 4 years ago

#70 has a rough first draft

Vanuan commented 4 years ago

I'd structure it this way:

But to effectively compare them, we need a common glossary. At least a correspondence table between different terminologies.

After that, there is additional differentiation:

degregat commented 4 years ago

@Vanuan good idea! I started a glossary in #74

Vanuan commented 4 years ago

I've created a similar issue in DP3T: https://github.com/DP-3T/documents/issues/227

dirkx commented 4 years ago

I am trying to compare the designs on where they are different (skipping over the similarities).

Some first thoughts here - before I summarize and add to that ticket #61

Stuffing/Smearing

Collusion prevention (Higher N / day cycle - e.g. to 60-15 mins range)

Additional (match/clinical) metadata (skipping over de-anonymisation risks)

Notes:

*: The complexity here, and what helps, is that the viability of the app assumes a relatively small number of infected 'hits' per 'mobile', because once infected a person is 'out of the pool that needs to be told they are infected for M days'. So it is fair to assume that only fractions of a percent of the population are told each day that they are infected.

**: Given *, it is fair, say for the metadata, to do a CF check (or a design 1 check) and, for the 1% of the phones it matches, fetch the 'extra info' file for that hit; or, to make cross-correlation harder, fetch all extra metadata that share the same first 8 bits of a hash of the match.

***: Depending on whether one is willing to reveal that N keys are all correlated to the same person (completely reveal, or reveal at hash level some 10+ bits of that).
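The "same first 8 bits of a hash of the match" idea in the notes above can be sketched as a bucketed fetch. This is only an illustration under assumptions: SHA-256 as the hash, an in-memory dict standing in for the server, and the names `bucket_of` and `fetch_bucket` are all hypothetical and do not come from any of the proposals.

```python
import hashlib


def bucket_of(match_id: bytes) -> int:
    """Return the first 8 bits of SHA-256(match_id) as a bucket number 0..255."""
    return hashlib.sha256(match_id).digest()[0]


def fetch_bucket(server_records: dict[bytes, str], match_id: bytes) -> dict[bytes, str]:
    """Simulate the server returning every record in the matching bucket.

    The server only learns the 8-bit bucket, not which record was wanted.
    """
    wanted = bucket_of(match_id)
    return {k: v for k, v in server_records.items() if bucket_of(k) == wanted}


# Hypothetical server-side store of extra metadata, keyed by match identifier.
records = {i.to_bytes(2, "big"): f"extra info {i}" for i in range(1000)}

match = (42).to_bytes(2, "big")
bucket = fetch_bucket(records, match)
extra_info = bucket[match]  # the client picks out its own record locally
```

With 8 bits there are 256 buckets, so the client downloads roughly 1/256 of the metadata instead of a single record; fewer bits hide the match better at the cost of more bandwidth.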

API - can we make an API (for app developers and mock) that lets people implement clients/mobile apps while not having to pick a protocol yet.

Issues 1) metadata, 2) geo-region extra data.
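One way the protocol-agnostic API idea could look is an abstract interface that apps code against, plus a trivial mock. Every name here (`ContactTracingProtocol`, `current_broadcast`, `record_observation`, `check_exposure`, `MockProtocol`) is hypothetical, and the two open issues listed above, metadata and geo-region extra data, are deliberately left out of this sketch.

```python
from abc import ABC, abstractmethod


class ContactTracingProtocol(ABC):
    """Hypothetical protocol-agnostic interface; all names are illustrative."""

    @abstractmethod
    def current_broadcast(self) -> bytes:
        """Return the identifier to broadcast right now."""

    @abstractmethod
    def record_observation(self, identifier: bytes) -> None:
        """Store an identifier heard from a nearby device."""

    @abstractmethod
    def check_exposure(self, published_report: bytes) -> bool:
        """Return True if a published report matches a stored observation."""


class MockProtocol(ContactTracingProtocol):
    """Trivial mock for app development: the report is the identifier itself."""

    def __init__(self, identifier: bytes) -> None:
        self._identifier = identifier
        self._seen: set[bytes] = set()

    def current_broadcast(self) -> bytes:
        return self._identifier

    def record_observation(self, identifier: bytes) -> None:
        self._seen.add(identifier)

    def check_exposure(self, published_report: bytes) -> bool:
        return published_report in self._seen


alice = MockProtocol(b"alice-id")
bob = MockProtocol(b"bob-id")
bob.record_observation(alice.current_broadcast())
```

An app written against the abstract class could later swap the mock for a real protocol implementation without changing its own logic.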

Vanuan commented 4 years ago

Collusion

or collision?

Vanuan commented 4 years ago

-- Seed is the 'proof'; after disclosure it therefore needs to be re-created.

Maybe clients should ignore any ephemeral ids encountered after a secret key was published? Since anybody can fake them after disclosure.

dirkx commented 4 years ago

Yes, very good point. That is not explicit in any of the papers. It should be, as it is easy to miss. Raised https://github.com/DP-3T/documents/issues/228.

dirkx commented 4 years ago

Yes, it works both ways: the client should recycle its seed, and clients should ignore those ids during the 14 days in which they may learn of that seed after the moment of infection.
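The receiving side of this rule can be sketched as a timestamp filter: a match only counts if the identifier was heard before the secret that generates it was published. The `Observation` type, the `trusted_matches` function, and the use of Unix-second timestamps are assumptions for illustration, not part of any of the proposals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    identifier: bytes
    seen_at: int  # Unix seconds when the identifier was heard


def trusted_matches(observations, reported_ids, published_at):
    """Keep only matches heard strictly before the secret was published.

    After publication anybody can regenerate the identifiers, so later
    sightings prove nothing and are dropped.
    """
    return [
        obs for obs in observations
        if obs.identifier in reported_ids and obs.seen_at < published_at
    ]


observations = [
    Observation(b"id-1", seen_at=1_000),  # genuine: heard before disclosure
    Observation(b"id-1", seen_at=5_000),  # possibly replayed after disclosure
    Observation(b"id-9", seen_at=1_500),  # not in the report at all
]
matches = trusted_matches(observations, {b"id-1", b"id-2"}, published_at=3_000)
```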

degregat commented 4 years ago

Additional discussion about a glossary in #74