Why are EPIDs only generated by server?

ArchanaaSK commented 4 years ago

What is the advantage of generating all the IDs and storing them at the server? What if these IDs were generated by the App and communicated to the server when needed?

For example, the app and server agree on a server-generated ID_A during the initialization/registration phase. The server then holds a list of unique IDs, one per registered app. The app generates EPIDs at the required interval (as mentioned in the specification, but generated by the app). The proximity discovery phase remains the same, with the app broadcasting app-generated EPIDs. In infected user discovery, the infected user's app sends its LocalProximityList to the server, which stores it in a list of exposed EPIDs. In the exposed status request phase, the server collects the user's generated EPIDs, matches them against its list of exposed EPIDs, and calculates the risk score. This is just a strawman protocol in which generation is done by the app instead of the server. Potentially, there will be a great many registered users per server. This kind of EPID generation may reduce the computation and storage load on the server, while the computation load on the app increases only minimally (generating EPIDs). The storage size on the app side remains the same.
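To make the strawman concrete, here is a minimal Python sketch of the flow described above. Everything in it is hypothetical: the class and method names, the 8-byte random EPIDs, and the use of a simple match count in place of a real risk score are assumptions for illustration, not part of the ROBERT specification.

```python
import secrets

EPID_BYTES = 8  # assumed identifier size, for illustration only


class App:
    """Hypothetical client that generates its own EPIDs."""

    def __init__(self, id_a):
        self.id_a = id_a                # server-issued ID_A from registration
        self.own_epids = []             # EPIDs this app has broadcast
        self.local_proximity_list = []  # EPIDs heard from nearby apps

    def new_epoch(self):
        # Generate a fresh random EPID for the next epoch and remember it.
        epid = secrets.token_bytes(EPID_BYTES)
        self.own_epids.append(epid)
        return epid

    def observe(self, epid):
        self.local_proximity_list.append(epid)


class Server:
    """Hypothetical server that only stores IDs and exposed EPIDs."""

    def __init__(self):
        self.registered = set()
        self.exposed = set()

    def register(self):
        id_a = secrets.token_hex(5)
        self.registered.add(id_a)
        return id_a

    def report_infection(self, local_proximity_list):
        # The infected user's app uploads the EPIDs it has heard.
        self.exposed.update(local_proximity_list)

    def exposure_count(self, id_a, own_epids):
        # Stand-in for the risk score: how many of the user's EPIDs were exposed.
        if id_a not in self.registered:
            raise KeyError("unknown ID_A")
        return sum(1 for epid in own_epids if epid in self.exposed)


server = Server()
alice, bob, carol = (App(server.register()) for _ in range(3))

# Proximity discovery: Alice and Bob hear each other; Carol meets nobody.
bob.observe(alice.new_epoch())
alice.observe(bob.new_epoch())
carol.new_epoch()

# Bob is diagnosed and uploads his LocalProximityList.
server.report_infection(bob.local_proximity_list)

alice_score = server.exposure_count(alice.id_a, alice.own_epids)
carol_score = server.exposure_count(carol.id_a, carol.own_epids)
```

Note that in this sketch the server has to trust whatever EPID list a client submits in the status request, and the client reveals all of its own EPIDs to the server when it asks.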

Victor-D commented 4 years ago

Same question. It's a technical issue (overload) but also a privacy issue (see also https://github.com/ROBERT-proximity-tracing/documents/issues/2).

ramsestom commented 4 years ago

You need the secret KS key to generate an EBID from a user ID, so it can't be done on the app. In any case, in a centralized scheme like the ROBERT protocol, the server needs to be able to link every EBID to the matching user ID in order to compute that user's risk status and alert them correctly. So even if the EBIDs were generated on the app, they would have to be sent back to and stored on the server (or the server would have to be able to regenerate them on its own).
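A rough Python sketch of why a server-side key removes the need for a lookup table: with a keyed, invertible 64-bit mapping, the server can derive an EBID from (ID, epoch) and map any EBID back to its ID. The construction below is only a stand-in, not the cipher ROBERT specifies: a toy 4-round Feistel network with an HMAC-SHA256 round function. The 24-bit epoch / 40-bit ID layout and the names `derive_ebid` and `resolve_ebid` are assumptions for illustration.

```python
import hashlib
import hmac

ROUNDS = 4
MASK32 = 0xFFFFFFFF


def _round(key, rnd, half):
    # Round function: first 32 bits of HMAC-SHA256(key, round || half).
    msg = bytes([rnd]) + half.to_bytes(4, "big")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def _feistel(key, block, round_order):
    left, right = block >> 32, block & MASK32
    for rnd in round_order:
        left, right = right, left ^ _round(key, rnd, right)
    # Final swap, so running the rounds in reverse order inverts the mapping.
    return (right << 32) | left


def derive_ebid(ks, id_a, epoch):
    """Server side: map (ID, epoch) to an 8-byte EBID under key KS."""
    if not (0 <= id_a < 1 << 40 and 0 <= epoch < 1 << 24):
        raise ValueError("id_a must fit in 40 bits and epoch in 24 bits")
    block = (epoch << 40) | id_a
    return _feistel(ks, block, range(ROUNDS)).to_bytes(8, "big")


def resolve_ebid(ks, ebid):
    """Server side: recover (ID, epoch) from an EBID without any stored table."""
    block = _feistel(ks, int.from_bytes(ebid, "big"), reversed(range(ROUNDS)))
    return block & ((1 << 40) - 1), block >> 40


ks = b"demo-only server key"  # placeholder value, not a real key
ebid = derive_ebid(ks, 0x1234567890, 42)
recovered = resolve_ebid(ks, ebid)
```

Only a holder of KS can run either direction, which is why this derivation cannot move to the app without also moving the key.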

ArchanaaSK commented 4 years ago

Okay. But restricting EPID generation to the server raises problems such as issue #27.

beng-git commented 4 years ago

I think this derives from the "centralized" approach / trust assumptions: if the EBIDs were generated on the app, then KS would be stored on the app and EBIDs would regularly be sent to the server, which would store them all. The benefit of doing this is that if the server is compromised, it is not possible to derive future EBIDs (this is relevant for a covert adversary, who would otherwise be able to impersonate anyone).

However, storing KS on the app means that the app/phone must also be trusted in order to secure KS (in this case there would be one key per phone, I suppose).

More generally, this is the trade-off between models like DP3T and ROBERT: on the one hand, you trust the central authority, which is probably more secure than an individual's phone, but if that trust is misplaced, you lose everything. In DP3T, on the other hand, an attack would (arguably) compromise only a subset of individuals, but attacks are easier to carry out.

ramsestom commented 4 years ago

It is not only a problem of trust; it is also a problem of server capacity. If you let users generate their own EBIDs, the server would have to store them all (it could no longer derive them from the user ID). That means it would quickly need large storage (and memory, for the index) to hold the EBID-to-ID table (see https://github.com/ROBERT-proximity-tracing/documents/issues/8#issuecomment-616128430).
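For a sense of how such a table scales, here is a back-of-the-envelope Python helper. All parameter values in the usage line are assumptions chosen for illustration (15-minute epochs, 14 days of retention, 8-byte EBIDs, 5-byte IDs). It does not reproduce the calculation in issue #8 and ignores index and database overhead.

```python
def ebid_table_bytes(users, epochs_per_day, retention_days,
                     ebid_bytes=8, id_bytes=5):
    """Raw size of an EBID-to-ID table: one row per user per epoch."""
    rows = users * epochs_per_day * retention_days
    return rows * (ebid_bytes + id_bytes)


# Assumed inputs: 60 million users, 96 epochs per day, 14 days kept.
raw_size = ebid_table_bytes(60_000_000, 96, 14)
print(f"{raw_size / 1e12:.2f} TB before indexes and storage overhead")
```

The size grows linearly in each of the three inputs, so halving the epoch length or doubling the retention window doubles the table.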

beng-git commented 4 years ago

The calculation you made in #8 leads to a few hundred (?) TB for a realistic case: a country like France with tens of millions of app users. I would argue that this is not a limit for a centralized server (or cluster/cloud). Indexing might be more of an issue, but I cannot gauge the actual expected query load, so it is difficult to answer on that point.

ramsestom commented 4 years ago

No, for a country like France, assuming 60 million users, the table would "only" be 2.5 TB. I also said in my comment that this option should be considered, but an author of the protocol answered that it was rejected because of the size of the table it would require. So I think the question is already settled on their side.

ROBERT-proximity-tracing / documents

Why are EPIDs only generated by server? #19