DP-3T / reference_implementation

Apache License 2.0
127 stars 44 forks source link

check_infected also considers SK keys that were never used by the infected user #12

Open LCBH opened 4 years ago

LCBH commented 4 years ago

Because of https://github.com/DP-3T/reference_implementation/blob/3295ed7ddcf6c792d3e21cda19522a1302d8b015/LowCostDP3T.py#L243 https://github.com/DP-3T/reference_implementation/blob/3295ed7ddcf6c792d3e21cda19522a1302d8b015/LowCostDP3T.py#L245 check_infected will compute all the SKs between the first infectious period (argument date) and the current time. To avoid trivial false positives triggered by a malicious agent, one should limit the computation to the days between the start of the infectious period and the day before SK_t has been published. See https://github.com/DP-3T/documents/blob/master/DP3T%20White%20Paper.pdf page 11.

LCBH commented 4 years ago

#7937bc9 exemplifies the problem. Alice was considered at risk before this PR.

rkunnema commented 4 years ago

Hi! Alternatively, the backend witholds SK_t until the next day. This way, contacts on the day the infection was diagnosed can be alerted.

LCBH commented 4 years ago

I don't think this would totally solve the problem. Consider for example the following scenario:

The PR #13 should fix this.

rkunnema commented 4 years ago

SK{n+1} would be out of range, though (more than 14 days). For smaller intervals, this could be enforced by basically the change in PR #13 (if I read it correctly, the diff is super long) or a more hacky solution (Phone takes current day - 1 when it receives SK{n-14}). This is in addition to witholding the key and securing the channel from HA to Backend.

I think capturing contacts on day n may be worth it -- when someone goes to the doctor, they may have symptoms that facilitate spreading the disease, like coughing.

LCBH commented 4 years ago

The current implementation does not check that S_{k+1} is out of range. (The specification does, but it also uses the publication date as done in #13.

Yes the PR also contains some sanitization, which makes it hard to read as a whole. The commit #548e8ab actually fixes the issue and the diff is much smaller.

The specification/design 1 does not capture contacts on day n to avoid false positives as anyone would then be able to compute valid, infected EPhIDs. I guess, one would have to rotate the key more often to have it all, or to not randomize the list of EphIDs and include the epoch of the SK publication (but this also come with a slight privacy loss).

rkunnema commented 4 years ago

Ah, now I see. (For the record: there is no use in securing the channel HA -> Backend as the Backend broadcasts the <sk2,t2> tuple anyway)