DP-3T / documents

Decentralized Privacy-Preserving Proximity Tracing -- Documents

Epidemiological "days" and international travel, and daylight saving time. #182

Open cluck opened 4 years ago

cluck commented 4 years ago

Epidemiologists seem to reason on a "daily" timescale, effectively meaning sequences of 24h timeslots. On the other hand, the current design requires that proximity records are recorded in daily batches, without specifying what happens when the app crosses timezones or when daylight saving time takes effect.

This leads to a discrepancy when a local healthcare agent (or the app) "counts backwards" to determine which past "daily batch" to publish (in its entirety).

The design probably needs to be changed to batch by day and timezone (including daylight saving time zones), instead of just by day.
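To make the discrepancy concrete, here is a minimal sketch showing that a local calendar day is not always 24 hours long. The offsets are assumptions chosen for illustration (Central European time around the spring 2020 daylight saving switch), and `local_day_length` is a hypothetical helper, not part of the DP-3T design.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets before and after the spring 2020 DST switch.
CET = timezone(timedelta(hours=1))
CEST = timezone(timedelta(hours=2))


def local_day_length(start_of_day, start_of_next_day):
    """Real time elapsed between two timezone-aware local midnights."""
    return start_of_next_day - start_of_day


# Local midnight on 29 March is still CET; the next local midnight is CEST.
dst_day = local_day_length(
    datetime(2020, 3, 29, 0, 0, tzinfo=CET),
    datetime(2020, 3, 30, 0, 0, tzinfo=CEST),
)
print(dst_day)  # 23:00:00
```

Counting back "n days" as n * 24h from a local midnight therefore lands an hour off the local day boundary once such a switch lies inside the window.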

lbarman commented 4 years ago

Hi @cluck, thanks for the input. To be fair, current discussions are not yet about a global deployment of one solution; more like one deployment at the scale of a country (where we wrongly and implicitly assumed one timezone). Thanks for the suggestion, we will keep track of this.

kugelfish42 commented 4 years ago

The protocol does not depend on civil time; it only requires that all participants use the same convention, which is otherwise arbitrary. That convention can just as well be UTC, and it should be stated explicitly. Even in the local timezone, now() - n * 24h does not align to full days. Timezones are a hot political mess and make accurate time computations unnecessarily complex.
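A minimal sketch of such a shared convention, assuming Unix timestamps in seconds. The function names and the example instant are illustrative assumptions and do not come from the DP-3T specification.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def utc_day_number(unix_ts: int) -> int:
    """Whole UTC days elapsed since 1970-01-01T00:00:00Z."""
    return unix_ts // SECONDS_PER_DAY


def seconds_into_utc_day(unix_ts: int) -> int:
    """Offset of the instant from the start of its UTC day."""
    return unix_ts % SECONDS_PER_DAY


now = 1_586_944_800  # 2020-04-15T10:00:00Z, an arbitrary example instant
cutoff = now - 14 * SECONDS_PER_DAY

print(utc_day_number(now) - utc_day_number(cutoff))  # 14
print(seconds_into_utc_day(cutoff))  # 36000, i.e. 10:00, not a day boundary
```

The day numbers differ by exactly 14, but the cutoff instant itself falls in the middle of a bucket, which is the alignment point made above.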

cluck commented 4 years ago

Oh, I don't agree. I think it actually depends a lot on civil time. The association of contact traces to "daily bins" is made earlier than it would be in any non-privacy-preserving design, and all other time information is lost at that point. This precludes computing later which point in time and space relates to the local notion of "day" that the risk model will have.

Note that the data is stored "today", for interpretation by a risk model that will only be determined two weeks from "today". Such a risk model may rely on "24h cycles", or on "days".

noci2012 commented 4 years ago

If a mapping to civil time needs to be done, use UTC as the source of clock times. ANY presentation of the UTC time will need to be translated to a local time anyway. (Most Unix systems already use UTC as system time; only the external presentation is done using timezones.) Days are globally 24 hours, so N days is N*24h back in time from a current (UTC-based) timestamp. Then you can calculate back to the "presentation time" to be shown.

Using time this way, there is no way time can be misinterpreted. If a safety factor is needed, N could be made into N+1 (one more day back in time).

In 99% of cases this point is moot in the current time of social distancing and staying at home.
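The N*24h computation described in this comment could look roughly like the sketch below. `window_start`, the 14-day window and the +02:00 display offset are assumptions made for illustration only.

```python
from datetime import datetime, timedelta, timezone


def window_start(now_utc, n_days, safety_days=0):
    """Go back (n_days + safety_days) * 24h from a UTC instant."""
    return now_utc - timedelta(days=n_days + safety_days)


now = datetime(2020, 4, 15, 10, 0, tzinfo=timezone.utc)
start = window_start(now, 14, safety_days=1)  # N + 1 as a safety factor

# Conversion to a local "presentation time" happens only for display.
display_tz = timezone(timedelta(hours=2))  # assumed display offset
print(start.isoformat())                         # 2020-03-31T10:00:00+00:00
print(start.astimezone(display_tz).isoformat())  # 2020-03-31T12:00:00+02:00
```

All arithmetic stays in UTC; the local offset only affects how the result is shown to the user.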

henriterhofte commented 4 years ago

I was wondering: what would go wrong if we just used UTC dates globally, counting how many entire days have elapsed in UTC time since the UTC start date of the system? I looked at all occurrences of 'day', 'date', 'time' in the specs and saw only one case where the day comes up in interaction with the user: limiting the number of EphIDs to be uploaded to the server, based on an estimate of the contagious window, which in turn is based on the (calendar) day the user indicated as the onset of symptoms. Since both the user's memory of the onset of symptoms and the epidemiologists' estimate of the duration of the contagious window before the onset of symptoms are imprecise, it might be better to ignore the difference between the real date and the UTC date and send one extra day.
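As a rough sketch of this proposal: the start date, the length of the contagious window and all names below are hypothetical placeholders, not values from the DP-3T documents.

```python
from datetime import date, timedelta

SYSTEM_START = date(2020, 4, 1)  # hypothetical UTC start date of the system


def utc_day_index(utc_date: date) -> int:
    """Entire UTC days elapsed since the system start date."""
    return (utc_date - SYSTEM_START).days


def first_day_to_upload(onset: date, contagious_days_before_onset: int,
                        extra_days: int = 1) -> int:
    """Index of the earliest daily batch to upload.

    The user-reported onset date is treated as if it were a UTC date, and
    extra_days absorbs the resulting error of at most one day.
    """
    first = onset - timedelta(days=contagious_days_before_onset + extra_days)
    return max(utc_day_index(first), 0)  # never go before the system start


print(first_day_to_upload(date(2020, 4, 20), contagious_days_before_onset=3))  # 15
```

With an onset on 20 April and an assumed three-day window, the upload starts at 16 April (index 15) instead of 17 April.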

dl5rcw commented 4 years ago

@henriterhofte when it comes to scale, simplification matters in terms of the energy consumed by calculations / number crunching. UTC makes things easy, and the extra day simplifies things compared to datetime calculations. The downside might be: if a user is informed because of a contact 15 rather than 14 days ago, that needs to be reflected in the probability of infection that might have occurred.

It depends a lot on how you calculate individual likelihood in the end. But I like how you think about simplifying the algorithm because of energy consumption and operational cost at scale.

cluck commented 4 years ago

Some of you have suggested UTC. But I still fail to see how this helps, because of the size of the buckets. The contacts are to be assigned to daily buckets the moment they occur, and any other time information is dropped for privacy reasons. With an implicit standard timezone like UTC, anyone living in any other timezone will see their buckets switched in the middle of their day. Publishing their "yesterday" would require them to publish 2 buckets, either "UTC yesterday" and "UTC today" or "UTC two-days-ago" and "UTC yesterday". For a safe bet in the face of international travel, even all three(!) of the buckets would need to be considered. This leads to a lot of false positives, and defeats the purpose of the app altogether.
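The overlap described here can be checked with a short sketch. It assumes a fixed UTC offset for the whole local day (no daylight saving switch inside it), and `utc_buckets_for_local_day` is a hypothetical helper.

```python
from datetime import date, datetime, timedelta, timezone


def utc_buckets_for_local_day(local_date: date, utc_offset_hours: float):
    """UTC dates whose daily buckets overlap one local calendar day."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    start = datetime(local_date.year, local_date.month, local_date.day, tzinfo=tz)
    end = start + timedelta(days=1) - timedelta(microseconds=1)
    first = start.astimezone(timezone.utc).date()
    last = end.astimezone(timezone.utc).date()
    return [first + timedelta(days=i) for i in range((last - first).days + 1)]


day = date(2020, 4, 15)
print(utc_buckets_for_local_day(day, 0))   # one bucket: 2020-04-15
print(utc_buckets_for_local_day(day, 2))   # two buckets: 2020-04-14 and 2020-04-15
print(utc_buckets_for_local_day(day, -8))  # two buckets: 2020-04-15 and 2020-04-16
```

Any non-zero offset makes a single local day straddle two UTC buckets, so publishing one local day means publishing both.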

Technically, this is a sampling-rate/aliasing problem, introduced through bucketing, and the engineer's solution would be to increase the granularity: e.g. use hourly buckets instead of daily buckets, or just stop bucketing the contacts at all, keeping exact timestamps.

But, for privacy reasons, unnecessarily precise data should be avoided. That's why these buckets have been introduced in the first place. As far as I see it, we can easily get away with maintaining a daily bucket for every timezone encountered.
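One way to read "a daily bucket for every timezone encountered" is to key each bucket by the local date together with the UTC offset that was in force when the contact was observed. The sketch below assumes exactly that; the key layout, `record_contact` and the placeholder EphID strings are illustrative assumptions.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def bucket_key(observed_at_utc: datetime, utc_offset_hours: float):
    """(local calendar date, UTC offset) for a contact seen at a UTC instant."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return (observed_at_utc.astimezone(tz).date(), utc_offset_hours)


buckets = defaultdict(list)


def record_contact(ephid: str, observed_at_utc: datetime, utc_offset_hours: float):
    # Only the bucket key is kept; the exact timestamp is dropped.
    buckets[bucket_key(observed_at_utc, utc_offset_hours)].append(ephid)


seen = datetime(2020, 4, 14, 23, 30, tzinfo=timezone.utc)
record_contact("ephid-a", seen, 2)  # lands in local day 2020-04-15 at UTC+2
record_contact("ephid-b", seen, 0)  # lands in local day 2020-04-14 at UTC+0
```

Publishing a local "yesterday" then means selecting the buckets whose date component matches, whatever offsets were encountered that day.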

Note that we are talking about an app which leads to a heavyweight interference with human rights. Its only right to exist is to track down the one-day window of symptom-less infectiousness. Simple approximations in time quickly become false results, giving way to false accusations, arbitrary decisions, personal and economical losses, constriction, etc. ... i.e. political abominations.

Vanuan commented 4 years ago

I think part of the problem here is that the reporting mobile device and the server need to be aligned on the exact moment of time when it is safe to publish the daily key (after isolation):

Another part of the problem is that the receiving mobile device needs to know the exact moment of time when the key was made public, so that it can ignore handshakes made after that moment.

Vanuan commented 4 years ago

Let's say we choose 00:00:00 UTC as a moment of key publication. Clearly, this should be configurable (probably through the discovery service or the backend API itself).

Here's what should happen:

  1. The backend should ensure that only keys reported before the 00:00:00 UTC moment are published in the daily bucket of the next day.
  2. The reporting mobile device should report daily keys from the contagion onset to the isolation date, but ensure that the isolation date is before the moment of publication (00:00:00 UTC).
  3. The receiving mobile device should ignore handshakes for any derived key from a published secret key bucket if they were made after the moment that bucket was published.
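The steps above can be sketched as follows. This assumes the publication moment is 00:00:00 UTC and that the receiving device keeps at least a coarse time for each observed handshake; both function names are hypothetical.

```python
from datetime import datetime, time, timedelta, timezone

PUBLICATION_TIME_UTC = time(0, 0)  # assumed default; meant to be configurable


def publication_moment(reported_at: datetime) -> datetime:
    """First publication moment strictly after the report (step 1)."""
    reported_utc = reported_at.astimezone(timezone.utc)
    next_day = reported_utc.date() + timedelta(days=1)
    return datetime.combine(next_day, PUBLICATION_TIME_UTC, tzinfo=timezone.utc)


def counts_as_exposure(handshake_at: datetime, published_at: datetime) -> bool:
    """Ignore handshakes made once the key is public (step 3)."""
    return handshake_at < published_at


reported = datetime(2020, 4, 15, 18, 30, tzinfo=timezone.utc)
published = publication_moment(reported)
print(published.isoformat())  # 2020-04-16T00:00:00+00:00
```

A handshake observed at 23:00 UTC on 15 April would still count, while one observed at 01:00 UTC on 16 April would be ignored, because anyone could have replayed the published key by then.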

Now, the international aspect. As far as I understood, the assumption is that each backend has a fixed timezone. That is, there are no "international" backends. It means that each backend has a fixed publication moment of each bucket. So the problem only happens when the app uses multiple backends. Or when one backend distributes secret keys to another backend (in another timezone).

Interoperability between multiple backends, or between one app and multiple backends, is not determined yet, as there's no regulation of what should happen when an app from one country communicates with an app from another country.