WICG / attribution-reporting-api

Attribution Reporting API
https://wicg.github.io/attribution-reporting-api/

Specification is inconsistent with regard to handling of non-monotonic time #1405

Open apasel422 opened 2 months ago

apasel422 commented 2 months ago

At a high level, the specification uses the local time for two purposes:

  1. To order events relative to each other. For example, when prioritizing two sources during attribution that have the same priority value, the source time is used as a tiebreaker.
  2. To schedule events to happen at a specific local time in the future. For example, event-level reports are sent at the end of a particular report window, expressed as an offset from the source time.

However, the local time, as the specification uses it, is not necessarily monotonically increasing. From the High Resolution Time specification:

"When using moments from the wall clock, be sure that your design accounts for situations when the user adjusts their clock either forward or backward."

Given that, consider the following scenario:

  1. A source S1 is stored at time 20 with priority 200.
  2. The user adjusts their clock back to time 10.
  3. A source S2 is stored at time 15 with priority 150.
  4. A trigger T1 is stored at time 16.

This has two strange implications:

  1. The source times of S1 and S2 are not consistent with the logical order in which they were applied.
  2. Assuming both sources match T1, the higher-priority S1 is the one attributed, so the trigger time of T1 (16) is earlier than the source time (20) of the source it was attributed to.
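To make the scenario concrete, here is a minimal Python sketch. It is not the specification's algorithm: `WallClock`, `Source`, and `pick_source` are hypothetical names, and it assumes that both sources match T1, that the highest priority wins, and that the later source time breaks ties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    name: str
    priority: int
    source_time: int  # wall-clock reading at storage, not a logical order


class WallClock:
    """Hypothetical user-adjustable clock; not part of the specification."""

    def __init__(self, now: int) -> None:
        self.now = now


def pick_source(sources):
    # Assumption: highest priority wins; the later source time breaks ties.
    return max(sources, key=lambda s: (s.priority, s.source_time))


clock = WallClock(now=20)
stored = [Source("S1", priority=200, source_time=clock.now)]

clock.now = 10  # the user sets the clock back
clock.now = 15  # some time passes
stored.append(Source("S2", priority=150, source_time=clock.now))

clock.now = 16
trigger_time = clock.now
attributed = pick_source(stored)

# Order in which the sources were actually stored vs. order implied by time.
storage_order = [s.name for s in stored]
time_order = [s.name for s in sorted(stored, key=lambda s: s.source_time)]

print(storage_order, time_order)
print(attributed.name, attributed.source_time, trigger_time)
```

Here the storage order is S1 then S2, the order implied by source time is S2 then S1, and the attributed source S1 carries a source time later than the trigger time.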

Source and trigger processing have many side effects: source deactivation/deletion, report generation, report replacement, rate-limiting increments, etc. Therefore, in my opinion, it doesn't really make sense to allow sources and triggers to time-travel the way they can today: there's no real way to undo (partially or completely) the results of a previous registration R1 given a new registration R2 whose time indicates that it should have happened before R1. And should it even be possible for a trigger to be attributed to a source that apparently happened after it?

There's a tradeoff between respecting the user's intentions regarding deliberately adjusted clocks (which may or may not have been done with Attribution Reporting in mind) and specifying and implementing consistent and understandable behavior. If we were really respecting the user's intentions, then perhaps S1 in the above scenario shouldn't even be "visible" to T1, but this quickly becomes difficult to deal with.

In the specification today, this time-traveling possibility is for the most part ignored, leading to a mixture of intentional and unintentional "undefined behavior" in some situations.

Where possible, it would be simpler to avoid relying on time at all for relative ordering and instead introduce guaranteed monotonically increasing IDs for sources and triggers. The Chromium implementation already does this in some (but not all) circumstances to address the time-traveling problem. These IDs would reflect the actual order in which events occurred, regardless of the local notion of time.
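As a rough illustration of the ID-based approach, the sketch below assigns each stored source an ID from a counter and compares the two tiebreakers. It is an assumption-laden sketch, not Chromium's or the specification's code: `SourceStore`, `pick_by_time`, and `pick_by_id` are hypothetical names, and "later wins ties" is assumed in both variants.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    source_id: int    # assigned in storage order, never reused
    priority: int
    source_time: int  # wall-clock reading, may go backward


class SourceStore:
    """Hypothetical store that hands out monotonically increasing IDs."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self.sources = []

    def store(self, priority: int, wall_clock_now: int) -> Source:
        source = Source(next(self._next_id), priority, wall_clock_now)
        self.sources.append(source)
        return source


def pick_by_time(sources):
    # Tiebreak on wall-clock source time (sensitive to clock adjustments).
    return max(sources, key=lambda s: (s.priority, s.source_time))


def pick_by_id(sources):
    # Tiebreak on storage order (unaffected by clock adjustments).
    return max(sources, key=lambda s: (s.priority, s.source_id))


store = SourceStore()
first = store.store(priority=100, wall_clock_now=20)
# The user sets the clock back before the second registration.
second = store.store(priority=100, wall_clock_now=15)

print(pick_by_time(store.sources).source_id)
print(pick_by_id(store.sources).source_id)
```

With equal priorities, the time-based tiebreaker selects the source that was stored first (because its recorded time is later), while the ID-based tiebreaker selects the source that was actually stored most recently.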

Where this becomes tricky is in how it interacts with scheduling (rate-limit expiry, report transmission, source expiry), which requires some notion of time in order to work at all. For example, trigger-processing entails checking rate limits: Would it be better to have the prioritization tiebreaker use increasing IDs even if the rate-limit checks still use time? Or would that be even more confusing than the current situation?
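The possible mismatch can be sketched as follows. This is only an illustration under assumptions of my own: `RateLimitRecord`, `count_in_window`, and `count_preceding` are hypothetical, and the window is assumed to cover times in `(now - window, now]`, which is not necessarily how the specification defines it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateLimitRecord:
    record_id: int  # monotonically increasing, reflects actual order
    time: int       # wall-clock reading when the record was created


def count_in_window(records, now: int, window: int) -> int:
    # Time-based check: counts records with time in (now - window, now].
    return sum(1 for r in records if now - window < r.time <= now)


def count_preceding(records, event_id: int) -> int:
    # ID-based view: counts records created before the given event.
    return sum(1 for r in records if r.record_id < event_id)


# Record 1 is created at time 20, the clock is set back,
# and record 2 is created at time 15.
records = [RateLimitRecord(1, 20), RateLimitRecord(2, 15)]

# A trigger with ID 3 is processed at wall-clock time 16.
by_time = count_in_window(records, now=16, window=10)
by_id = count_preceding(records, event_id=3)

print(by_time, by_id)
```

In this setup the time-based check sees one record, because the record stamped 20 appears to lie in the future, while the ID-based view sees two records preceding the trigger. A design that mixes the two would have to say which answer governs.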