csharrison / aggregate-reporting-api

Aggregate Reporting API

Rapid Aggregation Proposal #13

Open appascoe opened 4 years ago

appascoe commented 4 years ago

Use Case

Accurate campaign pacing is a common concern with the Aggregate Reporting API. As we stated in the W3C web-adv working group, we've observed that a 24-hour delay in feedback can result in a ~10-11% decline in spend on publishers in our experimental group, and that is with overspend safety mechanisms in place, which do not appear readily implementable under the Aggregate Reporting API proposal.

Perhaps the 24-hour delay can be adjusted to help with pacing, but this is a complementary proposal that may also help.

Proposal

The fundamental issue is maintaining some level of differential privacy before reporting. For more sophisticated reporting, we of course want to be able to do things such as:

const entryHandle = window.writeOnlyReport.get('campaign-123');
entryHandle.set('country', 'usa');
entryHandle.append('visits', '1');
...
// any number of other dimensions of interest

But most of these data are not particularly relevant for pacing considerations. I propose a more limited API call:

const entryHandle = window.writeOnlyRapidReport.get('campaign-123');
// only cost/spend entries permitted

From here, the report could be subjected to a much, much shorter delay, as differential privacy is likely to be achieved that much sooner. Additionally, because the data aggregates more quickly, we could use less noise/quantization in the reporting to help pace more accurately (or, conversely, if we wanted to speed up reporting further, the noise could be increased). It may be worth considering an explicit API function for incrementing cost values, to enforce that it is the only mechanism for writing into this rapid report.
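As a rough illustration of the explicit increment function floated above, here is a minimal sketch. Everything in it is hypothetical: `RapidReport`, `increment`, and the choice of integer micros as the cost unit are assumptions made for the example, not part of any proposal or browser API. The point is only that the handle exposes a single write path, so no other dimensions can be recorded.

```javascript
// Hypothetical sketch: neither RapidReport nor increment() exists in any browser.
class RapidReport {
  constructor() {
    // Campaign key -> accumulated cost, in integer micros (assumed unit).
    this.totals = new Map();
  }

  get(key) {
    const totals = this.totals;
    // The handle is frozen and exposes only increment(): there is no
    // set()/append(), so cost is the only thing that can be written.
    return Object.freeze({
      increment(costMicros) {
        if (!Number.isInteger(costMicros) || costMicros < 0) {
          throw new TypeError('cost must be a non-negative integer');
        }
        totals.set(key, (totals.get(key) ?? 0) + costMicros);
      },
    });
  }
}

const rapid = new RapidReport();
const handle = rapid.get('campaign-123');
handle.increment(2500);
handle.increment(1500);
```

A real implementation would keep the totals out of reach of page script and apply noise and delay before releasing anything; the sketch leaves those parts out.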

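Essentially, I'm arguing that there are two distinct purposes for reporting: rapid, cost-only feedback for pacing, and more sophisticated reporting across other dimensions of interest.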

Given these distinct use cases, I think it makes sense to have separate API calls that reflect them.

csharrison commented 4 years ago

Hey Andrew, thanks for the feedback. I definitely think a lot of this proposal is stale in light of the recent work on the aggregation service design: https://github.com/WICG/conversion-measurement-api/blob/master/SERVICE.md

You could definitely imagine a new API integrating with the aggregation service that has shorter delays. I don't necessarily think reducing the information in the report is even necessary, since it should be on the caller to create groups large enough that the output is not drowned out by noise.
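To make the group-size point concrete, here is a small sketch of how relative noise shrinks as the true total of a group grows. It assumes Laplace noise with scale `sensitivity / epsilon`, which is a common differential-privacy mechanism but is an assumption here, not something this thread specifies. The function names and the numbers in the usage lines are illustrative only.

```javascript
// Assumption: the service adds Laplace noise with scale b = sensitivity / epsilon.
// A Laplace distribution with scale b has standard deviation b * sqrt(2).
function laplaceStdDev(sensitivity, epsilon) {
  return (Math.SQRT2 * sensitivity) / epsilon;
}

// Noise standard deviation as a fraction of the group's true total.
function relativeNoise(trueTotal, sensitivity, epsilon) {
  return laplaceStdDev(sensitivity, epsilon) / trueTotal;
}

// Same noise parameters, two group sizes (illustrative numbers).
const smallGroup = relativeNoise(1000, 100, 1);
const largeGroup = relativeNoise(100000, 100, 1);
console.log(smallGroup.toFixed(4), largeGroup.toFixed(4));
```

The absolute noise is the same in both cases, so the group that is 100 times larger sees 100 times less relative error. That is the sense in which the caller can control usefulness through grouping rather than by stripping information from the report.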

I agree that there are potentially some classes of reports that prefer to get data sooner for pacing reasons.

dmarti commented 3 years ago

Another possible use case for this proposal is QA. A variety of things could break in production, some of which could be hard to detect in advance.

Site administrators need to be able to check their reported metrics with minimal delays to be able to roll back broken versions or to detect problems and deploy workarounds.