patcg-individual-drafts / ipa

Interoperable Private Attribution (IPA) - A Private Measurement Proposal

"Gathering Events" section is very confusing #46

Open MattMenke2 opened 1 year ago

MattMenke2 commented 1 year ago

The first sentence mentions "collecting information on important events". It's unclear what events these are (source events, trigger events, both, something else?) and where the information is being collected - is this a page's JS collecting source (trigger?) event information and passing it to the browser, the browser collecting this information (provided by sites) in a database, or this information being collected by a server after reports have been triggered? The sentence does begin with "IPA begins", but I don't know what the authors considered the "start" of IPA - is it the start of data actually being aggregated by a server, the start of one or both types of events, or what?

The section also mentions collecting information, collecting events, and collecting reports. It's not clear if these are one, two, or three distinct concepts.

bmcase commented 1 year ago

@MattMenke2, thanks for the feedback, we’ll try to make that section more clear. To your questions, IPA really begins before gathering reports when a Match Key Provider (MKP) sets a matchkey on the user agent. For gathering reports there are three steps:

  1. Getting the encrypted matchkey from the browser. When an event happens, the page can call the get_encrypted_matchkey() API and get back encrypted secret shares of the matchkey. This is done for both source and trigger events.
  2. Adding data to the report. Data known to the page about this event (timestamp, a bit indicating source/trigger, a conversion value for trigger reports, a breakdown key for source reports) needs to be added to the report. Most of this data also gets secret shared and encrypted under the public keys of the Helper Parties (timestamp remains in the clear as later we want a report collector to sort batches of reports by timestamp in the clear before submitting as a query). The computation for adding this data to a report could happen on the website's server or be implemented in JS like a pixel.
  3. Sharing reports with other report collectors. Once source and trigger sites have generated reports, they need to exchange them so each can use them in their own source/trigger fan-out queries.
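To make the three steps above more concrete, here is a rough Python sketch of the data flow. It is only an illustration under several assumptions, not the proposal's actual design: it assumes three Helper Parties, uses simple XOR secret sharing with 64-bit shares, and leaves out the encryption of each share under a Helper Party's public key entirely. `mock_get_encrypted_matchkey`, `build_report`, `merge_for_query`, and the `Report` fields are hypothetical names made up for this example; the real get_encrypted_matchkey() is a browser API whose signature is not defined here.

```python
import secrets
from dataclasses import dataclass
from typing import List

NUM_HELPERS = 3   # assumption: three Helper Parties
SHARE_BITS = 64   # assumption: illustrative share width


def xor_share(value: int) -> List[int]:
    """Split value into NUM_HELPERS shares whose XOR equals value."""
    shares = [secrets.randbits(SHARE_BITS) for _ in range(NUM_HELPERS - 1)]
    last = value
    for s in shares:
        last ^= s
    return shares + [last]


def xor_reconstruct(shares: List[int]) -> int:
    """Recombine shares (only possible with all of them)."""
    out = 0
    for s in shares:
        out ^= s
    return out


@dataclass
class Report:
    matchkey_shares: List[int]    # step 1: obtained from the browser
    timestamp: int                # stays in the clear so reports can be sorted
    is_trigger_shares: List[int]  # secret-shared source/trigger bit
    payload_shares: List[int]     # breakdown key (source) or value (trigger)


def mock_get_encrypted_matchkey(matchkey: int) -> List[int]:
    """Stand-in for step 1. A real browser would also encrypt each share
    to a Helper Party; that encryption is omitted in this sketch."""
    return xor_share(matchkey)


def build_report(matchkey_shares: List[int], timestamp: int,
                 is_trigger: bool, payload: int) -> Report:
    """Step 2: attach page-known data, secret sharing everything
    except the timestamp."""
    return Report(
        matchkey_shares=matchkey_shares,
        timestamp=timestamp,
        is_trigger_shares=xor_share(int(is_trigger)),
        payload_shares=xor_share(payload),
    )


def merge_for_query(own: List[Report], received: List[Report]) -> List[Report]:
    """Step 3: combine own reports with those received from other
    report collectors, sorted by the clear-text timestamp."""
    return sorted(own + received, key=lambda r: r.timestamp)
```

For example, a source site would call `build_report(mock_get_encrypted_matchkey(mk), ts, False, breakdown_key)`, a trigger site would do the same with `True` and a conversion value, and after exchanging reports each side would run `merge_for_query` before submitting its query.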

The one other path that can happen instead of 1) for generating an encrypted matchkey is when an MKP creates reports for events that happened on its own site. If an MKP knows who a user is, it could just create a report with the user's matchkey entirely server-side and not need to call get_encrypted_matchkey().
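As a rough illustration of that server-side path, the sketch below has an MKP look up a logged-in user's matchkey and secret share it directly, with no browser call involved. Everything named here is hypothetical (`MATCHKEYS_BY_USER`, `share_matchkey`, `mkp_server_side_report`, and the report layout), the XOR sharing across three helpers is an assumption, and encryption of the shares to the Helper Parties is omitted.

```python
import secrets
from typing import Dict, List

# Hypothetical MKP-side table mapping logged-in users to their matchkeys.
MATCHKEYS_BY_USER: Dict[str, int] = {"alice": 0x1234, "bob": 0x5678}


def share_matchkey(matchkey: int, n_helpers: int = 3, bits: int = 64) -> List[int]:
    """XOR secret sharing (assumed scheme): the XOR of all shares is the matchkey."""
    shares = [secrets.randbits(bits) for _ in range(n_helpers - 1)]
    last = matchkey
    for s in shares:
        last ^= s
    return shares + [last]


def mkp_server_side_report(user_id: str, timestamp: int, is_trigger: bool) -> dict:
    """Build a report entirely on the MKP's server for an event on its own site."""
    matchkey = MATCHKEYS_BY_USER[user_id]  # the MKP already knows the user
    return {
        "matchkey_shares": share_matchkey(matchkey),
        "timestamp": timestamp,  # stays in the clear
        # Per the steps described above, this bit would also be secret
        # shared; it is kept in the clear here only for brevity.
        "is_trigger": is_trigger,
    }
```

The point of the sketch is only that the matchkey never has to leave the MKP's server in the clear, so the browser API is not needed for the MKP's own events.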

Let me know if that still doesn't answer any of your questions.

bmcase commented 1 year ago

For more details on how get_encrypted_matchkey() might be implemented for browsers, see issue #25