csharrison / aggregate-reporting-api

Aggregate Reporting API

Temporal information in the reports #5

Open lalbrizzo opened 4 years ago

lalbrizzo commented 4 years ago

Hi @csharrison, and thanks for this proposal. I was wondering whether, in its current state, the API will transmit any aggregated information about the "temporal aspect" of users' browsing history (e.g. whether a user visited a certain website more often than average in the last month).
This is particularly important for certain ML models that need information about browsing history, obtained today via third-party cookies, to dynamically predict bids for each user. Thanks in advance, Luca

csharrison commented 4 years ago

(Edit: the following reply is mistaken, I was referring to temporal information in https://github.com/csharrison/conversion-measurement-api)

Hey Luca, the event-level API does transmit some temporal information in a coarse-grained way. For instance, you will learn:

I would need to know more about which details of browsing history you need, since this API doesn't transmit much extra information about browsing history at all.

lalbrizzo commented 4 years ago

Hey Charlie, thanks for the information.

Some ML models (or, more generally, some bidding strategies) treat a conversion as a time series and try to find the optimal browsing path that leads to it. These models need information about which websites users visited in the past to decide whether and how to bid. Today this is achieved by tracking (almost) every visit for (almost) every user. As far as I understand, this would be impossible without third-party cookies, but it might be possible if the API sent information about previously visited websites in some aggregated fashion.

Are you planning to do any of this in this framework, or is this somehow against the general philosophy of the privacy sandbox?

Thanks

csharrison commented 4 years ago

I see, thanks for the clarification. Sorry, in my previous reply I thought you were referring to https://github.com/csharrison/conversion-measurement-api, not this repo. My mistake!

It is not against the general philosophy of the privacy sandbox to reveal aggregate information about a group of users, as long as each individual's data is protected. This could potentially be useful if you have a means of training an ML model with aggregate data, but this API won't allow you to explicitly join that aggregate data with e.g. user IDs. You might be able to get coarse information like "users in the US visit x,y,z domains more than users in Canada".
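
As a rough illustration of the kind of coarse, group-level output described above, here is a toy Python sketch. It is not this API's mechanism: the record format, the `aggregate_visits` function, the `min_users` threshold, and the optional noise are all assumptions made up for the example. It only shows the idea of reporting counts per (country, domain) group without any user IDs in the output.

```python
import random
from collections import defaultdict


def aggregate_visits(records, min_users=2, noise_scale=0.0, seed=None):
    """Count distinct users per (country, domain) group.

    Hypothetical helper, not part of any real API.
    records: iterable of (user_id, country, domain) tuples.
    Groups with fewer than `min_users` distinct users are dropped, and
    optional Laplace-distributed noise is added to each reported count.
    """
    rng = random.Random(seed)
    users_per_group = defaultdict(set)
    for user_id, country, domain in records:
        # A set ensures each user is counted at most once per group.
        users_per_group[(country, domain)].add(user_id)

    report = {}
    for group, user_ids in users_per_group.items():
        if len(user_ids) < min_users:
            continue  # suppress groups that are too small to report
        noise = 0.0
        if noise_scale > 0:
            # The difference of two exponential draws is Laplace-distributed.
            rate = 1.0 / noise_scale
            noise = rng.expovariate(rate) - rng.expovariate(rate)
        report[group] = len(user_ids) + noise
    return report


# Made-up example data: u1 visits x.example twice but is counted once.
records = [
    ("u1", "US", "x.example"),
    ("u2", "US", "x.example"),
    ("u3", "US", "x.example"),
    ("u1", "US", "x.example"),
    ("u4", "CA", "x.example"),
    ("u5", "CA", "x.example"),
    ("u6", "CA", "y.example"),
]
report = aggregate_visits(records)
for (country, domain), count in sorted(report.items()):
    print(country, domain, count)
```

The output contains only group keys and counts, so it cannot be joined back to individual users. The single-user group is suppressed entirely.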

It sounds like you are interested in serving targeted ads. Have you looked at the TURTLEDOVE / FLoC proposals?

lalbrizzo commented 4 years ago

I have definitely taken a look at TURTLEDOVE and FLoC, and from there I landed on this repo :) I also think that some ML models might have to change to use aggregated data, but I am happy to hear that coarse-grained information about browsing history might be available. Thanks for the clarification!