Questions about the explainer on behalf of an ad server

ksikka commented 3 years ago

I'm trying to envision how an established ad server could utilize shared storage to implement frequency capping, a/b testing, remarketing, and rotating creatives in a sequence at the user-level. A few questions:

Is the limit of 5 URLs set in stone? For campaigns that select a creative based on a combination of frequency-capping, a/b testing, and remarketing, we can easily run into the 5 URL limit. One of these 5 URLs will be used as a default in case the user does not fit into any of the specified categories. We would like the maximum flexibility possible, while preserving the user’s privacy.
I found this part of the explainer hard to understand:

However, a leak of up to log(n) bits (where n is the size of the list) is possible when the fenced frame is clicked, as a navigation that embeds the selected URL may occur.

Can you describe the idea further?

Will we be getting temporary “Event-level reporting” that FLEDGE currently has (e.g. https://github.com/WICG/turtledove/blob/main/FLEDGE.md#5-event-level-reporting-for-now)?

ksikka commented 3 years ago

Bump

ksikka commented 2 years ago

Hi, should I expect an answer to these questions?

jkarlin commented 2 years ago

My apologies for the slow response.

Is the limit of 5 URLs set in stone? For campaigns that select a creative based on a combination of frequency-capping, a/b testing, and remarketing, we can easily run into the 5 URL limit. One of these 5 URLs will be used as a default in case the user does not fit into any of the specified categories. We would like the maximum flexibility possible, while preserving the user’s privacy.

Not set in stone. We'd like to better understand what a useful limit would look like. We need to balance that with the fact that the larger the number is, the more information gets leaked. So it's a utility/privacy trade-off. I think we'd probably wind up with something under 10.
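To put rough numbers on that trade-off, here is a minimal sketch. It is an illustration rather than anything from the explainer, and it assumes the log(n) bound quoted above is read as a base-2 logarithm. The function name `max_leak_bits` is hypothetical.

```python
import math


def max_leak_bits(num_urls: int) -> float:
    """Upper bound, in bits, on what one choice among num_urls can reveal.

    Assumption: the explainer's log(n) is interpreted as log base 2.
    """
    if num_urls < 1:
        raise ValueError("need at least one URL")
    return math.log2(num_urls)


# Compare the current limit (5) with the 8-URL example and a limit of 10.
for n in (5, 8, 10):
    print(f"{n} URLs -> up to {max_leak_bits(n):.2f} bits")
```

Under that reading, doubling the limit adds one bit to the bound per selection, which is why the allowed list size matters for privacy.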

However, a leak of up to log(n) bits (where n is the size of the list) is possible when the fenced frame is clicked, as a navigation that embeds the selected URL may occur. Can you describe the idea further?

Sure. So the goal of an attacker here is to exfiltrate some data from their shared storage. Let's say the attacker owns the script on the publisher page that creates the shared storage worklet, and it asks the worklet to choose between 8 URLs:

attacker.com/publisher_site/publisher_user_id/000
attacker.com/publisher_site/publisher_user_id/001
attacker.com/publisher_site/publisher_user_id/010
attacker.com/publisher_site/publisher_user_id/011
attacker.com/publisher_site/publisher_user_id/100
attacker.com/publisher_site/publisher_user_id/101
attacker.com/publisher_site/publisher_user_id/110
attacker.com/publisher_site/publisher_user_id/111

The worklet chooses the URL that matches the 3 bits that it wishes to exfiltrate. The chosen URL is now wrapped in what we call an opaque URL, which can't be read by the publisher page and can only be loaded in a fenced frame. The fenced frame loads the URL, and can't communicate the information it knows. But when clicked, the fenced frame can navigate. Let's say it navigates to: attacker.com/landing_page/publisher_site/publisher_user_id/110. Now attacker.com knows that the user publisher_user_id on publisher_site has value 110 for some cross-site value.
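The same flow can be simulated in plain Python. This is only a sketch of the information flow and does not call the real shared storage or fenced frame APIs. The helper names (`build_urls`, `worklet_select`, `landing_url`, `decode`) are hypothetical, and the worklet is modelled as a function that returns nothing but an index into the URL list.

```python
def build_urls(site: str, user_id: str, bits: int) -> list[str]:
    """Publisher-side script: one candidate URL per possible bit pattern."""
    return [f"attacker.com/{site}/{user_id}/{i:0{bits}b}" for i in range(2 ** bits)]


def worklet_select(secret: int, num_urls: int) -> int:
    """Stand-in for the worklet: sees the cross-site secret, returns an index."""
    return secret % num_urls


def landing_url(selected_url: str) -> str:
    """On click, the frame navigates to a URL that embeds the selected URL's path."""
    _, path = selected_url.split("/", 1)
    return f"attacker.com/landing_page/{path}"


def decode(landing: str) -> tuple[str, str, int]:
    """Attacker's server: recover the site, the user ID, and the leaked bits."""
    parts = landing.split("/")
    return parts[2], parts[3], int(parts[4], 2)


urls = build_urls("publisher_site", "publisher_user_id", 3)
chosen = urls[worklet_select(0b110, len(urls))]  # hidden from the publisher page
recovered = decode(landing_url(chosen))          # learned only after the click
print(recovered)
```

In this model the publisher page never sees `chosen`. The 3 bits are joined to the publisher's user ID only at the navigation step, which is where the leak described above happens.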

ksikka commented 2 years ago

Thanks Josh, when would we be able to test the shared storage api?

jkarlin commented 2 years ago

Unfortunately we don't have a solid timeline for when it will be available. When we do, it'll be posted on https://www.privacysandbox.com/timeline/

pythagoraskitty commented 1 year ago

The API has been in testing for some months now, as you will be aware if you have followed https://www.privacysandbox.com/timeline/.

Closing for now. Please re-open or open a new issue if you need further clarification.

WICG / shared-storage #13