Open alexeiverbny opened 5 months ago
We discussed a bunch of options in #431. The way that seems the most immune to any kind of corruption is to include a key called "time"
in every Interest Group, and have your KV server return the current time in your trusted bidding signals. You can also include a "time"
value in your perBuyerSignals, and you can compare the two to see if either one looks out of date.
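A minimal sketch of that comparison inside `generateBid()`, assuming the KV server returns epoch milliseconds under a `time` key and the contextual response puts a similar numeric `time` value in perBuyerSignals. The field names, `MAX_SKEW_MS`, `baseBid`, and the helper `signalsLookFresh` are illustrative assumptions, not part of the API.

```javascript
// Hypothetical threshold: how far apart the two timestamps may be.
const MAX_SKEW_MS = 10_000;

// Returns true only when both timestamps are numbers and close together.
function signalsLookFresh(trustedBiddingSignals, perBuyerSignals, maxSkewMs = MAX_SKEW_MS) {
  const kvTime = trustedBiddingSignals?.time;
  const contextualTime = perBuyerSignals?.time;
  // Number.isFinite does not coerce, so null, undefined and strings all fail here.
  if (!Number.isFinite(kvTime) || !Number.isFinite(contextualTime)) {
    return false;
  }
  return Math.abs(kvTime - contextualTime) <= maxSkewMs;
}

function generateBid(interestGroup, auctionSignals, perBuyerSignals,
                     trustedBiddingSignals, browserSignals) {
  if (!signalsLookFresh(trustedBiddingSignals, perBuyerSignals)) {
    // A non-positive bid keeps this interest group out of the auction.
    return { bid: 0 };
  }
  // Placeholder bidding logic; baseBid is an assumed contextual field.
  const ad = interestGroup.ads[0];
  return { bid: perBuyerSignals.baseBid, render: ad.renderURL };
}
```

This only shows that the two values disagree; it cannot tell which side is stale, and it treats a missing timestamp as a reason not to bid.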
Ah I will take a look at the issue and consider this approach. Thank you
Does Google plan on supporting a `time` key in their KV server once it is ready? Without the user needing to push `time` values?
Ah, interesting question! I don't recall any discussion of our KV server implementation handling special keys like that. Hey @peiwenhu has this come up before?
I believe it could be implemented using UDFs even if it's not built in.
Thanks. I am interested in what Peiwen has to say but it does seem like the UDF route would work for us.
Hi @alexeiverbny, one more question for you based on a conversation in a recent WICG call.
Is there any chance that what you're seeing is the result of a component auction configuration that is being set once and then used multiple times? I wonder if there's a risk of this happening when the publisher page refreshes an ad slot.
I found a reference to this in the original GAM testing plans, with a paragraph about "update their previously provided auction configuration", and also a hint at a mechanism in the GPT documentation, where it says "If this value is set to `null`, any existing configuration for the specified `configKey` will be deleted."
None of this is specifically browser stuff; this feels more in the territory of the interaction between GAM and Prebid, which I'm not very knowledgeable about. Maybe @patmmccann can offer us some insight on how refreshes are supposed to work.
In any case, Alex, I do think the timestamp stuff that we discussed will help you avoid bidding in this situation. But surely it would be preferable for everyone if the situation never came up in the first place!
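For reference, clearing a slot's stored component auction config before a refresh might look roughly like the sketch below. It assumes the GPT `slot.setConfig({componentAuction: [...]})` form that the quoted documentation describes. The helper name `refreshWithFreshConfig` and its `buildAuctionConfig` callback are made up for illustration, and how Prebid actually wires this up is not shown here.

```javascript
// Hypothetical helper: drop any previously stored config for this seller,
// set a newly built one if available, then ask GPT to refresh the slot.
function refreshWithFreshConfig(googletag, slot, configKey, buildAuctionConfig) {
  // Per the GPT docs, a null auctionConfig deletes the existing config for configKey.
  slot.setConfig({ componentAuction: [{ configKey, auctionConfig: null }] });

  const auctionConfig = buildAuctionConfig();
  if (auctionConfig) {
    slot.setConfig({ componentAuction: [{ configKey, auctionConfig }] });
  }
  googletag.pubads().refresh([slot]);
}
```

The point of the explicit `null` step is that a slot whose new config fails to build ends up with no config at all, rather than silently reusing the old one.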
Hi @michaelkleber,
Are you hypothesizing that `auctionConfig` is not being set to `null` when a publisher page refreshes? I think that is a possible explanation. We dug into one particular example where we had stale perBuyerSignals and noticed that it was a site with frequent ad refreshes. `auctionConfig` not refreshing seems like a possible explanation.
The "timestamp from kv-server" will help us no bid in this situation, but yes, I agree that it would be preferable if this never came up. Both buyers and sellers would benefit from a refreshed config during ad refreshes.
Thanks, Alex
IIUC, this PR makes it so configs are not re-used by default: https://github.com/prebid/Prebid.js/pull/10930.
However, we do still see stale perBuyerSignals
From reading that PR, it does sound like the old behavior was for configs to get reused across ad slot refreshes and the new behavior was to not reuse. But I don't know how to watch such behavior changes rolling out — seems like you would need to know what version of prebid each publisher has on their page?
So maybe what you're seeing is just a temporary thing that will be solved once all sites are using a later version of prebid. I'm sorry I can't really help with this, other than making some guesses about what's going on.
https://github.com/prebid/Prebid.js/pull/10930 was included in Prebid.js release 8.37.0 and after. It will take time for publishers to update their sites.
Thanks, @laurb9 .
Hey sorry I'm a bit late to this discussion. Returning time or any special key has not been discussed before on our side.
There are 2 concerns:
I think the time is one thing that we could safely exclude from the cache key, if we opted to include it in requests.
That aside, if the TEE wanted the time, couldn't the (untrusted) server that wraps the TEE provide the time? Having a putative web standard provide the time from the client, just because a particular implementation of a TEE can't provide an accurate time seems a bit strange to me.
I am not familiar with how today's HTTP caching works. With the current plan, how will we be able to get an uncached response inside generate_bid() every time? This is important for us not just for time, but for all keys that we return from the kv store
I think the time is one thing that we could safely exclude from the cache key, if we opted to include it in requests.
So IIUC the cache would be keyed by all keys in the request except time (plus other dimensions)? And at request time the browser gets 1 cached entry for all other keys and makes 1 request for the time key to the KV and merges the cached entry and the fresh response into one response?
That aside, if the TEE wanted the time, couldn't the (untrusted) server that wraps the TEE provide the time? Having a putative web standard provide the time from the client, just because a particular implementation of a TEE can't provide an accurate time seems a bit strange to me.
What you have in mind here is what was discussed above: using user-defined functions to provide the time. Yes, that might work. I was just explaining that other alternatives would be hard. The trusted part of the server logic can't provide the time, because the time cannot be trusted, and the server (or our team) doesn't want to provide something untrusted as part of its APIs, since recipients may think it's trusted. In this case the recipient (generateBid) isn't inside the trust boundary, so it's not a big deal, but tomorrow the time may be used by something within the trust boundary, and at that point it could be confusing.
I am not familiar with how today's HTTP caching works. With the current plan, how will we be able to get an uncached response inside generate_bid() every time? This is important for us not just for time, but for all keys that we return from the kv store
I think today you need to set the cache control header to tell the browser TTL of your key value data. For TEE KV server it'd also require the server to set the TTL somehow in the response.
Thanks Peiwen. I am reading up a bit on cache control headers https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control. So would we be able to set `Cache-Control: no-cache` in the response header of the TEE KV?
Thanks Peiwen. I am reading up a bit on cache control headers https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control. So would we be able to set `Cache-Control: no-cache` in the response header of the TEE KV?
The cache control headers I think are what impacts the BYOS KV server, if that's what you're using today.
For TEE KV server, it'd not use this exact header but it'll provide a way.
Note that the KV server uses an extra layer of encryption for the request/response bodies, so it can't use HTTP caching semantics. BYOS uses query params and only uses HTTPS, so it can rely on the standard HTTP caching semantics.
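For a BYOS server, the response side of this could look roughly like the sketch below: a plain function that builds the headers and JSON body for a trusted bidding signals response, adding a server-side `time` key and a `Cache-Control` header. The function name `buildSignalsResponse`, the `store` map, and the choice of `no-cache` are assumptions for illustration. Whether the browser honors a given directive for these fetches should be checked against current Chrome behavior.

```javascript
// Hypothetical BYOS helper: builds a trusted bidding signals response.
// requestedKeys: key names from the request's query params.
// store: a Map of key -> value standing in for the real KV storage.
function buildSignalsResponse(requestedKeys, store, nowMs = Date.now()) {
  const keys = {};
  for (const key of requestedKeys) {
    if (key === "time") {
      // Special-cased here by assumption: the server clock in epoch milliseconds.
      keys.time = nowMs;
    } else if (store.has(key)) {
      keys[key] = store.get(key);
    }
  }
  return {
    headers: {
      "Content-Type": "application/json",
      "Ad-Auction-Allowed": "true",
      // Asks caches to revalidate before reusing this response.
      "Cache-Control": "no-cache",
    },
    body: JSON.stringify({ keys }),
  };
}
```

None of this applies to the TEE KV server, which encrypts request and response bodies and so needs its own TTL mechanism.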
Having the browser add the auction start timestamp with an acceptable resolution to `browserSignals` is more straightforward in my opinion. It sidesteps the Date shimming issue, works with no KV and exposes the same information as the KV reflecting client timestamps back to the client would, with more privacy control even.
There is already join recency in generateBid, in milliseconds, it seems, so "hiding the timestamp" is rather strange. Anyone can also store a timestamp inside the IG just before calling the join API, and then sum that join timestamp with the join recency.
Set browserSignals["[recency](https://wicg.github.io/turtledove/#dom-biddingbrowsersignals-recency)"] to the [current wall time](https://w3c.github.io/hr-time/#dfn-current-wall-time) minus ig’s [join time](https://wicg.github.io/turtledove/#interest-group-join-time), in milliseconds.
It is a very ugly hack, and perhaps the worst possible way to get the current time. Very bug-prone. But if someone is willing to do whatever it takes to violate privacy, they will probably do this. Everyone else should not need to debug a sum of timestamp and recency, especially since logging anything from generateBid is rather tricky. Having the KV store bounce the time back is actually even more complicated than this hack.
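Spelled out, the workaround would be something like the sketch below. It assumes the buyer wrote a `joinTimestampMs` field into `userBiddingSignals` just before calling `joinAdInterestGroup()`. That field name and the helper `approximateNowMs` are hypothetical, and the result inherits both the rounding of `recency` and any gap between writing the timestamp and the actual join.

```javascript
// Hypothetical: estimate "now" as stored join time plus browser-reported recency.
// Returns null when either input is missing or not a number.
function approximateNowMs(interestGroup, browserSignals) {
  const joinedAtMs = interestGroup?.userBiddingSignals?.joinTimestampMs;
  const recencyMs = browserSignals?.recency;
  if (!Number.isFinite(joinedAtMs) || !Number.isFinite(recencyMs)) {
    return null;
  }
  return joinedAtMs + recencyMs;
}
```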
It's not access to any timestamp that we need to block, but access to a continuous high precision timer, to protect against side channels based on CPU usage. It's a high precision continuously updating time that we're concerned about, not knowing the general time an auction occurs.
Recency is rounded to 100 milliseconds (though not sure how much that one matters, even with group-by-origin mode, since it's one fixed time, calculated well in advance of when the scripts are run).
This is exactly the point. Providing a timestamp in seconds in browser signals, maybe even with random 500 ms noise, would probably satisfy everyone "needing a reasonable timestamp" in generateBid. The same can be achieved using a dependency on KV, or using the sum with recency. But why make it so complicated for users?
Please read through the discussion in #431 if you haven't already. Observe that your suggestion "Providing timestamp in seconds in browser signals" would satisfy one of the three uses mentioned there, and would fail at the other two. This is why, when we discussed it last year, we left it in the hands of the party that was actually planning to use the information.
@michaelkleber Can you clarify the three uses you see in #431? Are you referring to the comment by @pehuen-rodriguez in which he discusses day of week?
The issue in "just passing the timestamp into the bidder" is that we would like to use the timestamp to check the staleness of values that are passed to the bidder. Ideally, we'd like to check the staleness of both the per-buyer signals and the trusted bidding signals against both each other and the current time in order to make replay attacks just a little bit more challenging.
I think timestamp in seconds actually satisfies all the cases in this discussion. Or at least I am not sure which one would fail. Day in user zone is maybe not easy to get, but storing offset from UTC in IG content, from ig join call time, is even easier than storing timestamp inside IG.
The goal is not so much to have super tricky bidding logic. A much bigger concern is contextual perBuyerSignals being stale for some reason (maybe a bug somewhere, including our own systems). Same with KV store responses, maybe because of caching. It is much safer to no-bid by default, e.g. if KV store data is 10 seconds old, than to later try to resolve billing issues due to accidental spend. There is almost no way to control spend as is, and "spend by default in a cached world" is a recipe for disaster, I believe.
The PA auction could include a browser-provided timestamp, e.g. `browserSignals` could include a field like `'roundedDateNow': 1712622180000`, current time (in milliseconds since the epoch) rounded to one minute.
1. This would let you detect staleness of your signals, though you would get false positives on machines with incorrect clocks. (That is much less common today than 10 or even 5 years ago.)
2. It would not make the functionality of the JavaScript `Date` object available; we don't have any way to do that and avoid the high-granularity timers that are a side-channel risk.
3. It would not directly support "date-based decisions on the bidding function", e.g. time-of-day or day-of-week restrictions on ad buying. Those kinds of useful time conversions are also part of the `Date` object. Ad techs who wanted that kind of functionality would need to pass in some information about the user's timezone, and do date math themselves.
If problem 1 turns out to be something that will be fixed by Prebid.js release 8.37.0 (see https://github.com/WICG/turtledove/issues/1106#issuecomment-2038621742) and problem 3 is something buyers are likely to need to solve anyway, then `roundedDateNow` does not seem like a very useful feature. But if problem 1 is a substantial enough issue even aside from old Prebid versions that it warrants a new signal, then we can indeed provide it.
I just don't want to over-promise something that will disappoint most of the people who try to use it.
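To make the trade-off concrete, here is a rough sketch of what a buyer could do with such a field. `roundedDateNow` is only the proposal above, not a shipped signal. `generatedAtMs` and `utcOffsetMinutes` are hypothetical values the buyer would have to pass in themselves, and round-to-nearest is just one guess at how the browser would round.

```javascript
const ONE_MINUTE_MS = 60_000;
const ONE_DAY_MS = 86_400_000;

// One guess at the rounding: nearest whole minute.
function roundToMinute(epochMs) {
  return Math.round(epochMs / ONE_MINUTE_MS) * ONE_MINUTE_MS;
}

// Staleness check. Because the browser value is rounded, a full minute
// of slack is added on top of the buyer's own maximum age.
function isStale(roundedDateNow, generatedAtMs, maxAgeMs) {
  return roundedDateNow - generatedAtMs > maxAgeMs + ONE_MINUTE_MS;
}

// Day of week (0 = Sunday) in the user's local time, without the Date object.
// utcOffsetMinutes is minutes east of UTC, e.g. -420 for UTC-7. Note that this
// is the opposite sign from Date.prototype.getTimezoneOffset().
function localDayOfWeek(epochMs, utcOffsetMinutes) {
  const localMs = epochMs + utcOffsetMinutes * ONE_MINUTE_MS;
  const daysSinceEpoch = Math.floor(localMs / ONE_DAY_MS);
  // 1970-01-01 was a Thursday (4). The double modulo keeps the result non-negative.
  return (((daysSinceEpoch + 4) % 7) + 7) % 7;
}
```

With one-minute rounding, the staleness check can only catch signals that are more than a minute or so out of date, and the day-of-week math has to be carried by the buyer.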
I don't think problem 1 is even about the prebid issue. What if there is another bug somewhere (maybe in one of the exchanges, or in our own code, wherever) 5 months from now. Or if there is some bad actor actually trying to replay/manipulate some data for 1 minute. Our goal is to have some way to quickly reject suspicious traffic, before we actually spend any money. Even if this means false positives and some lost traffic.
The PA auction could include a browser-provided timestamp
I think this is the intention of the ask -- rather than a full-fledged date object.
The way that seems the most immune to any kind of corruption is to include a key called "time" in every Interest Group, and have your KV server return the current time in your trusted bidding signals
While this might be viable for buyers, sellers have no such key to add, so we're still very much dependent on the browser and/or API surface to provide some indication of "when" the on-device auction is taking place compared to when the `auctionConfig` was generated.
Hello,
We would like to have access to a current timestamp inside the bidding worklet. One possible approach would be to make the Date object available inside the worklet but maybe there are other ways that I am not aware of. The reason for this is that we need a way to verify that perBuyerSignals is not being cached by the browser. Current data suggests that as much as 20% of our impressions had cached perBuyerSignals at bid time. If we cannot verify that perBuyerSignals is fresh at bid time then we cannot have confidence that the information in perBuyerSignals is accurate for the current bidding opportunity. While I cannot say exactly how this will impact our bid values, it is highly likely that this will depress our bid prices, especially on higher quality publishers.
Thank you, Alex