WICG / turtledove

TURTLEDOVE
https://wicg.github.io/turtledove/

Performance of running FLEDGE auctions #385

Open barteklos opened 2 years ago

barteklos commented 2 years ago

Hi,

We would like to share our initial observations with regard to the performance of running FLEDGE auctions. This topic was previously addressed (including in issue #215), but mainly in the context of running JS in a bidding worklet environment. This time we would like to discuss potential latency bottlenecks in an end-to-end runAdAuction call.

We managed to run FLEDGE auctions in a production environment (that is, with real publishers, real advertisers, and our bidding infrastructure with our bidding logic).

In this scenario we run a Chromium browser with FLEDGE enabled, then visit an advertiser’s page (which adds us to 3 interest groups), and finally visit the publisher’s page which runs an auction for these IGs. To measure performance we take advantage of the trace event profiling tool (chrome://tracing).

Benchmark 1 (Intel Core i7-6820HQ 2.7 GHz, Linux, fast internet connection):

Screenshot from 2022-10-20 21-10-29

Benchmark 2 (Intel Core i7-4600U 2.1GHz, Windows, slow internet connection):

Screenshot from 2022-10-18 10-05-02

Our observations:

Bearing in mind that runAdAuction would be run for all participating buyers, and fetching ctx signals and ad rendering will take additional time, we are afraid that this level of latency would not be acceptable for an end user.

The latency of TBS requests could be reduced by replacing BYOS by the key-value TBS and/or by improvements proposed in #333 (TBS prefetching, caching etc.) but it is not clear what to do with the other latencies.

Are you aware of these issues? Do you have plans to address them in the future?

Best regards, Bartosz

maciejkowalczyk commented 2 years ago

It’s worth noting that this matches the results from our Fledge tests, where more than 50% of auctions running for actual users eligible for Origin Trials are taking over 1s.

We’ve been running Fledge auctions on real publishers’ pages (impressions bought via direct integration or classic RTB). Interest Groups were created in browsers visiting our partners’ pages through our production infrastructure.

Measurements done:

In order to factor out the network latency impact, we’ve aggregated only auctions where contextual requests took under 150ms. In that case, over 10% of auctions took more than 1s, which is still disturbing.

Sample size: 54406 auctions

Average values around the 90th percentile (about 544 auctions each, ordered by auctionTime; times in ms):

pct_rank  ctxTime     auctionTime  ctxTillTbs  tbsReq     tbsTillReports
85        100.431066  928.118015   470.099265  14.446691  666.961397
86        101.706434  968.390993   483.262868  14.334559  714.167279
87        101.636581  1015.791912  518.959559  15.363971  717.676471
88        100.488235  1064.030331  534.306985  15.924632  795.295956
89        101.479779  1122.432353  540.909926  15.457721  824.152574
90        102.788235  1186.440074  585.933824  14.454044  855.981618
91        101.228676  1261.586029  587.205882  15.321691  975.369485
92        101.420956  1357.628125  653.231618  13.808824  1006.093750
93        101.958640  1481.707169  676.626838  14.702206  1092.836397
94        105.037132  1633.292647  723.387868  15.604779  1268.215074

JensenPaul commented 2 years ago

Thank you @barteklos and @maciejkowalczyk for sharing this insightful information. I have also been investing a lot of time and thought into different ways to improve the latency of FLEDGE auctions. I think the results of your two benchmarks are interesting and provide further motivation for the idea I started discussing at the end of last week’s WICG FLEDGE call:

Today’s plan for FLEDGE auctions is to run the auction after the contextual request has completed. A large portion of FLEDGE’s auction latency (10-50% depending on device and network performance) is attributable to work that does not depend on auctionSignals, perBuyerSignals, or sellerSignals (all of which could come from the contextual response), namely:

  1. IPCing the runAdAuction() call from the renderer process to the browser process,
  2. Loading up all interest groups that can participate in the auction from the browser's database,
  3. Spawning processes and services to later execute the bidders’ and sellers’ worklets,
  4. Fetching the JavaScript that should be run for the bidders and sellers,
  5. Fetching the trusted bidding signals for the bidders, and
  6. Parsing, compiling and evaluating (evaluating the script overall, not invoking the bidding and scoring functions) the JavaScript for the bidders’ and sellers’ worklets.

The implication is that if auctionSignals, perBuyerSignals and sellerSignals come from the contextual response, and the rest of the auction configuration (e.g. interestGroupBuyers) does not depend on the contextual response, then the six items listed above can be executed concurrently with the contextual request.

I think we can modify the FLEDGE runAdAuction() API to allow auctionSignals, perBuyerSignals and sellerSignals to be Promises to provide these signals later (i.e. when the contextual response comes back). This would allow the browser to perform the six items listed above at the same time as the browser is waiting for the contextual response. This would potentially greatly speed up the FLEDGE auction, benefiting both buyers and sellers. This would require some changes for sellers, namely making sure they can provide the remaining parts of the auction configuration before receiving the contextual response. I think some sellers were on the last WICG FLEDGE call, but I don’t think @JoelPM was there, so I’m tagging him for feedback on the idea.
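As a rough sketch of what the proposal could look like from the seller's side: the static parts of the configuration are supplied immediately, and the three response-dependent signals are Promises derived from the pending contextual request. The helper name `buildAuctionConfig`, the shape of the contextual response, and the example signal values are assumptions made for illustration only.

```javascript
// Sketch only: the contextual-response shape and signal values are assumptions.
function buildAuctionConfig(staticConfig, contextualResponse) {
  return {
    // Known before the contextual request returns, so the browser can
    // start loading interest groups, worklet scripts, and so on.
    ...staticConfig,
    // Known only once the contextual response arrives.
    auctionSignals: contextualResponse.then((r) => r.auctionSignals),
    sellerSignals: contextualResponse.then((r) => r.sellerSignals),
    perBuyerSignals: contextualResponse.then((r) => r.perBuyerSignals),
  };
}

// Stand-in for the real contextual request (e.g. a fetch() to the seller).
const contextualResponse = Promise.resolve({
  auctionSignals: { slot: 'top-banner' },
  sellerSignals: { floor: 0.5 },
  perBuyerSignals: { 'https://fledge.buyer-a.com': { deal: 'abc' } },
});

const auctionConfig = buildAuctionConfig(
  {
    seller: 'https://www.example-ssp.com',
    interestGroupBuyers: ['https://fledge.buyer-a.com'],
  },
  contextualResponse
);
// In a browser that accepts Promises for these fields, the call would be:
//   navigator.runAdAuction(auctionConfig);
```

The point of the shape is that `runAdAuction()` can be called as soon as the static fields are known, rather than after the contextual response has been received.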

zhengweiwithoutthei commented 2 years ago

I think if we plan to change auctionSignals, perBuyerSignals and sellerSignals to be Promises, it makes sense to also include directFromSellerSignals.

morlovich commented 1 year ago

Would you expect perBuyerSignals to be a single promise or a map of promises? (WebIDL is being a little annoying here since it can't have a promise as part of a union...)
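For illustration, the two shapes are easy to convert between on the caller's side. The sketch below collapses a map of per-buyer promises into a single promise for the whole map; `toSinglePromise` and the example signal values are hypothetical names and data, not part of any API.

```javascript
// Collapse { buyerOrigin: Promise<signals> } into Promise<{ buyerOrigin: signals }>.
function toSinglePromise(mapOfPromises) {
  const entries = Object.entries(mapOfPromises);
  return Promise.all(entries.map(([, promise]) => promise)).then((values) =>
    Object.fromEntries(entries.map(([buyer], i) => [buyer, values[i]]))
  );
}

// Hypothetical per-buyer promises, one per buyer origin.
const perBuyerSignalsMap = {
  'https://fledge.buyer-a.com': Promise.resolve({ segment: 'a' }),
  'https://fledge.dsp-partner.com': Promise.resolve({ segment: 'b' }),
};

// A single promise that settles once every buyer's signals are in.
const perBuyerSignals = toSinglePromise(perBuyerSignalsMap);
```

One trade-off worth noting: with a single promise, no buyer's signals are available until the slowest entry has resolved, whereas a map of promises would let each buyer proceed independently.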

zhengweiwithoutthei commented 1 year ago

  1. A single promise will be easier. A map of promises should work too.
  2. perBuyerTimeouts needs to be a promise too.

JoelPM commented 1 year ago

Our current assumption is that sellers/SSPs will construct an AuctionConfig object on the server-side. DSPs signal their desire to participate in the FLEDGE auction by responding to the contextual ad request with a buyerSignals object. The SSPs then construct the AuctionConfig object based on which buyers returned signals and then return this entire object to the browser with the contextual request. This means that currently buyer qualification for an auction is done on the server side when the contextual request is executed.

As we discussed in the WICG call, the list of eligible buyers (for a given SSP) is fairly stable and could be provided out-of-band in some way. The risk with this approach is that buyers will be initialized who don't ultimately want to participate - i.e. they are eligible, the browser does the init work for the IGs, but they don't signal interest in participating through the contextual auction. In this case the perBuyerSignals promise would return null and I don't know if the buyer would still have their bidding functions called without any signals (seems wasteful) or not get called at all (still wasteful, but less so). Without knowing the eventual usage patterns the only safe assumption is that if you pick an optimization now it will probably need to get tweaked in the future.
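To make the scenario concrete, here is a minimal sketch in which the eligible buyer list is known up front and the perBuyerSignals promise later resolves with signals only for the buyers that answered the contextual request. Mapping non-responding buyers to null is one possible reading of the case described above, and what the browser would then do with such a buyer is exactly the open question. The helper `perBuyerSignalsFor` and the response shape are assumptions.

```javascript
// Sketch only: buyer origins and the contextual-response shape are assumptions.
function perBuyerSignalsFor(eligibleBuyers, respondedSignals) {
  // respondedSignals resolves to { [buyerOrigin]: signals } for the buyers
  // that answered the contextual request; every other buyer maps to null.
  return respondedSignals.then((responded) =>
    Object.fromEntries(
      eligibleBuyers.map((buyer) => [buyer, responded[buyer] ?? null])
    )
  );
}

// Buyer list provided out-of-band, before the contextual request completes.
const eligibleBuyers = [
  'https://fledge.buyer-a.com',
  'https://fledge.dsp-partner.com',
];

// Only buyer-a signalled interest through the contextual auction.
const pendingPerBuyerSignals = perBuyerSignalsFor(
  eligibleBuyers,
  Promise.resolve({ 'https://fledge.buyer-a.com': { deal: 'abc' } })
);
```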

For what it's worth, here is a sequence diagram that shows the interaction of GAM, Prebid.js, FLEDGE, and Sellers:

Prebid Fledge Sequence Diagram

michaelkleber commented 1 year ago

Thank you Joel. I think that the browser initializing buyers who end up not returning contextual signals is likely to be a worthwhile trade-off here. Of course the only extra work here would be for buyers who have actually stored IGs in this particular browser! And in the future we could consider prioritizing IGs by first setting up the ones that are most likely to bid based on historical data.

The upshot is, I do think it is quite worthwhile to figure out how a prebid-integrated SSP could provide its list of buyers earlier, ideally as part of either step "2. loads & configures Prebid.js" or step "6. invokes bid adapters".

morlovich commented 1 year ago

Thanks. Does this mean that if one has component auctions, that the information for filling in things in their nested AuctionConfig is going to come in independently, or is it likely that the top-level seller will have to be involved somehow?

Basically I am trying to figure out how much fine-grained tracking of when information comes in is worthwhile or if doing potentially simpler stuff would be just as good.

caraitto commented 1 year ago

@zhengweiwithoutthei FYI, DirectFromSellerSignals has some natural parallelism already due to using subresource bundles -- as soon as the <script type="webbundle"> tag is present in the DOM, the network fetch of the bundle file will start in the background, in parallel with the runAdAuction() call.

When the worklet goes to fetch subresources in the bundle file, that load will block until the bundle has loaded, completing immediately if the bundle has already loaded.

(Note that the bundle file doesn't need to have fully loaded, just the portion of the bundle that contains the subresource; so subresources declared earlier are loaded first).

Where promises might still be useful for DirectFromSellerSignals is if there's a desire to call runAdAuction() before the DirectFromSellerSignals prefix is known [0] -- the prefix would be provided later via promise resolution. Is this latter notion something you'd like to have supported, or is the natural parallelism of subresource bundles sufficient?

@morlovich for visibility

[0] Recall that varying the prefix can be used, for instance, to provide different signals for different ad-slots on the page.
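As a sketch of how a per-slot prefix maps to resource URLs inside the bundle, assuming the suffix scheme (`?sellerSignals`, `?auctionSignals`, `?perBuyerSignals=<encoded buyer origin>`) that appears in the webbundle example later in this thread. The helper name and the per-slot paths are hypothetical.

```javascript
// Suffix scheme assumed from the webbundle example in this thread.
function directFromSellerSignalsUrls(prefix, buyers) {
  return {
    sellerSignals: `${prefix}?sellerSignals`,
    auctionSignals: `${prefix}?auctionSignals`,
    perBuyerSignals: Object.fromEntries(
      buyers.map((buyer) => [
        buyer,
        `${prefix}?perBuyerSignals=${encodeURIComponent(buyer)}`,
      ])
    ),
  };
}

// Hypothetical per-slot prefixes; only the path differs between ad slots.
const slotPrefixes = {
  'top-banner': 'https://www.example-ssp.com/fledge/dfss/top-banner/signal',
  sidebar: 'https://www.example-ssp.com/fledge/dfss/sidebar/signal',
};

const topBannerUrls = directFromSellerSignalsUrls(
  slotPrefixes['top-banner'],
  ['https://fledge.buyer-a.com']
);
```

Each of these URLs would need to be declared as a resource in a bundle that the page has loaded, which is why the caller has to know the prefix when the bundle markup is written.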

morlovich commented 1 year ago

If people are able to build a working Chromium from source + a bunch of patches, the CL chain at https://chromium-review.googlesource.com/c/chromium/src/+/4114383 may be worth experimenting with. It's pretty undercooked (please see warnings in the description) but seems to largely work from basic hand-testing.

morlovich commented 1 year ago

To apply them to a fresh checkout, something like this should work:

git checkout -b promise-experiment
git cl patch 4103743
git cl patch 4111349
git cl patch 4114383

zhengweiwithoutthei commented 1 year ago

@morlovich Cool! We are happy to give it a try.

@caraitto You are right. It requires the caller to know in advance the path of the resource in the bundle that corresponds to an individual slot. That is not currently the case, but I think we can maybe come up with a solution. Ideally, we would want it to be a Promise as well.

morlovich commented 1 year ago

So I've stumbled on a way to make it easier for you to test it; from now on https://chromium-review.googlesource.com/c/chromium/src/+/4120733 should have the combination of all the CLs I am working on to provide this promise + parallelism functionality.

To apply them to a fresh checkout you should be able to just do:

git cl patch -b promise-experiment2 4120733

This should hopefully reduce the chance of anyone running into intermediate broken states and the like, as those things are very much in development.

Edit: should be "git cl patch", not "git patch"

zhengweiwithoutthei commented 1 year ago

What will be the type of those signals in the auctionConfig when reporting functions are called, e.g. reportResult(auctionConfig, browserSignals, directFromSellerSignals)? Will they be converted to their resolved values, or will they still be Promises?

morlovich commented 1 year ago

Resolved.

https://source.chromium.org/chromium/chromium/src/+/main:content/browser/interest_group/interest_group_browsertest.cc;drc=0e340e634f1a6051908f54175761485f28f985f0;l=7874 may be sort of readable as an example?
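To illustrate the semantics only (this is not the browser's implementation), the sketch below resolves the promise-capable fields of a config before handing it to a reporting function, so the function never sees a Promise. The field list follows the signals discussed in this thread; `withResolvedSignals` and the toy `reportResult` body are hypothetical.

```javascript
// Illustration of the semantics only, not the browser's implementation.
const PROMISE_CAPABLE_FIELDS = [
  'auctionSignals',
  'sellerSignals',
  'perBuyerSignals',
  'perBuyerTimeouts',
];

async function withResolvedSignals(auctionConfig) {
  const resolved = { ...auctionConfig };
  for (const field of PROMISE_CAPABLE_FIELDS) {
    if (field in auctionConfig) {
      // await passes plain (non-Promise) values through unchanged.
      resolved[field] = await auctionConfig[field];
    }
  }
  return resolved;
}

// Toy reporting function that records what it was given.
function reportResult(auctionConfig, browserSignals) {
  return {
    sawPromise: auctionConfig.sellerSignals instanceof Promise,
    sellerSignals: auctionConfig.sellerSignals,
  };
}

const reportInput = withResolvedSignals({
  seller: 'https://www.example-ssp.com',
  sellerSignals: Promise.resolve({ floor: 0.5 }),
});
```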

morlovich commented 1 year ago

Basic support for promises for auction_signals, seller_signals, per_buyer_signals, and per_buyer_timeouts should be available in canary starting from 111.0.5539.0

morlovich commented 1 year ago

I'm looking into implementing directFromSellerSignals, and I have a question for folks who might use it: do you expect it to be available at the same time as the other signals? (E.g., would you foresee a scenario where it's not a promise and other things are, or where it's a promise that's resolved at a substantially different time from the others?)

zhengweiwithoutthei commented 1 year ago

The answer depends on how directFromSellerSignals is used. Some approaches don't require it to be a promise; some do. For one of the main use cases we are pursuing, the directFromSellerSignals signal will be available and resolved at the same time as the other signals.

Looping back to @caraitto's comment. I agree that directFromSellerSignals does not necessarily need to be a promise if the prefix can be known at the time runAdAuction is called. In the case of parallelization, this happens before the contextual request is sent. Currently we do not have a way to know the prefix ahead of time.

dmdabbs commented 1 year ago

@morlovich you earlier asked whether the top-level seller will have to be involved somehow.

Assuming that the top-level seller is responsible for assembling the preconditions for its component sellers' auctions, then in addition to providing their config JSON, those sellers will also need to supply their bundle script markup so the top-level seller can get it into the DOM. Would it make sense to supply this via an optional component auction attribute, or would it be better handled out-of-band of the child configs?

If there are many resources declared, would bundle scopes be better, or does the browser prefer to have the explicit forward declarations?

If a seller/buyer fails the auction because critical bundle resource(s) fail to load, it would be nice to be able to track/count these via Extended PA Reporting.

<!-- 
  topWindowHostname: www.example-publisher.com
-->
<script type="webbundle">
    {
      "source": "https://www.example-ssp.com/fledge/dfss/?gen=signals&host=www.example-publisher.com&pubid=1234",

      "resources": [
        "https://www.example-ssp.com/fledge/dfss/signal?sellerSignals", 
        "https://www.example-ssp.com/fledge/dfss/signal?auctionSignals"
        /* possibly additional resources registered for per-slot variations, &c. */...
      ]
    }
</script>

<script type="webbundle">
    {
      "source": "https://www.example-ssp.com/fledge/dfss/?gen=buyer&host=www.example-publisher.com&pubid=1234",

      "resources": [
        "https://www.example-ssp.com/fledge/dfss/signal?perBuyerSignals=https%3A%2F%2Ffledge.buyer-a.com", 
        "https://www.example-ssp.com/fledge/dfss/signal?perBuyerSignals=https%3A%2F%2Ffledge.dsp-partner.com", 
        "https://www.example-ssp.com/fledge/dfss/signal?perBuyerSignals=https%3A%2F%2Ftd.anotherdsp.com"
        /* possibly additional resources registered for per-slot variations, &c. */...
        ]
    }
    /*
        If there are many resources declared, would scopes be better?

        Subresource Loading with Web Bundles: https://github.com/WICG/webpackage/blob/main/explainers/subresource-loading.md
    */
</script>

<script>
    const myComponentAuctionConfig = 
    {
      'seller': 'https://www.example-ssp.com',

      /* 
        an HTTPS URL prefix (without query string) using the seller's origin
        when combined with a browser-provided suffix, the resultant URL should be a resource in a subresource bundle that has been loaded by the current document
        (e.g. above)

        signals may come from separate bundle files, but each bundle must be served from the seller's origin

        See https://github.com/WICG/turtledove/blob/main/FLEDGE.md#25-additional-trusted-signals-directfromsellersignals
      */
      'directFromSellerSignals': 'https://www.example-ssp.com/fledge/dfss/signal',

      ...

      'componentAuctions': [
            {
                'seller': 'https://www.example-ssp.com',
                'directFromSellerSignals': 'https://www.example-ssp.com/fledge/dfss/signal',

                "interestGroupBuyers": [
                    'https://fledge.buyer-a.com',
                    'https://fledge.dsp-partner.com',
                    'https://td.anotherdsp.com'
                ],

                "perBuyerSignals": {
                    "https://fledge.buyer-a.com":{...},
                    "https://fledge.dsp-partner.com":{...},
                    "https://td.anotherdsp.com":{...}
                },

                'perBuyerTimeouts': {...},
                'perBuyerGroupLimits': {...},
                'perBuyerPrioritySignals': {...},
                'perBuyerExperimentGroupIds': {}, 
                ...
            },
            {
                'seller': 'https://www.someother-ssp.com',
                'directFromSellerSignals': 'https://www.someother-ssp.com/fledge/foo/',

                "interestGroupBuyers": [
                    'https://fledge.buyer-a.com',
                    'https://fledge.dsp-partner.com',
                    'https://foo-adtech.com'
                ],
                ...
            }
        ]
    };
    const auctionResultPromise = navigator.runAdAuction(myComponentAuctionConfig);
</script>

dmdabbs commented 1 year ago

directFromSellerSignals client fetches would be mooted by the server approach.

morlovich commented 1 year ago

I landed direct_from_seller_signals promise support yesterday, but it looks like it hasn't quite made canary yet.

https://github.com/WICG/turtledove/pull/453 is an explainer pull request for this stuff.