Floors module enhancement: support hasUserId dimension

bretg commented 1 year ago

Type of issue

enhancement

Description

In the interest of improving publisher and floor-provider flexibility, we want to add the ability to set different floors for scenarios where there's a user ID available vs when there's not.

While it's understood that this is already possible through custom floors dimensions in Prebid.js, it's not possible in Prebid Server, and is worth community discussion.

hasUserId floor dimension

In addition to the existing dimensions (AdUnit, GPT Slot Name, MediaType, Ad Size, Domain, etc), we propose a new schema dimension that can be used by floors providers: hasUserId: true or false.

- If there are any Extended IDs (EIDs) available to this bidder, hasUserId would match the true value.
- If there are no EIDs available for this bidder, hasUserId would match the false value.

And of course floor enforcement for a bidder is done against the value of the floor that's sent to the bidder.

For example, say this is the floors config:

floors: {
   currency: 'USD',
   skipRate: 5,
   modelVersion: 'Sports Ad Unit Floors',
   schema: {
       fields: [ 'mediaType', 'hasUserId']
   },
   values: {
       'banner|true': 2.0,     // hold out for a higher floor if EIDs are available
       'banner|false': 1.5
   }
}

And this is the EIDs config:

pbjs.setConfig({
    userSync: {
        userIds: [{
            name: "MODULE NAME",
            bidders: ['bidderA'],             // only bidder A gets this EID
            params: {
                ...
            }
        }]
    }
});

In this example, bidderA's floor would be 2.0 assuming the EID is actually set, and the floor for bidderB would be 1.5.
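
To make the per-bidder outcome concrete, here is a minimal sketch of the proposed semantics. It is not Prebid.js code: `floorForBidder` and the `eidsByBidder` map are hypothetical names introduced only for illustration, and the EID objects assume the OpenRTB `source`/`uids` shape.

```javascript
// Floors data from the example above, reduced to the parts the lookup needs.
const floors = {
  currency: 'USD',
  schema: { fields: ['mediaType', 'hasUserId'], delimiter: '|' },
  values: { 'banner|true': 2.0, 'banner|false': 1.5 }
};

// Hypothetical helper (not part of Prebid.js): resolve the floor a bidder would see.
// eidsByBidder is an assumed shape: bidder code -> array of EIDs visible to that bidder.
function floorForBidder(floors, mediaType, bidder, eidsByBidder) {
  const eids = eidsByBidder[bidder] || [];
  const hasUserId = eids.length > 0; // true if any EID is available to this bidder
  const key = [mediaType, String(hasUserId)].join(floors.schema.delimiter);
  return floors.values[key];
}

const eidsByBidder = {
  // Only bidderA receives the EID, as in the userSync config above.
  bidderA: [{ source: 'example.com', uids: [{ id: 'abc' }] }]
};

console.log(floorForBidder(floors, 'banner', 'bidderA', eidsByBidder)); // 2
console.log(floorForBidder(floors, 'banner', 'bidderB', eidsByBidder)); // 1.5
```

If the EID were not actually set for bidderA, its list would be empty and both bidders would resolve to the 'banner|false' rule.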

Prebid Floor Rule

We ought to discuss the rules for Prebid Floors. Originally, there was a simple rule:

all bidders get the same effective floor.
It might be modified by bid adjustments, but it's still effectively the same across bidders.

Then https://github.com/prebid/Prebid.js/issues/10410 came along and changed the rule to:

all bidders either get the same effective floor or they get no floor at all

And now this new dimension prompts a generalization of the essence of the rule:

Prebid won't support floor attributes that single out bidders by name, except for the case of not passing them a floor at all.

Prebid should support floor attributes that separate out bidders by auction characteristics.

Commentary

The spirit of the original floors rule was that we don't want publishers accepting only high-value bids from a given bidder for any reason. Prebid should be a level playing field where bidder name is not a factor in setting a floor.

However, publishers ought to be able to tune their floors based on bidder behavior and information given to them.

In the example above, it's not bidderA's name that's making them jump a higher bar; it's that they have access to information that other bidders don't have, so they are expected to be able to bid more. In scenarios where an EID is not available, the floor for bidderA and bidderB would be the same.

patmmccann commented 1 year ago

"If there are any Extended IDs (EIDs) available to this bidder, hasUserId would match the true value. If there are no EIDs available for this bidder, hasUserId would match the false value."

This seems not well chosen; for many publishers there will always be a sharedid.

bretg commented 1 year ago

Your counter-proposal would be to say "hasUserId" means any EID beyond sharedId? Or go super-surgical and make it highly configurable by the floors provider?

e.g. includesUserId: ["idA", "idB", "idC"]

That seems a little too much?

patmmccann commented 1 year ago

If the goal is to identify whether the user is addressable, acceptance of third-party cookies or the presence of a deterministic identifier are the best choices. Many ID modules always return an ID, which entirely breaks the logic in your proposal.

patmmccann commented 8 months ago

It seems the two ID-related factors driving CPM the most are acceptance of third-party cookies and the presence of certain email-based EIDs (e.g. rampid, uid2). I think we still need a proposal following these cuts. I think 'e.g. includesUserId: ["idA", "idB", "idC"]' makes a lot of sense. If I understand Bret's proposal, the outcome would be binary: rather than having different floors for all combinations, one would split floors on 'hasValuableID' or not, right?

robertrmartinez commented 6 months ago

Hello all!

Allow me to chime in with some proposed solutions.

There are several ways to do this; here are a few examples:

Option 1: New attribute called includesUserIds
- The floor rule gives a comma-separated list of user IDs whose presence it wants to check:

        {
            "modelWeight": 50,
            "modelVersion": "List of userIds present",
            "schema": {
                "fields": [
                    "mediaType",
                    "includesUserIds"
                ],
                "delimiter": "|"
            },
            "values": {
                "banner|liveintent,pairId,sharedId": 0.83,
                "banner|liveintent": 0.68,
                "banner|sharedId": 0.3,
                "banner|*": 0.1
            }
        }

Option 2: Floors supports a generic userId.$$$ attribute, where $$$ can be any user ID whose presence or absence we want to floor on:

        {
            "modelWeight": 50,
            "modelVersion": "Each User ID own Field",
            "schema": {
                "fields": [
                    "mediaType",
                    "userId.liveintent",
                    "userId.pairId",
                    "userId.sharedId"
                ],
                "delimiter": "|"
            },
            "values": {
                "banner|1|1|1": 0.43,
                "banner|1|1|0": 0.43,
                "banner|1|0|1": 0.43,
                "banner|0|1|1": 0.43,
                "banner|1|*|*": 0.43
            }
        }

Option 3: The floor provider declares a list of valuableIds and floors based on a new attribute, hasValuableId:

        {
            "modelWeight": 50,
            "modelVersion": "Each User ID own Field",
            "valuableIds": [
              "liveintent",
              "pairId"
            ],
            "schema": {
                "fields": [
                    "mediaType",
                    "hasValuableId"
                ],
                "delimiter": "|"
            },
            "values": {
                "banner|1": 0.43,
                "banner|0": 0.43,
                "banner|*": 0.43
            }
        }

FWIW, I think options 1 and 2 are the more flexible and better approaches (with option 2 being my favorite, since it needs fewer total characters in a large rule file).

But number one may be the most explicit and easiest to integrate with.

NOTE: This is a proposal to add this to the default "supported" fields in the priceFloors module. All of these implementations can already be accomplished using the additionalSchemaFields hook (you can floor on anything with this hook).
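
As a rough illustration of the hook route, the sketch below registers a custom hasValuableId field through additionalSchemaFields. Several things here are assumptions rather than anything agreed in this thread: the `VALUABLE_SOURCES` list is illustrative, the function reads EIDs from `bidRequest.userIdAsEids`, and it returns '1'/'0' strings to match the rule keys in option 3.

```javascript
// Assumption: EID sources the floor provider considers valuable (illustrative values only).
const VALUABLE_SOURCES = ['liveintent.com', 'liveramp.com'];

// Custom field function: '1' if the bid request carries at least one EID
// from a valuable source, otherwise '0'.
// Assumption: the bid request exposes OpenRTB-style EIDs on `userIdAsEids`.
function hasValuableId(bidRequest) {
  const eids = (bidRequest && bidRequest.userIdAsEids) || [];
  return eids.some(eid => VALUABLE_SOURCES.includes(eid.source)) ? '1' : '0';
}

// Registration is guarded so the snippet also runs where Prebid.js is not loaded.
if (typeof pbjs !== 'undefined' && typeof pbjs.setConfig === 'function') {
  pbjs.setConfig({
    floors: {
      // Maps the schema field name to the function that computes its value.
      additionalSchemaFields: { hasValuableId }
    }
  });
}
```

A floors schema could then list "hasValuableId" in its fields, with rule keys such as "banner|1" and "banner|0".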

Curious to hear people's feedback @patmmccann @bretg

bretg commented 6 months ago

Thanks for trying to move this forward @robertrmartinez. One issue is how to identify the different user IDs. There's no standard, but there ought to be one so that a floors provider can build floors files for both PBJS and Prebid Server without having to use different names for the same IDs.

The table at https://docs.prebid.org/dev-docs/modules/userId.html#prebidjs-adapters shows two possible ways to identify a user ID:

- Prebid.js Attr bidRequest.userId: this is not available on the server side (e.g. adtelligentId, lipb.lipbid).
- EID Source: this is available server-side (e.g. adtelligent.com, liveintent.com).

So either the client-side floors module should refer to the "EID source" to name the user IDs or the server-side floors module is going to have to map from EID source to the arbitrary mess that is bidRequest.userId. FWIW, Magnite already had to implement this mapping for analytics purposes, so it's mostly possible, though annoying and will require maintenance.
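
The second alternative (mapping from EID source to the bidRequest.userId name) could look roughly like the lookup below. This is only a sketch: the two entries are the examples quoted above, `toUserIdAttrs` is a hypothetical helper, and a real table would need a row per ID module plus ongoing maintenance.

```javascript
// Illustrative entries only, taken from the examples in this comment.
const EID_SOURCE_TO_USER_ID_ATTR = {
  'adtelligent.com': 'adtelligentId',
  'liveintent.com': 'lipb.lipbid'
};

// Hypothetical helper: translate OpenRTB-style EIDs into Prebid.js userId attribute names.
// Sources missing from the table are passed through unchanged so they stay visible.
function toUserIdAttrs(eids) {
  return eids.map(eid => EID_SOURCE_TO_USER_ID_ATTR[eid.source] || eid.source);
}

console.log(toUserIdAttrs([
  { source: 'liveintent.com' },
  { source: 'unknown.example' }
])); // [ 'lipb.lipbid', 'unknown.example' ]
```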

Are there runtime implications for any of these choices? My understanding of the runtime lookup makes me lean towards option 2.
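
For anyone weighing the runtime question, here is a simplified sketch of a wildcard rule lookup. It is not the priceFloors module's actual code, and `findFloor` is a hypothetical name. It assumes the lookup tries every combination of "actual value or *" per field and prefers keys with the fewest wildcards.

```javascript
// Simplified sketch of a wildcard-aware rule lookup (assumption: not Prebid's real algorithm).
// values: rule key -> floor; fieldValues: the resolved value for each schema field, in order.
function findFloor(values, fieldValues, delimiter = '|') {
  const n = fieldValues.length;
  const candidates = [];
  // Each bit of the mask decides whether that field is replaced by '*'.
  for (let mask = 0; mask < (1 << n); mask++) {
    const parts = fieldValues.map((v, i) => ((mask >> i) & 1 ? '*' : v));
    const wildcards = parts.filter(p => p === '*').length;
    candidates.push({ key: parts.join(delimiter), wildcards });
  }
  // Most specific first: fewest wildcards wins (sort is stable, so ties keep mask order).
  candidates.sort((a, b) => a.wildcards - b.wildcards);
  const hit = candidates.find(c => Object.prototype.hasOwnProperty.call(values, c.key));
  return hit ? values[hit.key] : undefined;
}

const rules = { 'banner|1|1|1': 0.9, 'banner|1|*|*': 0.5, '*|*|*|*': 0.1 };
console.log(findFloor(rules, ['banner', '1', '1', '1'])); // 0.9
console.log(findFloor(rules, ['banner', '1', '0', '0'])); // 0.5
```

In this sketch every extra schema field doubles the number of candidate keys, so the options differ mainly in how many fields they add to the schema.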

robertrmartinez commented 6 months ago

Ah yes, I forgot to add a bit on this. Good call.

I think the right thing is to use EID Source (Even though I have already made the mistake of using Option 1)

It is just much more consistent, even if less readable at times where company names change all the time!

If needed, floor and analytic providers can / should do the mapping on their end when showing to users I guess. (We prob will need to do so...)

patmmccann commented 6 months ago

We decided to do at least option 2 in https://github.com/prebid/Prebid.js/issues/10617#issuecomment-2061614167, and also option 3, conditional on the implementer wanting to take that on.

dgirardi commented 6 months ago

A generalization of (3) could look like:

        {
            "modelWeight": 50,
            "modelVersion": "ID buckets",
            "userIds": {
                "tierOne": [
                    "some.excellent.id",
                    "another.good.one"
                ],
                "tierTwo": [
                    "less.valuable.id"
                ]
            },
            "schema": {
                "fields": [
                    "mediaType",
                    "userId.tierOne",
                    "userId.tierTwo"
                ],
                "delimiter": "|"
            },
            "values": {
                "banner|0|0": 0.1,  // none from either tierOne or tierTwo
                "banner|0|1": 0.2,  // none from tierOne, one from tierTwo
                "banner|2|*": 0.4   // two from tierOne, any number from tierTwo
            }
        }

If the criterion is more likely to be "if any of these IDs" (rather than "if at least N of these IDs"), the calculated value could be kept a binary 1/0 for any/none. If both are valuable (and we want to optimize size), we could also do:

        {
            // ...
            "schema": {
                "fields": [
                    "mediaType",
                    "userId.any.tierOne",
                    "userId.count.tierTwo"
                ],
                "delimiter": "|"
            },
            "values": {
                "banner|0|0": 0.1,  // none from either tierOne or tierTwo
                "banner|0|1": 0.2,  // none from tierOne, one from tierTwo
                "banner|1|*": 0.4   // at least one from tierOne, any number from tierTwo
            }
        }
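
A minimal sketch of how the any/count fields might be resolved at runtime follows. The helper names `countInTier` and `resolveTierField` are hypothetical, and it assumes tier entries are matched against EID `source` values and that field values are emitted as strings to line up with the rule keys.

```javascript
// Tier definitions as in the example above.
const userIds = {
  tierOne: ['some.excellent.id', 'another.good.one'],
  tierTwo: ['less.valuable.id']
};

// Count how many of the tier's IDs are present among the request's EIDs.
// Assumption: tier entries are compared against each EID's `source`.
function countInTier(tier, eids) {
  const present = new Set(eids.map(eid => eid.source));
  return tier.filter(source => present.has(source)).length;
}

// Resolve a field such as 'userId.any.tierOne' or 'userId.count.tierTwo'.
// 'any' yields '1' or '0'; 'count' yields the number of matching IDs as a string.
function resolveTierField(field, userIds, eids) {
  const [, mode, tierName] = field.split('.');
  const count = countInTier(userIds[tierName] || [], eids);
  return mode === 'any' ? String(count > 0 ? 1 : 0) : String(count);
}

const eids = [{ source: 'some.excellent.id' }, { source: 'another.good.one' }];
console.log(resolveTierField('userId.any.tierOne', userIds, eids));   // '1'
console.log(resolveTierField('userId.count.tierOne', userIds, eids)); // '2'
console.log(resolveTierField('userId.count.tierTwo', userIds, eids)); // '0'
```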

robertrmartinez commented 6 months ago

Nice!

Willing to implement both Option 2 and @dgirardi suggestion here.

@patmmccann @bretg Any input?

Specifically around the tierOne / tierTwo proposal. Is A or B better?

A: Value is the number of IDs from the tier list that are present:
- 1: has one of this tier's IDs
- 2: has two of this tier's IDs
- 0: has none of them
- *: 0+

B: Value is Boolean:
- 1: has any
- 0: has none
- *: either

Once chosen I think we can do a single PR which introduces BOTH Option 2:

        {
            "modelWeight": 50,
            "modelVersion": "Each User ID own Field",
            "schema": {
                "fields": [
                    "mediaType",
                    "userId.liveintent",
                    "userId.pairId",
                    "userId.sharedId"
                ],
                "delimiter": "|"
            },
            "values": {
                "banner|1|1|1": 0.43,
                "banner|1|1|0": 0.43,
                "banner|1|0|1": 0.43,
                "banner|0|1|1": 0.43,
                "banner|1|*|*": 0.43
            }
        }

And the tiered (tierOne / tierTwo) option.

Or maybe we just want to only support one?

Thanks!

bretg commented 6 months ago

My take is that Demetrio's "Option 4" is the best compromise. It handles all the use cases.

I think the value should be boolean: any ID existing in that tier means that the bid value should be higher, therefore the floor might be raised. I don't see a use case where having 2-of-4 IDs in a tier should result in a different floor from the scenario where there's 3-of-4 IDs available. This is already complicated enough.

robertrmartinez commented 6 months ago

Ok - so your input is we should ONLY implement Option 4 - Boolean?

I am fine with that.

We can work around the limitation by having many tiers ;)

bretg commented 6 months ago

Yes, only 4. I can't see a use case where it makes sense to set individual floors for 50 different user IDs.
