Open bretg opened 1 month ago
The PBS-Go team is curious what A/B testing strategy is being used and whether this is something that applies to all modules or not.
what A/B testing strategy is being used
Just a simple run-or-not-run strategy. I suppose it would be more powerful to allow for different configurations (e.g. timeouts, cache sizes, etc.), but the main point here is to know whether it's worthwhile paying a given vendor at all, not to fine-tune the behavior. If their system is complicated enough to need that kind of tuning, they should include their own A/B testing facility inline in their module.
Here's a rough proposed implementation.
The test is enabled at the module-level, not the hook level.
{
  "hooks": {
    "modules": {
      "my-module": {
        "params-seen-by-module": { ... }
      }
    },
    "execution-plan": {
      "abtests": [{
        "module-code": "my-module",
        "enabled": true,
        "percent-active": 5,
        "log-analytics-tag": true
      },{
        ... abtest config for other modules ...
      }],
      "endpoints": {
        "/openrtb2/auction": {
          ...
        }
      }
    }
  }
}
If any abtests object is enabled and flagged with log-analytics-tag as true, PBS would log an analytics tag (atag) activity object:
{
  activities: [{
    name: "core-module-abtests",
    status: "success",
    results: [{ // one results object for each module in the abtests object
      "status": STATUS, // "run" or "skipped"
      "values": {
        "module": "my-module"
      }
    },{
      ... the status of other abtest decisions ...
    }]
  }]
}
I think they can do what they need to with this data, which is just to log to their endpoint whether a given module was active or not.
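To make the run-or-not-run decision and the resulting analytics tag concrete, here is a rough Python sketch. It is only an illustration, not PBS code: the function names `decide_abtests` and `build_abtest_activity` are hypothetical, the field names come from the proposed config above, and it assumes that `percent-active` means the percentage of requests (0 to 100) on which the module runs and that only entries flagged with `log-analytics-tag` are reported.

```python
import random


def decide_abtests(abtests, rng=random.random):
    """Return {module_code: "run" | "skipped"} for each enabled abtest entry.

    Assumption: "percent-active" is the percentage of requests (0-100)
    on which the module should run.
    """
    decisions = {}
    for test in abtests:
        if not test.get("enabled", False):
            continue  # a disabled test makes no decision for its module
        runs = rng() * 100 < test["percent-active"]
        decisions[test["module-code"]] = "run" if runs else "skipped"
    return decisions


def build_abtest_activity(abtests, decisions):
    """Build the proposed activity object, or None if no test is flagged for logging."""
    results = [
        {"status": decisions[t["module-code"]], "values": {"module": t["module-code"]}}
        for t in abtests
        if t.get("enabled") and t.get("log-analytics-tag")
    ]
    if not results:
        return None
    return {
        "activities": [
            {"name": "core-module-abtests", "status": "success", "results": results}
        ]
    }
```

Passing `rng` as a parameter is just a convenience so the decision can be tested deterministically; in real use the default random source would be used.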
Please consider a simplification: instead of an abtests object under the execution-plan, perhaps it may be enough to just specify a single rollout_fraction parameter under each module, like:
{
  "hooks": {
    "modules": {
      "my-module": {
        "enabled": true,
        "rollout_fraction": 0.5, // a float between 0.0 and 1.0
        "params-seen-by-module": { ... }
      }
    },
    ...
rollout_fraction is treated as a p parameter for a simple Bernoulli(p) distribution (with discrete 0 or 1 outcomes) from which we'd sample whether to run the module for this particular request or not.
HookExecutionOutcome would contain the module if it was run, so it can be used instead of an analytics tag as a marker of the treatment variant. If the module was not run for a given request, it would simply not be listed in the HookExecutionOutcome, which would indicate the control variant. Other custom markers, including custom analytics tags, may be used and are subject to the particular module implementation. This of course works only if HookExecutionOutcome is picked up by the analytics pipeline and transferred to the data warehouse to be used in further analysis, including A/B test results, which seems to be a reasonable assumption to make.
Thus the whole thing comes down to implementing rollout_fraction as an optional module config param and associated decision making logic based on sampling from a Bernoulli(rollout_fraction) distribution.
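The sampling step could look roughly like the Python sketch below. It is a minimal illustration under stated assumptions rather than an actual PBS implementation: `should_run_module` and `variant_for` are hypothetical names, a missing `rollout_fraction` is assumed to mean "always run", and `executed_modules` is a simplified stand-in for whatever the hook execution outcome really lists.

```python
import random


def should_run_module(module_config, rng=random.random):
    """Sample Bernoulli(rollout_fraction): True means run the module for this request."""
    if not module_config.get("enabled", False):
        return False
    # Assumption: an absent rollout_fraction means the module always runs (p = 1.0).
    p = module_config.get("rollout_fraction", 1.0)
    if not 0.0 <= p <= 1.0:
        raise ValueError(f"rollout_fraction must be in [0.0, 1.0], got {p}")
    # rng() is uniform on [0.0, 1.0), so this comparison is True with probability p.
    return rng() < p


def variant_for(module_code, executed_modules):
    """Label a request by whether the module appears among the executed modules."""
    return "treatment" if module_code in executed_modules else "control"
```

With this shape, `rollout_fraction` of 0.0 never runs the module and 1.0 always runs it, which keeps the optional parameter backward compatible with configs that only set `enabled`.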
If this were done via HTTP headers, then Content Security Policy might provide a pattern to emulate, substituting domain name for module common name for the module's key. The name of the header could be Prebid-Config.
In general I'd prefer to put core control mechanisms in a place where the module code can't see it.
I get that the 'enabled' field is an exception. I believe the module can see everything under hooks.modules.my-module. I don't want to add another level (e.g. params-seen-by-module) because we already have existing modules in the wild that don't put things there.
So we'd need to come up with a reserved word where we place PBS-core control functionality, e.g. hooks.modules.my-module.module-controls.percent-enabled.
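One way to picture the reserved-word idea is for PBS-core to strip the reserved key before handing the config to the module. The Python sketch below is only illustrative: `split_module_config` is a hypothetical helper, and `module-controls` is the reserved word floated in this comment, not an existing PBS setting.

```python
RESERVED_KEY = "module-controls"  # reserved word suggested in this thread, not an existing setting


def split_module_config(raw_config):
    """Return (core_controls, module_visible_config) without mutating raw_config."""
    # Core-only settings live under the reserved key, e.g. {"percent-enabled": 5}.
    controls = dict(raw_config.get(RESERVED_KEY, {}))
    # Everything else is passed through to the module unchanged.
    visible = {k: v for k, v in raw_config.items() if k != RESERVED_KEY}
    return controls, visible
```

Existing modules that keep their params directly under hooks.modules.my-module would keep working, since only the one reserved key is hidden from them.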
Rather than have each module support an ability to A/B test, it would be convenient for PBS-core to support enabling modules in a partial way.
We need to work out the syntax and the analytics.