Closed kingo55 closed 4 years ago
@kingo55 I think it's fesible and it isn't hard to implement.
Good suggestion, happy for it to be added as a feature.
I think the best approach for this would be to:
Mojito.options.decisionAdapter = function(testObject, decision) {
if (decision = 'sampleRate'){
test.options.decision = someRNG;
return hexToInt(test.options.decision[0:5]);
}
if (decision = 'recipe') return hexToInt(test.options.decision[6:12]);
};
If the decisionAdapter
is not set, we can defer to the old seeded random function in the library currently. Anything else I might have forgotten?
@allmywant could we save the MD5 hash in the testObject for re-use?
@kingo55 Yes we can save MD5 hash in the testObject.
Considering we need fresh entropy for both:
How should we best handle for that in the code without adding more unnecessary complexity?
I figured one way is to use the decision
parameter. But another way I thought of is to just increment a key on the testObject
? e.g.:
function rng(testObject) {
...
testObject.options.decsionCnt ++;
return randomNumber;
}
That way, sample rate inclusion is always testObject.options.decsionCnt === (0 || null)
but recipe selection is always testObject.options.decsionCnt > 0
.
@kingo55 this would be a bit clearer with a decision
parameter. I don't think it would add too much more complexity for what it's trying to achieve.
@allmywant @dapperdrop - I have been thinking about this more and think the cleanest way forward is to increment a value on the testObject
for additional decisions.
I think we should just pass in the testObject
as the sole parameter to the decisionAdapter
.
@kingo55 cool - happy to go with your direction on this.
Ah whoops - should have refreshed the page. Only just saw your comment @dapperdrop.
@allmywant - have you got any further thoughts on this? Ultimately, a decision parameter affords more flexibility if needed down the track.
@kingo55 We may consider if we shoul allow the user to generate the custom MD5 hash in their code, that way the user has the full control on getRandom function.
@allmywant - that is the plan. The decisionAdapter should just return a number between 0 and 1.
We just need to give the user the proper tools to generate decisions for tests. testObject
is a useful tool for building the seed. Would specifying the decision
be useful in any way? I'm not so sure.
@kingo55
Got it. Then I think we don't need the decision
parameter, it's clearer than sole testObject
parameter but we don't know if user need other parameters in the future. So I think we can just pass the sole testObject
parameter and prepare any needed properties in testObject
, e.g. we can set testObject.currentDecision = 'SimpleRate'|'Recipe'
before we call the decisionAdapter
:
Mojito.options.decisionAdapter = function(testObject) {
if (testObject.currentDecision == 'sampleRate'){
test.options.decision = someRNG;
return hexToInt(test.options.decision[0:5]);
}
if (testObject.currentDecision == 'recipe') return hexToInt(test.options.decision[6:12]);
};
@allmywant - got it... that sounds like a good solution.
Since the decisions are always made in the same sequential order, shouldn't we just increment a decision count? Are there any cases where we would rather a string
?
@kingo55
The only reason I can think of is string
is more clear than decisionCount >0
:)
@allmywant - that sounds like a good reason. Perhaps we go with a string in that case then.
Are you able to take this one on David?
@kingo55 Yes, I will.
@kingo55
Almost done, I've sent a pull request.
BTW: I used decsionCnt
solution rather than string
finally, I realized getRandom
might be called more than 3 times if a test has more than 2 recipes and each recipe has the same simpleRate
(e.g. 0.5). In that case, decsionCnt
is better.
@allmywant - Ah yes... that's a tough one.
Could it be called more than 4 times for an experiment? We may need to handle for that, or warn the user for that. Alternatively, could we make it so simpleRate
doesn't hit the decisionAdapter
so much?
This will make it harder for users to derive the decisions during analysis too.
@kingo55
I have read the source code again, I can confirm that it will be called more than 4 times for an experiment in a edge case. Here the code how to choose a recipe of an experiment that has simpleRate
defined in recipes.
var orderedRecipes = this.getOrderedBySampleRecipes();
var sameSimpleRateRecipes = this.getSameSimpleRateValueRecipes(orderedRecipes);
var weight = 0;
var lastSample = orderedRecipes[orderedRecipes.length - 1].sampleRate;
while (!chosen_recipe)
{
weight = this.getRandom();
for (var i = 0; i < orderedRecipes.length; i++)
{
if (weight <= orderedRecipes[i].sampleRate)
{
chosen_recipe = orderedRecipes[i].key;
sameRecipes = sameSimpleRateRecipes[orderedRecipes[i].sampleRate];
if (sameRecipes)
{
partitions = 1.0 / sameRecipes.length;
chosen_partition = Math.floor(this.getRandom() / partitions);
chosen_recipe = sameRecipes[chosen_partition].key;
}
break;
}
}
if (!chosen_recipe && (weight - lastSample) <= (1 - lastSample))
{
chosen_recipe = orderedRecipes[orderedRecipes.length - 1].key;
sameRecipes = sameSimpleRateRecipes[orderedRecipes[orderedRecipes.length - 1].sampleRate];
if (sameRecipes)
{
partitions = 1.0 / sameRecipes.length;
chosen_partition = Math.floor(this.getRandom() / partitions);
chosen_recipe = sameRecipes[chosen_partition].key;
}
}
}
Consider this experiment:
state: "inactive"
sampleRate: 0
id: "w134"
name: "W134 Summer sale"
recipes:
"0":
name: "Control"
simpleRate: 0.1
"1":
name: "Treatment"
simpleRate: 0.4
"2":
name: "Treatment"
simpleRate: 0.5
trigger: "trigger.js"
In the while
loop, we pick a random weight = this.getRandom();
then compare to each recipe's simpleRate, if matched (if (weight <= orderedRecipes[i].sampleRate)
) then choose the recipe.
The edge case is:
first loop, this.getRandom() returns 0.8
second loop, this.getRandom() returns 0.9
third loop, this.getRandom() returns 0.7
4th loop, this.getRandom() returns 0.6
5th loop, this.getRandom() returns 0.77
...
All of above do not match: if (weight <= orderedRecipes[i].sampleRate)
So, that is the edge case. I've been thinking about this from last evening, how about we change it this way:
1 just call weight = this.getRandom()
once, then loops in recipes and compare with if (weight <= orderedRecipes[i].sampleRate)
, if matched then all good.
Otherwise, we compare the weight with recipe's simpleRate, but condition changed to:
which recip's simpleRate is most close to the weight:
min(Math.abs(weight - recipe.simpleRate))
In this way, we can limit the getRandom calls of an experiment to <= 3. The third one is needed when an experiment has more than one recipes with same simpleRate value and we need to pick one of them randomly.
@allmywant I'm a bit confused by this logic.
I would have thought the following loop would be sufficient:
weight = this.getRandom();
var checkedRecipeSamples;
for (var i = 0; i < orderedRecipes.length; i++)
{
checkedRecipeSamples += orderedRecipes[i].sampleRate;
if (weight <= checkedRecipeSamples)
{
chosen_recipe = orderedRecipes[i].key;
break;
}
}
Also, I don't know why the recipes need to be ordered here.
EDIT: Added sum of checkedRecipeSamples
Just had to add a tally of the checkedRecipeSamples here.
@kingo55 The loop you mentioned is better. The reason I sort recipes by simpleRate is: for below recipes: { name: Control, simpleRate: 0.8} { name: Treatment, simpleRate: 0.2}
If we don't sort them, and let's say weight = this.getRandom() = 0.1, the Control will be chosen. But I think the Treatment should be chosen, right?
@allmywant - ah yes, excellent point... I see why the ordering exists now.
If you think you can reduce the number of calls to getRandom and make this function more efficient, can you please submit a PR @allmywant ?
Sure @kingo55 I will.
Should we let users override the default random function?
E.g. Like how we manage custom storageAdapters. Maybe like a randomAdapter or assignmentAdaptor.
This way users won't be fenced in to use our recommended random method. Also, if users need to change random methods mid-way through experiments, they can retain the method of assignment on active tests but use the new random methods on new experiments. Another benefit, users who don't care about MD5 hashes as RNG, won't bog down the weight of their library with large functions.