For historical reasons, we have been dealing with design spaces where each individual prospect is relatively simple.
For non-risky prospects we have a certain outcome:

| Probability | Reward | Delay |
| --- | --- | --- |
| 1 | RB | DB |
For risky prospects we have only two outcomes:

| Probability | Reward | Delay |
| --- | --- | --- |
| PB | RB | DB |
| 1-PB | 0 | 0 |
The problem is that the 'secondary' reward is assumed to be zero. This assumption is built into both the design generation code and the modelling code. So we are close, but the current code cannot handle these kinds of composite gambles more generally.
I did not have the foresight to implement the most general solution, which would be a full reward distribution table with N outcomes:

| Probability | Reward | Delay |
| --- | --- | --- |
| P[0] | R[0] | D[0] |
| P[1] | R[1] | D[1] |
| P[2] | R[2] | D[2] |
| ... | ... | ... |
| P[N] | R[N] | D[N] |
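As a concrete sketch, such an N-outcome reward distribution could be held in a small class like the one below. The class name, field names, and example values are my assumptions for illustration, not existing code:

```python
import numpy as np

class RewardDistribution:
    """Hypothetical N-outcome reward distribution: P[i], R[i], D[i] per outcome."""

    def __init__(self, probs, rewards, delays):
        self.probs = np.asarray(probs, dtype=float)
        self.rewards = np.asarray(rewards, dtype=float)
        self.delays = np.asarray(delays, dtype=float)
        # probabilities over outcomes must sum to 1
        assert np.isclose(self.probs.sum(), 1.0), "probabilities must sum to 1"
        assert self.probs.shape == self.rewards.shape == self.delays.shape

# The current risky prospect is just the N=2 special case with R[1] = 0:
risky = RewardDistribution(probs=[0.8, 0.2], rewards=[100.0, 0.0], delays=[0.0, 0.0])
```

The point is that the simple and risky prospects above are both special cases of this table, so nothing is lost by generalising.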
Reasons to do this
It would allow you to do Holt & Laury (2002) style gambles. That paper uses a fixed design, but you could apply Bayesian Adaptive Design to a larger design space if you wanted.
What would it take (early thoughts)
At the moment the design space is a pandas table: each column is a design dimension and each row is a particular design. If we were dealing with distributions then the table of designs would need to be something like:

| design | ProspectA | ProspectB |
| --- | --- | --- |
| 0 | distribution | distribution |
| 1 | distribution | distribution |
| 2 | distribution | distribution |
| ... | ... | ... |
where `distribution` is a reward distribution table (or object) like the one above.
You'd then need to change how models interact with the designs. You'd have to think a bit to make sure all the models make sense with reward distributions, and that evaluating a model on a distribution is a well-defined general operation.
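One candidate for that well-defined general operation: a prospect's value is the probability-weighted sum of discounted utilities over all N outcomes. The particular utility and discounting functions below (linear utility, hyperbolic discounting with a made-up k) are placeholder assumptions; real models would supply their own:

```python
import numpy as np

def prospect_value(probs, rewards, delays, utility, discount):
    """V = sum_i P[i] * u(R[i]) * d(D[i]) over all N outcomes."""
    probs, rewards, delays = map(np.asarray, (probs, rewards, delays))
    return float(np.sum(probs * utility(rewards) * discount(delays)))

# The existing two-outcome risky prospect falls out as a special case:
v = prospect_value(
    probs=[0.7, 0.3], rewards=[100.0, 0.0], delays=[0.0, 0.0],
    utility=lambda r: r,                        # linear utility (assumption)
    discount=lambda d: 1.0 / (1.0 + 0.04 * d),  # hyperbolic discounting (assumption)
)
```

Because the operation is a sum over outcomes, any model written this way works unchanged whether a prospect has 1, 2, or N outcomes.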
This amounts to something quite simple. Rather than each element in the design table being a scalar, it simply becomes an array. Such as:
| design | RA | DA | PA | RB | DB | PB |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | list | list | list | list | list | list |
| 1 | list | list | list | list | list | list |
| 2 | list | list | list | list | list | list |
| ... | ... | ... | ... | ... | ... | ... |
So basically each attribute (e.g. reward or delay) of a single design is represented by a row vector of outcomes. Where we have many designs, each attribute becomes a 2D matrix of size [number of designs, number of outcomes].
This is pointing away from representing designs using a Pandas DataFrame and towards simple numpy arrays (RA, DA, PA, ...) which could be attributes of a design class (to allow for dot indexing). This could avoid some of the faff I've experienced getting stuff into and out of DataFrames.
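A minimal sketch of that design class, assuming the attribute names from the table above (the class name and example values are hypothetical):

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Designs:
    """Each attribute is a 2D array of shape [n_designs, n_outcomes]."""
    RA: np.ndarray  # rewards for prospect A
    DA: np.ndarray  # delays for prospect A
    PA: np.ndarray  # outcome probabilities for prospect A
    RB: np.ndarray  # rewards for prospect B
    DB: np.ndarray  # delays for prospect B
    PB: np.ndarray  # outcome probabilities for prospect B

# Two designs, each prospect having two outcomes:
designs = Designs(
    RA=np.array([[100.0, 0.0], [80.0, 10.0]]),
    DA=np.zeros((2, 2)),
    PA=np.array([[0.8, 0.2], [0.5, 0.5]]),
    RB=np.array([[50.0, 0.0], [60.0, 5.0]]),
    DB=np.zeros((2, 2)),
    PB=np.array([[1.0, 0.0], [0.9, 0.1]]),
)
row = designs.RA[0]  # rewards for prospect A in design 0, via dot indexing
```

Dot indexing (`designs.RA`) then replaces the DataFrame column lookups, and slicing a single design is just row indexing.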
random initial test
UPDATE (28th August 2019)
I know how to do this now. If we do not opt for full reward distributions and instead opt for prospects of the form:
P% chance of R1 in D1 days or 1-P% chance of R2 in D2 days
then we can do this in a really very simple way. All you need to do is add more design variables. So for our 2-choice tasks this means we would have the following design variables:

1. PA1: the chance of getting reward 1 for choice A
2. RA1: reward 1 for choice A
3. DA1: delay 1 for choice A
4. RA2: reward 2 for choice A, which happens with probability 1-PA1
5. DA2: delay 2 for choice A
6. PB1: the chance of getting reward 1 for choice B
7. RB1: reward 1 for choice B
8. DB1: delay 1 for choice B
9. RB2: reward 2 for choice B, which happens with probability 1-PB1
10. DB2: delay 2 for choice B
Having 10 design variables is quite a lot. But the point is that we are not actually optimising over all 10 dimensions. For risky choice, for example, all the delays are held fixed rather than optimised over. So the design variables would be:

1. PA1: the chance of getting reward 1 for choice A
2. RA1: reward 1 for choice A
3. RA2: reward 2 for choice A, which happens with probability 1-PA1
4. PB1: the chance of getting reward 1 for choice B
5. RB1: reward 1 for choice B
6. RB2: reward 2 for choice B, which happens with probability 1-PB1
7. DA1: always equal to 1
8. DA2: always equal to 1
9. DB1: always equal to 1
10. DB2: always equal to 1
So just 6 (free) design variables. We could also add various constraints in the code which generates the full data frame of designs. Examples would include constraining RA1 > RA2 and RB1 > RB2 just to cut down on the total number of designs (rows).
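Generating that constrained design space could be as simple as a filtered Cartesian product. The candidate value grids below are made-up examples, not values from the actual codebase:

```python
import itertools
import pandas as pd

# Hypothetical candidate values for the 6 free design variables
probs = [0.1, 0.5, 0.9]
rewards = [10, 50, 100]

rows = [
    dict(PA1=pa1, RA1=ra1, RA2=ra2, PB1=pb1, RB1=rb1, RB2=rb2,
         DA1=1, DA2=1, DB1=1, DB2=1)  # delays held fixed
    for pa1, ra1, ra2, pb1, rb1, rb2 in itertools.product(
        probs, rewards, rewards, probs, rewards, rewards)
    if ra1 > ra2 and rb1 > rb2  # the constraints from the text
]
designs = pd.DataFrame(rows)
```

With 3 candidate probabilities and 3 candidate rewards, the unconstrained product has 729 rows; the RA1 > RA2 and RB1 > RB2 constraints cut this to 81.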
The Holt & Laury gambles have even more constraints. So that would look like:

1. PA1: the chance of getting reward 1 for choice A, taking values [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
2. RA1: reward 1 for choice A (always $2.00)
3. RA2: reward 2 for choice A, which happens with probability 1-PA1 (always $1.60)
4. PB1: the chance of getting reward 1 for choice B, taking values [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
5. RB1: reward 1 for choice B (always $3.85)
6. RB2: reward 2 for choice B, which happens with probability 1-PB1 (always $0.10)
7. DA1: always equal to 1
8. DA2: always equal to 1
9. DB1: always equal to 1
10. DB2: always equal to 1
The other main change is that we would have to create models which can deal with these design spaces. Specifically, we will have to update the equations calculating the present subjective utility. This might mean we get clashes if a user builds a design space with composite gambles but uses it with a model which only deals with simple gambles. We can deal with this either by 1) creating new models for the composite gambles, or 2) altering the existing models so they are sensitive to the form of the design space. Of these, the former seems best.
References
Holt, C. A., & Laury, S. K. (2002). Risk aversion and incentive effects. American Economic Review, 92(5), 1644–1655. http://doi.org/10.1257/000282802762024700