jhelvy / cbcTools

An R package with tools for designing choice-based conjoint (CBC) survey experiments and conducting power analyses
https://jhelvy.github.io/cbcTools/

Db Efficient vs D efficient designs #23

Closed: jrob95 closed this issue 10 months ago

jrob95 commented 12 months ago

Hi there - this is an awesome package and I've found it very easy to use. I'm just seeking some clarification based on my limited experience with other software packages: it was my understanding that you should specify point-estimate priors for D-efficient designs and prior distributions for Bayesian D-efficient designs. From what I can tell, however, in this package you specify no priors for D-efficient designs and point estimates for DB-efficient designs. Can you speak to this at all?

Thanks, Jack

jhelvy commented 12 months ago

Hi Jack, this is indeed an area where I myself have found the literature on experiment design quite inconsistent and confusing. Different scholars use these terms to mean different things.

My understanding is that specifying any prior at all is still referred to as a DB-efficient design, since you are (in a very Bayesian way of thinking) obtaining a design that is informed by a prior assumption about the true parameters. The design methods that support priors use the {idefix} package to obtain designs, and the {idefix} JSS paper describes in much more detail how this approach works. My package is simply a wrapper around it to simplify things and make the user interface a little easier to understand. You can fine-tune more options by using {idefix} directly, but I find that package a little challenging to use. For the prior, my package uses the point estimates as the means of a multivariate normal distribution with a diagonal variance-covariance matrix. You can see how the draws for the priors are generated here, which is exactly how they are generated in the example in the {idefix} JSS article. I suppose I could allow users more flexibility by also letting them specify sigma...maybe I'll do that in the next version.

Note that this is different from what I'm calling a "D-optimal" design, which maximizes the D-efficiency of a linear model. It's confusing because this approach still seeks to maximize the design's "efficiency", so many scholars refer to this strategy as a "D-efficient" design. But for linear models the information matrix (which determines the efficiency) does not depend on the model parameters, so there is no prior to specify. This is why I use "D-optimal" instead of "D-efficient" for this method: I'm trying to reserve the word "efficient" for approaches that use a prior.
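To illustrate that point with generic R (this is not cbcTools code, and the attributes are made up): for a linear model the information matrix is just X'X, so the D-error can be computed from the design matrix alone, with no parameters involved:

# The information matrix of a linear model is X'X (up to a constant),
# which depends only on the design X, not on any model parameters.
X <- model.matrix(
  ~ price + quality,
  data = expand.grid(price = c(10, 20, 30), quality = c("low", "high"))
)
p <- ncol(X)
d_error <- det(solve(t(X) %*% X))^(1 / p) # D-error: lower is better
# D-efficiency, when reported on a 0-1 scale, is essentially a
# normalized reciprocal of this quantity.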

I've drafted a JSS article myself on this package, and I'm hoping the review process will yield some clear feedback on terminology that clarifies all of this. That alone might be a useful contribution of a paper on {cbcTools}!

mmardehali commented 12 months ago

Hi Jack!

I'd like to first echo what you said about cbcTools: in my experience, this package is far easier to work with and understand than idefix, especially for a newcomer. In fact, when I first used idefix, I absolutely couldn't figure out what goes where or how to perform certain tasks. cbcTools was a lifesaver that I came across practically by chance (through Dr. Helveston's response on ResearchGate to another scholar wondering how to deal with idefix!). I spent a lot of time figuring things out, and below I'd like to share some of my findings with you (and others who may come across this great package at some point). At the end, I have my own question for Dr. Helveston, which also relates to your original question. This is lengthy, but hopefully it can serve as a resource in some capacity.

Understanding idefix: I came across your question yesterday while looking for an answer to my own question. I typed a response but decided against posting it, because I'm also relatively new to this space! I'm glad that Dr. Helveston's response confirms what I had in mind. I also think the issue is mostly unstandardized terminology. Before seeing this question yesterday, I was reading Dr. Traets' paper on idefix (the same paper referenced in Dr. Helveston's comment) to answer my own question, and based on the wording of your question I would guess you are specifically referring to the first paragraph of section 2.2, Quantifying Prior Information, in which Dr. Traets uses language that could be interpreted as: "If you identify point estimates for the priors, and don't use a distribution to identify them, you are not creating a DB-efficient design. Instead, you are creating an efficient design based on minimizing D-error." I think this interpretation is not accurate, and in both cases you ARE creating a DB-efficient design.

DB-Efficiency, DZ-Efficiency, and cbcTools: From my understanding of the literature I reviewed on this topic, and on Stated Preference methods (Discrete Choice Experiments in particular), I can second Dr. Helveston's statement above. My understanding is that the moment you involve priors in creating your design, you are in fact in Bayesian territory and are creating a DB-efficient design. Now in cbcTools, based on the documentation, if you don't identify point estimates for the priors and use "CEA" or "Modfed" as your optimization method, the package assumes you are creating a Bayesian design with zero priors. This design is referred to in the literature as DZ-efficient (in which the Z stands for Zero, denoting an assumption of zero for the priors); a sketch of what this looks like in code is below. For more information, see Szinay et al. 2021 and Lancsar & Louviere 2012.
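For instance, here's roughly what a DZ-efficient design looks like in cbcTools (the attribute names, levels, and design settings are made up for illustration; check the package documentation for the exact arguments):

library(cbcTools)

profiles <- cbc_profiles(
  price  = c(20, 25, 30),
  rating = c(3.2, 4.0, 4.8)
)

# With method = "CEA" (or "Modfed") and no priors supplied, the design
# is optimized assuming all priors are zero, i.e. a DZ-efficient design.
design <- cbc_design(
  profiles = profiles,
  n_resp   = 100,  # respondents
  n_alts   = 2,    # alternatives per question
  n_q      = 8,    # questions per respondent
  method   = "CEA"
)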

The Issue of Terminology: The issue of non-standard terminology and approaches seems to plague this field (see the ISPOR best practices report). As far as I understand, there is even an ongoing controversy over whether Discrete Choice Experiments are the same as Conjoint Analysis! There is a paper literally entitled "Discrete Choice Experiments Are Not Conjoint Analysis" by the INVENTOR of Discrete Choice Experiments (and of Best-Worst Scaling and MaxDiff), the late Dr. Jordan Louviere. Yet I very commonly see DCE and Conjoint Analysis used interchangeably in the literature. Another area of discrepancy I have been able to identify is the metric for optimization. I have seen D-error used as a metric (which is what idefix and cbcTools use), and I have also seen D-efficiency reported in the literature (as a value between 0 and 1 that, unlike D-error, the optimization seeks to maximize). My point is: different sources may refer to these experiments, methodologies, and optimization techniques using different terminology. For a newcomer to this field, this can be extremely jarring and may even lead to inaccurate research design.

My Experience With cbcTools And Zero Priors: From a practical perspective, I can report that I have used cbcTools for a DZ-efficient DCE pilot study, and after collecting data from only 18 participants, I achieved very high levels of significance for almost all attributes (2 categorical and 2 numerical). The only attribute (categorical) that did not achieve significance was one that, based on other evidence, the participants actually didn't care about in their decision making. I would also point out that power analysis in this space is subject to some controversy. A "rule of thumb" by Lancsar & Louviere suggests that only about 20 participants per block are enough to estimate a reliable model (meaning that my pilot study, at least by this rule of thumb, could be considered my main study, because I had only 1 block and 18 participants!). Another formula, by Johnson & Orme, suggests that I need about 83 participants to achieve significant results (but I already have, with 18, and zero priors!). Of course, different studies will vary in how many participants they require for reliable model estimation. That's where cbcTools offers another great tool, for simulating power analyses and determining the sample size more reliably.
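For reference, the Johnson & Orme rule of thumb is commonly written as N >= 500c / (t * a), where t is the number of choice tasks per respondent, a is the number of alternatives per task, and c is the largest number of levels for any single attribute. A quick sketch with illustrative numbers (not the actual parameters of my pilot):

t_tasks <- 8 # choice tasks per respondent
a_alts  <- 2 # alternatives per task
c_lvls  <- 3 # largest number of levels for any one attribute
500 * c_lvls / (t_tasks * a_alts)

#> [1] 93.75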

A Word of Caution About Zero Priors: I would caution, however, that if you are using zero priors, you should pay specific attention to the balance of your design, especially when it comes to numerical variables. In my pilot study, I have two numerical variables: Cost (20, 25, and 30) and User Rating (3.2 stars, 4.0 stars, and 4.8 stars). Because I used zero as my priors, and considering the ranges of those attributes, the final design almost entirely excluded the 4.0-star rating. My understanding is that this happens because, with no priors specified, the optimization process sees the differences between 20, 25, and 30 as far more important than the differences between 3.2, 4.0, and 4.8, even though in a real-world application those differences are not comparable in that way. My workaround was to introduce the User Ratings as 20, 25, and 30 to cbc_design, so it perceives the differences and ranges to be the same, and then manually translate the output back to the correct User Rating values (20 -> 3.2 stars, and so on; a sketch is below). Using this workaround, the resulting design was much more balanced. You can read more about this specific challenge in #17 .

Also, when simulating a power analysis for a DZ-efficient design, my understanding is that because no priors are specified (or alternatively, priors are specified to be zero), cbc_choices essentially fills out the "choice" column at random. This means the results of a power analysis based on these simulated choices are very conservative. In my case, the simulation estimated that including about 150 participants would bring the standard errors of the estimated coefficients into an acceptable range; in reality, I achieved significant coefficient estimates with just 18 participants. That's presumably because the simulation didn't know what actual choices would look like. I think I'll see very different estimates when I include the priors from my pilot in the design of the main study.
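To make the remapping workaround above concrete, here's a minimal sketch (assuming design is the data frame returned by cbc_design and rating holds the placeholder values):

# Map the placeholder levels back to the real User Rating values
rating_map <- c("20" = 3.2, "25" = 4.0, "30" = 4.8)
design$rating <- unname(rating_map[as.character(design$rating)])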

My Question: I'm still looking for an answer to this question: what is the default distribution assumed for prior point estimates? I was asked this when presenting my pilot findings to my team, and I didn't have an answer. At this point, I'm moving on to design the main study based on the priors I now have (from the pilot). I know I can use the betas from my pilot as priors in cbcTools, but I don't know the default distribution type (normal, uniform, etc.) and its variance, if applicable. I can see the reference to sigma and par_draws in Dr. Helveston's response above (which are exactly what idefix uses), but I have difficulty interpreting what they mean and what the bottom line is in terms of the shape of the distribution and its variance. I'd really appreciate a clear explanation in "newcomer" terms :)

jhelvy commented 12 months ago

As linked to above, the code here shows how the priors are generated. For now, I only allow users to provide point estimates (mu in the code below). These form the means of a multivariate normal distribution used for the priors. The variance-covariance matrix of that distribution is simply a diagonal matrix of 1s (the identity matrix):

mu <- c(1, 2, 3)          # point estimate priors (the means)
n_draws <- 10^4           # number of draws
sigma <- diag(length(mu)) # identity variance-covariance matrix
par_draws <- MASS::mvrnorm(n = n_draws, mu = mu, Sigma = sigma)

You can see that the marginal distributions of each parameter have means at mu and standard deviations of 1:

apply(par_draws, 2, mean)

#> [1] 1.001323 2.010238 2.986693

apply(par_draws, 2, sd)

#> [1] 1.0005287 0.9953999 0.9977950

This is obviously rather simplistic and does not give the user much flexibility. A better approach would be to add sigma as another argument so that users could have more control over these priors, especially if they expect correlations between different attributes (the current approach assumes the off-diagonals of sigma are all 0).
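For example, if sigma were exposed as an argument, a user who expects correlated parameters could do something like this (the 0.5 covariance is just an assumed value for illustration):

mu <- c(1, 2, 3)
sigma <- diag(length(mu))
sigma[1, 2] <- sigma[2, 1] <- 0.5 # assumed covariance between parameters 1 and 2
par_draws <- MASS::mvrnorm(n = 10^4, mu = mu, Sigma = sigma)
cor(par_draws)[1, 2] # approximately 0.5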

jhelvy commented 12 months ago

On a separate note @mmardehali I'm curious about your result of achieving significant effects with just 18 respondents. One major reason I made this package was to provide some way of estimating required sample sizes. I believe the default approach, where choices are random, leads to (as you suggested) rather conservative estimates, probably because there is so much noise in the random choices. Providing priors to cbc_choices would probably help quite a bit in obtaining smaller required sample size estimates, though of course if your priors are wrong then those estimates might be misleading.
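For example, something like this (with made-up attribute names and prior values):

# Simulate choices informed by priors rather than purely at random...
data <- cbc_choices(
  design = design,
  obsID  = "obsID",
  priors = list(price = -0.1, rating = 0.4)
)

# ...then estimate how standard errors shrink with sample size
power <- cbc_power(
  data    = data,
  pars    = c("price", "rating"),
  outcome = "choice",
  obsID   = "obsID",
  nbreaks = 10,
  n_q     = 8
)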

In any case, there's a pretty huge difference between 18 and 150. I'd like to think more about this to understand why. My immediate thought is that I simply have far too much variance in the simulated choice responses compared to how people actually make choices, leading to higher estimated errors.

mmardehali commented 11 months ago

Thank you Dr. Helveston for the clarification! I will open a separate issue to discuss the topic of cbc_choices and cbc_power.

Question: Based on what you demonstrated above, is it accurate if I report the following about my design? "The prior point estimates are each translated by the code into a normal prior distribution, with the point estimate as the mean and a standard deviation of 1." Alternatively: Prior Distribution = N(Prior Point Estimate, 1)

Suggestion: If you are considering adding the option to specify sigma, I'd like to point out that one of the reasons I ran into difficulties using {idefix} was precisely this! I couldn't decipher (based on the paper, the documentation, and the syntax itself) what these parameters are in {idefix} and how they should be specified. Obviously, this may not be an issue for veterans of the field. I'd like to suggest that cbcTools could remedy this (in my opinion) unnecessary complication with a more user-friendly approach. For example, the user could either specify point estimates for priors and let the code create the default distribution, or identify the shape and parameters of each distribution before calling cbc_design:

# (hypothetical API sketch)
CostPriorDist <- Distribution(type = "normal", mean = -0.13, sd = 0.2)
UserRatingsPriorDist <- Distribution(type = "uniform", mean = 3.2, sd = 0.9)
InformationPriorsDist <- c(
  Low    = Distribution(type = "normal", mean = -1.8, sd = 0.45),
  Medium = Distribution(type = "normal", mean = 1.9, sd = 0.32),
  High   = Distribution(type = "normal", mean = 5.2, sd = 0.29)
)

Design <- cbc_design(
  ...
  priors = list(
    Cost = CostPriorDist,
    UserRatings = UserRatingsPriorDist,
    Information = InformationPriorsDist
  ),
  ...
)

Maybe using dnorm() or dunif() instead of creating a new Distribution() function would be a better decision. But since we're not trying to plot a distribution or anything, maybe just identifying the shape and parameters of the distribution and using them to create sigma under the hood would be enough.

Another alternative, which in my opinion is even more elegant and promotes the use of {logitr} as well, is to create sigma directly from the output of logitr(), since all these parameters are already determined there:

PilotModel_MNL <- logitr(
  data = ChoiceData,
  outcome = "choice",
  obsID = "obsID",
  pars = c("Cost", "UserRatings", "Information"),
  return_as_priors = TRUE # <== hypothetical new argument
)

Design <- cbc_design(
  ...
  priors = PilotModel_MNL.priors(), # <== hypothetical accessor
  ...
)
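In the meantime, a manual version of this idea should already be possible, assuming the standard coef() and vcov() accessors for a {logitr} model (fit without the hypothetical return_as_priors argument):

# Use the pilot model's estimates as mu and its estimated
# variance-covariance matrix as sigma for the prior draws.
mu    <- coef(PilotModel_MNL)
sigma <- vcov(PilotModel_MNL)
par_draws <- MASS::mvrnorm(n = 10^4, mu = mu, Sigma = sigma)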

Admittedly, I know very little about how much headache the implementation of any of these ideas will cause! I'm just hoping these suggestions can be helpful in improving the usability and comprehensiveness of these tools even further.

jrob95 commented 10 months ago

Thank you both for this robust and insightful discussion. I'll preface this by saying that I am only designing my first DCE, and the training I have is in Ngene, which is from the Bliemer, Rose, and Hensher camp in terms of design (see their manual, chapter 7). That's where my terminology comes from, distinguishing DZ (D-error assuming zero priors), DP (D-error assuming point-estimate priors), and DB (D-error drawing priors from a distribution), as discussed previously.

I think the field needs to mature a bit more in its use of terminology, considering the divergence here. I apologise for the delay in my response; as I'm still new to this field, I'm still trying to get my head around it and develop the language to describe what I mean to say.

Thanks very much both, Jack