Fast estimation of multinomial (MNL) and mixed logit (MXL) models in R with "Preference" space or "Willingness-to-pay" (WTP) space utility parameterizations
For experiments with outside goods ("none" options), the data need to be encoded in a particular way. I frequently see people make mistakes with this, so it's probably worth writing a function that handles this encoding for them. It needs to handle the following two conditions:
For continuous variables that don't already contain a 0 (e.g. price), subtract the lowest value from all of the values. After this shift, a value of 0 means something (e.g. for price, it is the lowest price), and every nonzero value is the difference from that lowest value. If you don't do this, then the 0s in attributes like price are essentially saying the alternative had a price of 0, which is not correct.
For categorical variables, it is best to manually dummy-code them and insert those dummy-coded variables into `pars`. Then create a dummy-coded "no choice" column and include it in `pars` separately. This way you'll get a separate coefficient for the "no choice" option that isn't conflated with the other categorical variables (e.g. `brand` in the example `yogurt` data).
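The two conditions above could be handled by a helper along the following lines. This is only a rough sketch in base R, not an existing function of the package: the name `encode_outside_good`, the `none` indicator column, the `no_choice` output column, and the choice to drop the first level of each categorical variable as the reference level are all assumptions made for illustration.

```r
# Hypothetical helper (not part of any package): recode a choice data set
# that contains outside-good ("no choice") rows.
#   data        - data frame with one row per alternative
#   is_none     - name of a 0/1 column marking the "no choice" rows (assumed)
#   continuous  - names of continuous attributes to shift (e.g. "price")
#   categorical - names of categorical attributes to dummy-code (e.g. "brand")
encode_outside_good <- function(data, is_none,
                                continuous = NULL, categorical = NULL) {
  none <- as.logical(data[[is_none]])
  out <- data

  # Condition 1: subtract the lowest value observed among the real
  # alternatives, so that 0 means "lowest value"; "no choice" rows get 0.
  for (v in continuous) {
    x <- as.numeric(data[[v]])
    shifted <- x - min(x[!none], na.rm = TRUE)
    shifted[none] <- 0
    out[[v]] <- shifted
  }

  # Condition 2: one 0/1 column per non-reference level. The first level
  # (alphabetically) is dropped as the reference, which is an assumption.
  for (v in categorical) {
    x <- as.character(data[[v]])
    lvls <- sort(unique(x[!none]))
    for (lvl in lvls[-1]) {
      out[[paste0(v, "_", lvl)]] <- as.integer(!none & x == lvl)
    }
  }

  # Separate dummy for the outside good, so it gets its own coefficient.
  out$no_choice <- as.integer(none)
  out
}

# Made-up toy data: two choice sets, each with two yogurts and a "none" row.
choices <- data.frame(
  obsID  = c(1, 1, 1, 2, 2, 2),
  price  = c(8, 10, 0, 12, 8, 0),
  brand  = c("dannon", "yoplait", NA, "yoplait", "hiland", NA),
  none   = c(0, 0, 1, 0, 0, 1),
  choice = c(1, 0, 0, 0, 0, 1)
)

encoded <- encode_outside_good(choices, is_none = "none",
                               continuous = "price", categorical = "brand")

# Names that would then be passed as `pars` when estimating the model.
pars <- c("price", "brand_hiland", "brand_yoplait", "no_choice")
```

With this toy data the lowest real price is 8, so the shifted `price` column holds differences from 8, and `dannon` becomes the reference brand because it sorts first. Whether a reference level should be dropped, and which one, depends on the model being estimated.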