Using logitr to estimate heterogeneity in customers time-preferences

mwussow commented 2 years ago

Dear Prof. Helveston,

Thank you for this great package; its speed and ease of use are very impressive! I am trying to use it to fit a (mixed) logit model to analyze how customers trade off upfront investment costs against future savings in renewable energy investments. Specifically, I am following Train (1985) and seek to estimate a random utility model of implicit discount rates with the following specification:

U_i = α_i + β₁ upfrontCost_i + β₂ futureSavings_i + ε_i

I am interested in the ratio β₁/β₂, particularly in its distribution within the sample. Your WTP space model seemed particularly well suited to this estimation, as it allows me to normalize β₂ to one and interpret the coefficient on upfrontCost (β₁) as the parameter of interest.

In a standard logit specification this works very well. However, when I try to estimate a mixed logit model with normal or log-normal parameters, the sigma parameter is estimated to be close to or equal to zero. This would indicate that there is no heterogeneity in customers' time preferences within the sample. However, I know that this can't be true. In fact, when I interact β₂ with other socioeconomic variables, I can capture part of this heterogeneity with a standard logit model.

I already tried increasing numMultiStarts to 100, which did not help. I suspect that I might have misspecified my model in a subtle way. Do you have any thoughts on what I may be doing wrong, or suggestions for further reading or model implementation advice?

I am looking forward to hearing from you.

Thanks in advance and best wishes, Moritz

EDIT: I guess it is worth noting that the upfrontCost and futureSavings of the alternatives vary across individuals. Maybe that is violating a model assumption.

jhelvy commented 2 years ago

Hi Moritz, thanks for your kind words! Can you tell me a bit more about how you're specifying the model? Are you estimating a WTP space model? It might help if you pasted in your code so I could see how you're specifying it.

I'm having a difficult time understanding how a WTP space model would work here. For WTP space models, you need a "price" attribute that everything is normalized to. Is the upfrontCost variable the price? If so, then I believe the WTP space model would look like this:

U_i = λ_i (ω1 * futureSavings_i - upfrontCost_i ) + ε_i

This seems a bit strange as ω1 is the WTP for futureSavings. But perhaps that's what you're after? That is, rather than using the words "willingness to pay," you could interpret it as the "value" of those future savings?
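
For reference, the mapping between the preference-space specification from the original post and the WTP space form above can be written out as follows. This is a sketch under two assumptions that are not stated in the thread: the intercept is dropped, and λ and ω₁ are treated as fixed (not individual-specific) parameters, with upfrontCost as the "price" attribute.

```latex
\begin{align*}
U_i &= \beta_1\,\mathrm{upfrontCost}_i + \beta_2\,\mathrm{futureSavings}_i + \varepsilon_i
  && \text{(preference space, intercept omitted)} \\
\lambda &= -\beta_1, \qquad \omega_1 = \frac{\beta_2}{-\beta_1}
  && \text{(reparameterization)} \\
U_i &= \lambda\left(\omega_1\,\mathrm{futureSavings}_i - \mathrm{upfrontCost}_i\right) + \varepsilon_i
  && \text{(WTP space)} \\
\frac{\beta_1}{\beta_2} &= -\frac{1}{\omega_1}
  && \text{(ratio of interest in terms of } \omega_1\text{)}
\end{align*}
```

Under these assumptions, the ratio β₁/β₂ from the original post is the negative reciprocal of ω₁, so the two parameterizations carry the same information about the trade-off between upfront cost and future savings.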

mwussow commented 2 years ago

Thank you for your prompt reply! Here is a commented example of my modelling approach: https://rpubs.com/mwussow/904254

As shown:

- Preference and WTP space produce the same parameter estimates (as expected).
- The parameter of interest is distributed in the sample (as shown by the interaction model).
- Mixed logit models in both preference and WTP space fail to capture heterogeneity of parameters.

My best guess is that it might be related to the fact that the attributes of the alternatives vary over individuals (could this violate a model assumption?)

jhelvy commented 2 years ago

This all looks correct to me. The code appears to be correctly implemented. The only potential errors I see are:

- obs_ID starts at 0. This shouldn't matter, but it's probably better to have it start at 1.
- In 4.2 (Mixed-logit, WTP space) you have randPars = c(invest = 'n', invest = 'n'). Should be randPars = c(invest = 'n').

Pretty sure neither matters though.

In terms of why the sigma terms are small in the mixed logit models, that doesn't seem that odd to me. The interaction model you have shows an insignificant interaction effect:

## Model Coefficients: 
##               Estimate Std. Error z-value  Pr(>|z|)    
## invest         11.3238     1.4386  7.8716 3.553e-15 ***
## savings       152.5849    18.7007  8.1593 4.441e-16 ***
## savings_group  33.4926    31.2658  1.0712    0.2841    
## invest_group   -1.1794     2.1722 -0.5429    0.5872

The standard errors on the interaction terms are huge relative to the estimates, so the interaction effects could just as well be zero, meaning there's no measurable difference between these groups. So it's not surprising to me that the mixed logit models suggest that there is little heterogeneity. The log-likelihood values are all the same too, so it looks like they're all converging to the same solutions.
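
To make the point about the interaction terms concrete, here is a minimal sketch that recomputes the z-values, two-sided p-values, and approximate 95% confidence intervals from the estimates and standard errors in the table above. It is written in plain Python with only the standard library, not with logitr; the helper name `wald_summary` is hypothetical, and the normal approximation with a critical value of about 1.96 is an assumption about how the reported p-values were obtained.

```python
import math

def wald_summary(estimate, std_error, z_crit=1.959964):
    """Return (z, two-sided p-value, approximate 95% CI) under a normal approximation."""
    z = estimate / std_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)), the two-sided normal p-value
    p = math.erfc(abs(z) / math.sqrt(2))
    ci = (estimate - z_crit * std_error, estimate + z_crit * std_error)
    return z, p, ci

# Estimates and standard errors copied from the interaction model output above
interaction_terms = {
    "savings_group": (33.4926, 31.2658),
    "invest_group": (-1.1794, 2.1722),
}

for name, (est, se) in interaction_terms.items():
    z, p, (lo, hi) = wald_summary(est, se)
    includes_zero = lo < 0 < hi
    print(f"{name}: z = {z:.4f}, p = {p:.4f}, "
          f"95% CI = [{lo:.2f}, {hi:.2f}], includes zero: {includes_zero}")
```

Both intervals span zero, which is another way of saying that the data cannot distinguish these group differences from no difference at all.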

mwussow commented 2 years ago

Thank you for pointing this out! I am puzzled how I could have missed it. I think this explains the model results.

jhelvy / logitr

Using logitr to estimate heterogeneity in customers time-preferences #28