jhelvy / logitr

Fast estimation of multinomial (MNL) and mixed logit (MXL) models in R with "Preference" space or "Willingness-to-pay" (WTP) space utility parameterizations
https://jhelvy.github.io/logitr/

Is there a minimum amount of data required for the logitr function to predict? #45

Closed midsummernightbreeze closed 1 year ago

midsummernightbreeze commented 1 year ago

Hello, I'm using the logitr function to calculate the WTP value for each user. Each user has around 20 observation sessions for both WTP and WTA values. However, the estimated models contain very large numbers and negative numbers, which do not seem right. I compared the results of the preference space model and the WTP space model: some users' results match, but others do not. This suggests that the WTP space model has not reached the global solution. But even the results of the preference space model do not seem right, so how can I solve this problem?

Thanks a lot!!

midsummernightbreeze commented 1 year ago

Also, is it all right if it produces negative values when the dataset contains no negative numbers? Thanks!

jhelvy commented 1 year ago

Hi, so if I'm understanding this correctly, you are separating your data into subsets by user, and then estimating separate models for each user? Is that correct?

If so, then one important issue is you likely won't be reaching a global solution for many of the WTP space models. WTP space models have a non-convex log-likelihood function, so you need to at least run a multistart to see if you can find a better solution. Set numMultiStarts = 10 or more, like in this example. That should help, but it is also not guaranteed to arrive at a global solution. For some individuals, you may have to run a lot more, perhaps 100 or more.
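
To illustrate why a multistart helps, here is a minimal sketch in Python. It is not logitr code: the objective is a made-up one-dimensional non-convex function standing in for a negative log-likelihood, and the names neg_log_lik, local_minimize, and multistart are hypothetical helpers invented for this example. A single local search can stop in the worse of the two minima, while repeating the search from several random starting points and keeping the best result is more likely (though still not guaranteed) to find the better one.

```python
import random


def neg_log_lik(x):
    """Toy non-convex objective (assumption: a stand-in for a negative log-likelihood)."""
    return x**4 - 3 * x**2 + x


def gradient(x):
    """Analytic derivative of the toy objective."""
    return 4 * x**3 - 6 * x + 1


def local_minimize(start, step=0.01, iters=2000):
    """Plain gradient descent: converges to whichever local minimum is downhill from start."""
    x = start
    for _ in range(iters):
        x -= step * gradient(x)
    return x


def multistart(num_starts, lower=-2.0, upper=2.0, seed=0):
    """Run the local search from random starting points and keep the lowest objective."""
    rng = random.Random(seed)
    runs = []
    for _ in range(num_starts):
        start = rng.uniform(lower, upper)
        x_hat = local_minimize(start)
        runs.append((neg_log_lik(x_hat), x_hat, start))
    return min(runs), runs


# One search from an unlucky start ends in the local minimum near x = 1.13.
single = local_minimize(1.5)

# Ten random starts: at least one lands in the basin of the better minimum near x = -1.30.
best, runs = multistart(10)

print(f"single start: x = {single:.3f}, objective = {neg_log_lik(single):.3f}")
print(f"multistart:   x = {best[1]:.3f}, objective = {best[0]:.3f}")
```

With a small sample per individual, the likelihood surface can have many such local optima, which is why more starts may be needed for some individuals than for others.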

But perhaps more importantly, this isn't a recommended approach for obtaining individual WTPs. A hierarchical model is probably a better approach. For example, you could estimate a mixed logit model on the entire sample, and then compute the individual WTPs from those results. I have not yet added code to make these computations (there's an open issue on this), but the math to do it is readily available (see that issue for details).
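
As a rough sketch of the kind of computation involved (this is the standard conditional-mean formula from the mixed logit literature, not necessarily the exact approach described in the linked issue): let \omega be the vector of WTPs, y_n and x_n the observed choices and attributes for individual n, f(\omega \mid \hat{\theta}) the estimated population distribution, and P(y_n \mid x_n, \omega) the product of logit probabilities over that individual's choice occasions. All symbols here are notation assumed for this illustration. The individual-level WTP is then the mean of the population distribution conditional on that person's choices, approximated by simulation with R draws:

```latex
\hat{\omega}_n
  = E\left[\omega \mid y_n, x_n, \hat{\theta}\right]
  = \frac{\int \omega \, P(y_n \mid x_n, \omega) \, f(\omega \mid \hat{\theta}) \, d\omega}
         {\int P(y_n \mid x_n, \omega) \, f(\omega \mid \hat{\theta}) \, d\omega}
  \approx \frac{\sum_{r=1}^{R} \omega^{(r)} \, P(y_n \mid x_n, \omega^{(r)})}
               {\sum_{r=1}^{R} P(y_n \mid x_n, \omega^{(r)})},
  \qquad \omega^{(r)} \sim f(\omega \mid \hat{\theta})
```

Each draw is weighted by how well it explains the individual's observed choices, so all individuals share the information in the pooled model instead of each being estimated from only about 20 observations.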

I'm not sure what you mean when you say the preference space models aren't right. Again, I wouldn't recommend estimating individual models on every person in your sample, especially for preference space models, since the resulting coefficients can't be directly compared across models due to potential differences in error scaling.
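
The scaling point can be shown with a small numeric sketch in Python. Everything in it is hypothetical (the attribute names feat and brand, the coefficient values, the scale factors, and the helper wtp_from_preference); it is not logitr output. In a logit model, the estimated preference space coefficients are the underlying coefficients multiplied by an unobserved scale factor, so two people with identical preferences but different error variance produce different coefficients. The ratio to the price coefficient cancels the scale, which is why WTPs are comparable across models while raw coefficients are not.

```python
def wtp_from_preference(coefs, price_coef):
    """WTP per attribute = attribute coefficient divided by the negative of the price coefficient."""
    return {name: value / -price_coef for name, value in coefs.items()}


# Hypothetical underlying preferences shared by two users.
base_coefs = {"feat": 0.5, "brand": 1.0}
base_price = -0.25

# Hypothetical error scales: user B's choices are less noisy, so every
# estimated coefficient is inflated by the same factor.
scale_a, scale_b = 1.0, 4.0

coefs_a = {k: scale_a * v for k, v in base_coefs.items()}
coefs_b = {k: scale_b * v for k, v in base_coefs.items()}
price_a = scale_a * base_price
price_b = scale_b * base_price

wtp_a = wtp_from_preference(coefs_a, price_a)
wtp_b = wtp_from_preference(coefs_b, price_b)

print("preference coefficients A:", coefs_a, "price:", price_a)
print("preference coefficients B:", coefs_b, "price:", price_b)
print("WTP A:", wtp_a)
print("WTP B:", wtp_b)
```

The printed coefficients differ by a factor of four between the two users, while the WTPs are identical.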

midsummernightbreeze commented 1 year ago

Thanks a lot for your quick reply! I've set numMultiStarts to 100, but just as you said, this approach is unsuitable for obtaining individual WTPs. I'll see whether a hierarchical model is applicable to my current data set. I'll check the issue you mentioned and see if I can fix the problem...