arteagac / xlogit

A Python package for GPU-accelerated estimation of mixed logit models.
https://xlogit.readthedocs.io
GNU General Public License v3.0

Estimating random utility models from Bruhin et al. (2019) #10

armoutihansen opened this issue 2 years ago

armoutihansen commented 2 years ago

First of all: Thanks a lot for this very useful package!

I was wondering whether it would be possible to estimate the random utility models in Bruhin et al. (2019) as mixed logits using xlogit? The authors already estimate the utility models as standard logits and finite mixtures. Specifically, they estimate simple social preference models based on a panel of binary dictator and reciprocity games. In a binary dictator game, the subject is randomly matched with another subject and has to choose between two allocations. The utility of a given allocation is given by:

$$U^A = (1 - \alpha s - \beta r)\,\Pi^A + (\alpha s + \beta r)\,\Pi^B \tag{1}$$

In a binary reciprocity game, the subject faces the same decision, but the matched subject first makes either a kind or an unkind decision. Hence, the utility here is given by:

$$U^A = (1 - \alpha s - \beta r - \gamma q - \delta v)\,\Pi^A + (\alpha s + \beta r + \gamma q + \delta v)\,\Pi^B \tag{2}$$

Based on this, the probability of choosing one allocation over the other is given by:

$$\Pr(A) = \frac{\exp(\sigma U^A)}{\exp(\sigma U^A) + \exp(\sigma U^B)} \tag{4}$$

where $\sigma$ denotes the choice sensitivity (scale) parameter.

It is not clear to me whether such a functional form, both as standard logit and mixed logit, can be estimated by xlogit.

Many thanks and best, Jesper

arteagac commented 2 years ago

Hi @armoutihansen! My understanding is that your goal is to estimate the parameters $\alpha$ and $\beta$. If this is the case, perhaps you can reframe the utility specification as shown below (the same can be done for the utility in Equation (2)). The utility specification below could be used in xlogit to estimate a Mixed Logit model with $\alpha$ and $\beta$ as random parameters.

$$U^A = (1 - \alpha s - \beta r)\Pi^A + (\alpha s + \beta r)\Pi^B$$

$$U^A = \Pi^A - \alpha s\Pi^A - \beta r\Pi^A + \alpha s\Pi^B + \beta r\Pi^B$$

$$U^A = \Pi^A + \alpha(s\Pi^B - s\Pi^A) + \beta(r\Pi^B - r\Pi^A)$$

Let me know if this utility specification is valid, and I can further guide you on how to estimate it in xlogit.

armoutihansen commented 2 years ago

Hi @arteagac! Thanks a lot for the quick response.

That is indeed correct. The goal would be to estimate $(\alpha,\beta)$ in

$$U^A=\Pi^A+\alpha(\Pi^B s - \Pi^A s)+\beta(\Pi^B r - \Pi^A r)$$

in the first specification as well as $(\alpha,\beta,\gamma,\delta)$ in

$$U^A=\Pi^A+\alpha(\Pi^B s - \Pi^A s)+\beta(\Pi^B r - \Pi^A r)+\gamma(\Pi^B q - \Pi^A q)+\delta(\Pi^B v - \Pi^A v)$$

in the second. Furthermore, for both of these, the goal would also be to estimate the choice sensitivity/scale parameter $\sigma$ given in (4) above.

arteagac commented 2 years ago

Hi @armoutihansen. Great! It seems that xlogit can indeed help you with this estimation. You simply need to prepare a dataset that pre-computes $s(\Pi^B - \Pi^A)$, $r(\Pi^B - \Pi^A)$, $q(\Pi^B - \Pi^A)$, and $v(\Pi^B - \Pi^A)$, and then use these as input columns in xlogit to estimate the model (see estimation examples here). If the examples are not clear or if you need any further help, please do not hesitate to let me know.

For the sensitivity/scale parameter $\sigma$, I think you can frame it as a WTP-type problem, in which the willingness to pay is modeled using a scale factor (usually the negative of the price), which makes the problem non-linear. Fortunately, xlogit also supports this type of estimation. For this, you simply need to pass scale_factor as an argument to the fit function.
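A minimal sketch of what this could look like, assuming a long-format DataFrame with one row per alternative (A or B) per game; the file name and the columns id, alt, choice, subject, pi, and x1..x4 are hypothetical placeholders for your own data:

```python
import pandas as pd
from xlogit import MultinomialLogit, MixedLogit

# Hypothetical long-format data: one row per alternative (A/B) per game.
# Assumed columns: id (choice situation), alt ('A'/'B'), choice (1/0),
# subject (panel id), pi (payoff of the alternative), and the pre-computed
# interactions x1 = s*(pi_B - pi_A), ..., x4 = v*(pi_B - pi_A).
df = pd.read_csv("bruhin_games_long.csv")
varnames = ["x1", "x2", "x3", "x4"]

# Standard logit, with the payoff as scale factor (WTP-style estimation of sigma)
mnl = MultinomialLogit()
mnl.fit(X=df[varnames], y=df["choice"], varnames=varnames,
        ids=df["id"], alts=df["alt"], scale_factor=df["pi"])
mnl.summary()

# Mixed logit with normally distributed alpha..delta and panel structure
mxl = MixedLogit()
mxl.fit(X=df[varnames], y=df["choice"], varnames=varnames,
        ids=df["id"], alts=df["alt"], panels=df["subject"],
        randvars={"x1": "n", "x2": "n", "x3": "n", "x4": "n"},
        scale_factor=df["pi"], n_draws=600)
mxl.summary()
```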

armoutihansen commented 2 years ago

Hi @arteagac. Thank you very much for the help!

For the standard (multinomial) logit, I managed to reproduce the results of Bruhin et al. (2019):

[screenshot: multinomial logit estimation output]

In the output above, $(x_1,\dots,x_4)=(s(\Pi^B-\Pi^A),\dots,v(\Pi^B-\Pi^A))$, as you suggested in your previous reply. However, the signs of $(x_1,\dots,x_4)$ are the opposite of those in Bruhin et al. I tried setting scale_factor=-df_long['pi'] to add instead of subtract, but this returned different parameter estimates and a much lower log-likelihood. Do you know how to deal with this? Also: it is not (yet) possible to cluster the standard errors, right?

For the mixed logit, I did not manage to get such encouraging results:

[screenshot: mixed logit estimation output]

It is not clear to me why the log-likelihood is so much lower than that of the multinomial logit. Also, if I omit the scale factor, the log-likelihood is almost halved in absolute value. I am wondering whether this is due to the scale factor not being random in the current estimation. I saw that this point was discussed in your WTP issue, but I could not find any reference to randomising the scale factor in the documentation. Is it correct that it is not (yet) possible to let the scale factor be random?

Again, thank you very much for the help!

arteagac commented 2 years ago

Hi @armoutihansen, I am not sure what might be causing the flip in signs compared to Bruhin et al., but I assume it might be some minor issue in the pre-processing of x1, x2, x3, and x4. Make sure you did not flip any operations during the data preparation. Regarding clustered errors, unfortunately xlogit does not support those yet.

Regarding your mixed logit model, I think something is off because the log-likelihood is extremely low. This is a potential symptom of non-convergence, which is unsurprising for this type of non-linear WTP-like model. You are right, xlogit does not yet support a random scale factor, but I think this is not the cause of the issue. I would advise running multiple estimations from different starting points by passing them to the init_coeff argument of the fit function. Also, if a random scale_factor is critical for your case, perhaps you can take a look at the logitr R package (https://jhelvy.github.io/logitr/), which supports this feature.
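For illustration, a hedged sketch of such a multi-start loop, reusing the hypothetical setup from above; it assumes nine coefficients in total (four means, four standard deviations, and the scale coefficient) and that the fitted model exposes its log-likelihood as the loglikelihood attribute:

```python
import numpy as np
from xlogit import MixedLogit

n_coeff = 9  # 4 means + 4 std. devs. + 1 scale coefficient (adjust to your spec)
best = None
for seed in range(10):
    rng = np.random.default_rng(seed)
    model = MixedLogit()
    model.fit(X=df[varnames], y=df["choice"], varnames=varnames,
              ids=df["id"], alts=df["alt"], panels=df["subject"],
              randvars={"x1": "n", "x2": "n", "x3": "n", "x4": "n"},
              scale_factor=df["pi"], n_draws=600,
              init_coeff=rng.normal(0, 0.1, n_coeff),
              random_state=seed, verbose=0)
    # keep the run with the highest log-likelihood
    if best is None or model.loglikelihood > best.loglikelihood:
        best = model
best.summary()
```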

armoutihansen commented 2 years ago

Okay. Thanks a lot! I will give logitr a try for the clustered standard errors and random scale factor.

arteagac commented 2 years ago

Sure, please let me know how it goes.

armoutihansen commented 2 years ago

Hi @arteagac, I just wanted to quickly update you on my progress.

First of all, I realised that I do not need to specify my model in WTP space in order to estimate the scale parameter. In particular, I can just estimate:

$$\frac{1}{\sigma}\Pi^A +\frac{\alpha}{\sigma}(s\Pi^B-s\Pi^A)+\frac{\beta}{\sigma}(r\Pi^B-r\Pi^A)+\frac{\gamma}{\sigma}(q\Pi^B-q\Pi^A)+\frac{\delta}{\sigma}(v\Pi^B-v\Pi^A)$$

and then multiply the estimated $\alpha/\sigma$, $\beta/\sigma$, $\gamma/\sigma$, and $\delta/\sigma$ by $\sigma$ after estimation. Naturally, $\frac{1}{\sigma}$ should be positive, so I specify a truncated normal for it (I assume this is identical to the zero-censored normal in logitr; correct me if I am wrong) and normal distributions for the other parameters. Based on this, I was able to achieve convergence in logitr, but not in xlogit.
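In xlogit, this specification could look roughly like the sketch below (hypothetical column names as before; self holds the alternative's payoff $\Pi$, so its coefficient estimates $1/\sigma$, and 'tn' is xlogit's truncated normal):

```python
from xlogit import MixedLogit

varnames = ["self", "x1", "x2", "x3", "x4"]
model = MixedLogit()
model.fit(X=df[varnames], y=df["choice"], varnames=varnames,
          ids=df["id"], alts=df["alt"], panels=df["subject"],
          randvars={"self": "tn", "x1": "n", "x2": "n",
                    "x3": "n", "x4": "n"},
          n_draws=1000)
model.summary()

# Rough point estimates of alpha..delta: each estimated mean is alpha/sigma
# etc., so divide by the coefficient on 'self' (which estimates 1/sigma).
coefs = dict(zip(model.coeff_names, model.coeff_))
alpha_hat = coefs["x1"] / coefs["self"]
```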

logitr:

[screenshot: logitr mixed logit estimation output]

Log-likelihood: -3,819.24

xlogit:

[screenshot: xlogit mixed logit estimation output]

In the outputs above, self is $\frac{1}{\sigma}$, so it is not clear to me why it is negative when I specify tn. Perhaps I have misunderstood how the truncated normal is used? If I specify it as log-normally distributed, I am far from convergence, with a log-likelihood of approximately -50,000.

arteagac commented 2 years ago

Hi @armoutihansen, thanks a lot for the update. Your feedback helps me keep improving xlogit. I took a look at logitr's censored normal implementation, and it seems to be the same as xlogit's truncated normal (I will double-check whether it is better to rename tn to cn in xlogit for consistency with logitr). Technically, both should provide the same results, but I am not sure why xlogit fails to converge. I know logitr uses the L-BFGS-B optimization routine by default, whereas xlogit defaults to BFGS. Perhaps you can try setting xlogit to use L-BFGS-B via the optim_method='L-BFGS-B' parameter in the fit method. L-BFGS-B is more robust, so perhaps this can help with the convergence issue.
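For reference, a minimal sketch of that change, under the same hypothetical setup as in the earlier snippets:

```python
from xlogit import MixedLogit

# Re-fit with the (generally more robust) L-BFGS-B optimization routine
model = MixedLogit()
model.fit(X=df[varnames], y=df["choice"], varnames=varnames,
          ids=df["id"], alts=df["alt"], panels=df["subject"],
          randvars={"self": "tn", "x1": "n", "x2": "n",
                    "x3": "n", "x4": "n"},
          n_draws=1000, optim_method="L-BFGS-B")
model.summary()
```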