@Ales-G closed this issue 2 years ago.
Thanks @Ales-G! Yes, I think there's an error in the computation of the clustered errors in logitr. I'm looking into the source of the problem, but this is really helpful as it gives me an example to compare against. I'll follow up as I search, but for now I would use the Stata results.
Alright, I've narrowed the problem down. It has to do with scaling the data. By default the `scaleInputs` argument is set to `TRUE` (this helps make the optimization more stable), but somewhere in one of my updates I introduced an error where the standard errors are not correctly re-scaled post-estimation. So if you just set `scaleInputs = FALSE` you should get the correct standard errors:
mnl_pref_yog <- logitr(
  data = yogurt,
  outcome = "choice",
  obsID = "obsID",
  pars = c("price", "feat", "brand"),
  clusterID = "id",
  scaleInputs = FALSE
)
coef(summary(mnl_pref_yog))
Estimate Std. Error z-value Pr(>|z|)
price -0.3666739 0.05423654 -6.760642 1.373812e-11
feat 0.4916524 0.19390366 2.535550 1.122709e-02
brandhiland -3.7156066 0.35365213 -10.506388 0.000000e+00
brandweight -0.6411729 0.44725879 -1.433561 1.516975e-01
brandyoplait 0.7349638 0.27721575 2.651234 8.019827e-03
I'm working on a patch to fix this.
Okay, I think I fixed this. If you re-install the package from GitHub and try the same model, you should get the same results:
Install the dev version of the package (0.5.1):
# install.packages("remotes")
remotes::install_github("jhelvy/logitr")
Estimate the model
mnl_pref_yog <- logitr(
  data = yogurt,
  outcome = "choice",
  obsID = "obsID",
  pars = c("price", "feat", "brand"),
  clusterID = "id"
)
coef(summary(mnl_pref_yog))
Estimate Std. Error z-value Pr(>|z|)
price -0.3665546 0.05422209 -6.760244 1.377587e-11
feat 0.4914392 0.19388663 2.534673 1.125523e-02
brandhiland -3.7154773 0.35363731 -10.506463 0.000000e+00
brandweight -0.6411384 0.44722326 -1.433598 1.516870e-01
brandyoplait 0.7345195 0.27720045 2.649777 8.054482e-03
Dear @jhelvy, thanks a lot for looking into this; the clustering now works with the multinomial logit. However, I have tried running a mixed logit and the standard errors still appear to be implausibly large. Is it possible that there is still a problem with clustering in the mixed logit?
mxl_pref_yog <- logitr(
  data = yogurt,
  outcome = "choice",
  obsID = "obsID",
  pars = c("price", "feat", "brand"),
  randPars = c(price = "ln", feat = "n", brand = "n"),
  clusterID = "id"
)
summary(mxl_pref_yog)
Model Coefficients:
Estimate Std. Error z-value Pr(>|z|)
price_mu -7.2574e-01 1.6353e+04 0.0000 1.0000
feat_mu 1.1882e+00 3.4152e+01 0.0348 0.9722
brandhiland_mu -3.1898e+00 5.6172e+01 -0.0568 0.9547
brandweight_mu -3.2192e+00 8.7247e+01 -0.0369 0.9706
brandyoplait_mu -1.1540e-01 9.0434e+00 -0.0128 0.9898
price_sigma 6.9807e-03 1.8764e+03 0.0000 1.0000
feat_sigma -3.4040e+00 7.5582e+01 -0.0450 0.9641
brandhiland_sigma -7.0209e-01 7.0273e+01 -0.0100 0.9920
brandweight_sigma 5.2986e+00 1.3624e+02 0.0389 0.9690
brandyoplait_sigma 8.0490e-01 2.3802e+02 0.0034 0.9973
Oh no! I thought I had fixed that too... I'll go check some more.
Okay, I looked at things again, and I believe the code is all correct.
For this specific example, the errors are large because the algorithm has reached a local minimum. I can tell because the log-likelihood is about -2787. This often happens when using log-normally distributed parameters. If, for example, you change the price parameter to a normal distribution, the algorithm reaches a solution where the log-likelihood is about -2640. Here's what I get:
mxl_pref_yog <- logitr(
  data = yogurt,
  outcome = "choice",
  obsID = "obsID",
  pars = c("price", "feat", "brand"),
  randPars = c(price = "n", feat = "n", brand = "n"),
  clusterID = "id",
  numMultiStarts = 10
)
coef(summary(mxl_pref_yog))
Estimate Std. Error z-value Pr(>|z|)
price_mu -0.4674854 0.1357989 -3.4424816 0.0005764031
feat_mu 0.6522535 0.3156426 2.0664302 0.0387878780
brandhiland_mu -4.3428303 1.0188227 -4.2625967 0.0000202065
brandweight_mu -1.6754820 1.1016610 -1.5208690 0.1282927094
brandyoplait_mu 0.9453728 0.3261557 2.8985321 0.0037491389
price_sigma -0.1311470 0.2213794 -0.5924084 0.5535771241
feat_sigma 2.4859074 0.6583294 3.7760843 0.0001593131
brandhiland_sigma -0.1492186 0.4145413 -0.3599607 0.7188765041
brandweight_sigma -2.2336045 1.2324079 -1.8123906 0.0699258585
brandyoplait_sigma 0.1550451 0.1818241 0.8527203 0.3938144317
Notice too that I added a `numMultiStarts = 10` argument. This runs the algorithm 10 times from 10 sets of random starting points, which helps avoid local minima. When using log-normal parameters, I sometimes have to search over 100 or more multistarts to avoid local minima. If a better solution isn't reached by then, chances are the chosen distribution just isn't a good fit for the data.
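The multistart idea itself is generic: run a local optimizer from several different starting points and keep the solution with the best objective value. Here is a minimal sketch of the concept (in Python, with a toy one-dimensional objective; an illustration only, not logitr's actual optimizer):

```python
# Multistart sketch: gradient descent on a toy objective with two local
# minima, f(x) = x^4 - 3x^2 + x. A single start can settle in the
# shallower minimum near x ~ 1.13; multiple starts recover the global
# minimum near x ~ -1.30. (Illustration only, not logitr internals.)

def f(x):
    return x**4 - 3 * x**2 + x

def grad(x):
    return 4 * x**3 - 6 * x + 1

def local_search(x, lr=0.02, steps=1000):
    # crude gradient descent standing in for a local optimizer
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Deterministic grid of starting points (logitr uses random starts).
starts = [-2 + 0.4 * i for i in range(11)]
solutions = [local_search(x0) for x0 in starts]
best = min(solutions, key=f)  # keep the run with the best objective
```

Starts on the wrong side of the hump settle in the shallower minimum; scanning several starts and keeping the best recovers the deeper one, which is exactly what `numMultiStarts` automates.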
Dear @jhelvy, thanks a lot for clarifying this; it makes sense. I thought the problem was still related to clustering, but evidently that is not the case! Just as a suggestion: what log-likelihood values should ring a bell and indicate that I have probably reached a local minimum?
Thanks again for solving this so quickly!
No problem, and thanks for catching this!
If you don't mind, could you run some similar models in Stata to double-check the solutions? I don't have a license, so I can't make the comparison myself. I believe there are mixed logit commands for Stata. Note that for now the mixed logit models in logitr have uncorrelated heterogeneity: they assume a diagonal covariance matrix for the mixed terms (the off-diagonals are 0). I'm working on supporting correlation.
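For intuition, "uncorrelated heterogeneity" means the random coefficients are drawn from a distribution whose covariance matrix is diagonal. A small numpy sketch of the distinction (the numbers are made up for illustration; this is not logitr's internal code):

```python
import numpy as np

# Uncorrelated heterogeneity: each random coefficient has its own
# variance, but all covariances between coefficients are fixed at zero.
mu = np.array([-0.47, 0.65, 0.95])      # illustrative means
sigma = np.array([0.13, 2.49, 0.16])    # illustrative standard deviations

cov_uncorrelated = np.diag(sigma**2)    # diagonal covariance matrix

# Correlated heterogeneity (e.g. Stata's `corr` option) fills in the
# off-diagonals, typically parameterized via a Cholesky factor L:
L = np.array([[0.13,  0.00, 0.00],
              [0.40,  2.40, 0.00],
              [0.05, -0.10, 0.10]])
cov_correlated = L @ L.T                # full (symmetric) covariance

# Simulated draws of the random coefficients under the diagonal assumption:
rng = np.random.default_rng(0)
draws_uncorr = rng.multivariate_normal(mu, cov_uncorrelated, size=1000)
```

With the diagonal matrix, a respondent's taste for one attribute tells you nothing about their taste for another; the Cholesky parameterization relaxes that.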
So, here are the results from logitr using a log-normal distribution for the price parameter.
mxl_pref_yog_ln <- logitr(
  data = yogurt,
  outcome = "choice",
  obsID = "obsID",
  pars = c("price", "feat", "brand"),
  randPars = c(price = "ln", feat = "n", brand = "n"),
  modelSpace = "pref",
  clusterID = "id",
  numMultiStarts = 10
)
summary(mxl_pref_yog_ln)
Model Coefficients:
Estimate Std. Error z-value Pr(>|z|)
price_mu -0.523603 473.733223 -0.0011 0.9991
feat_mu 1.564994 29.462348 0.0531 0.9576
brandhiland_mu -2.888755 4.148376 -0.6964 0.4862
brandweight_mu -3.619802 46.385471 -0.0780 0.9378
brandyoplait_mu -0.614410 73.998085 -0.0083 0.9934
price_sigma 0.020436 183.291541 0.0001 0.9999
feat_sigma 3.239503 42.335816 0.0765 0.9390
brandhiland_sigma 0.236571 88.789261 0.0027 0.9979
brandweight_sigma -5.087253 58.426740 -0.0871 0.9306
brandyoplait_sigma -1.316474 194.592629 -0.0068 0.9946
The results of estimating the same model in Stata appear to be different.
use "yogurt.dta", clear
encode brand, gen (en_brand)
mixlogit choice, rand(feat en_brand price) ln(1) nrep(10) group(obsID) id(id) cluster(id)
Mixed logit model Number of obs = 9,648
Wald chi2(3) = 69.60
Log likelihood = -2023.0267 Prob > chi2 = 0.0000
(Std. err. adjusted for 100 clusters in id)
------------------------------------------------------------------------------
| Robust
choice | Coefficient std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
Mean |
feat | 1.165762 .2269738 5.14 0.000 .7209012 1.610622
en_brand | -.0425926 .0856924 -0.50 0.619 -.2105465 .1253613
price | -2.807109 .5081851 -5.52 0.000 -3.803133 -1.811084
-------------+----------------------------------------------------------------
SD |
feat | .0935602 .2438872 0.38 0.701 -.3844501 .5715704
en_brand | 1.6778 .2386911 7.03 0.000 1.209974 2.145626
price | -1.864972 .3684935 -5.06 0.000 -2.587206 -1.142738
------------------------------------------------------------------------------
The sign of the estimated standard deviations is irrelevant: interpret them as being positive.
Note that I am not using the `corr` option in Stata, which allows correlated heterogeneity.
It appears that there are two issues: first, the estimated coefficients do not coincide; second, the clustered standard errors are much narrower in Stata.
I have also tried estimating the two models with normally distributed parameters and without clustered standard errors. While the SE problem appears to go away, the estimated coefficients still appear to be very different.
*STATA
mixlogit choice, rand(feat en_brand price) nrep(10) group(obsID) id(id)
Mixed logit model Number of obs = 9,648
LR chi2(3) = 2338.09
Log likelihood = -2054.4019 Prob > chi2 = 0.0000
------------------------------------------------------------------------------
choice | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
Mean |
feat | 1.049358 .1505567 6.97 0.000 .7542725 1.344444
en_brand | .1127672 .0417572 2.70 0.007 .0309246 .1946098
price | .0691454 .0236047 2.93 0.003 .022881 .1154097
-------------+----------------------------------------------------------------
SD |
feat | -.1293079 .3007661 -0.43 0.667 -.7187987 .4601829
en_brand | 1.296878 .0485107 26.73 0.000 1.201799 1.391957
price | .4603082 .0293819 15.67 0.000 .4027208 .5178956
------------------------------------------------------------------------------
The sign of the estimated standard deviations is irrelevant: interpret them as being positive.
# R
mxl_pref_yog <- logitr(
  data = yogurt,
  outcome = "choice",
  obsID = "obsID",
  pars = c("price", "feat", "brand"),
  randPars = c(price = "n", feat = "n", brand = "n"),
  modelSpace = "pref",
  numMultiStarts = 10
)
summary(mxl_pref_yog)
Model Coefficients:
Estimate Std. Error z-value Pr(>|z|)
price_mu -0.468283 0.048805 -9.5949 < 2.2e-16 ***
feat_mu 0.650771 0.232704 2.7966 0.0051650 **
brandhiland_mu -4.356656 0.403941 -10.7854 < 2.2e-16 ***
brandweight_mu -1.678958 0.456975 -3.6741 0.0002387 ***
brandyoplait_mu 0.947794 0.143627 6.5990 4.140e-11 ***
price_sigma -0.131491 0.091383 -1.4389 0.1501782
feat_sigma 2.482720 0.536940 4.6238 3.767e-06 ***
brandhiland_sigma -0.186607 0.859620 -0.2171 0.8281458
brandweight_sigma -2.239907 0.692509 -3.2345 0.0012186 **
brandyoplait_sigma 0.159572 0.431990 0.3694 0.7118383
Am I doing something wrong?
In any case, your package is much quicker than Stata which is great.
Okay, thanks for these comparisons - this is super helpful! But I'm not sure these are apples-to-apples comparisons.
First, what is the `en_brand` parameter in the Stata models? There are 4 brands, so there should be 3 mean and 3 SD parameters - one for each brand, with one brand set as the reference level. In the logitr models, you see these coefficients for each brand, with dannon set as the reference level.
Second, I forgot to mention that since this data set has a panel structure, we should have been including `panelID = 'id'` in every model. (I actually need to update my documentation, as the examples I show should include this.) I'm not sure whether the panel structure is being included in the Stata models you ran?
Finally, when you run these models, can you include the log-likelihood values in the results too? That's a good quick check of whether the differences are just due to the models reaching different solutions, which is certainly possible. In general I think Stata reaches a consistent solution more frequently than logitr, but that's why I built in the `numMultiStarts` feature. It's also why I focused on optimizing the package for speed, so that you can estimate 100 or so models pretty quickly. This is especially important for WTP space models, which tend to diverge more often.
First of all, thanks for pointing out the panelID argument; that is indeed very helpful for using logitr. In Stata I was specifying that the dataset is a panel using the `id(id)` option.
Second, apologies for the earlier Stata estimation: as you correctly spotted, I was not including the brand as a dummy. I ran everything quickly and failed to spot the mistake.
This is the correct estimation in Stata. Note that mixlogit requires multiplying the price variable by -1 because "log-normally distributed coefficients should have positive coefficients in the conditional logit model". In the yogurt data, price is a cost, not an income, hence I should multiply it by -1; Stata produces an error otherwise.
use "yogurt.dta", clear
quietly tab brand, gen(brandum) //generate dummies
gen mprice=-price // generate the -price because coefficients are assumed to be positively distributed
mixlogit choice, rand(feat brandum2-brandum4 mprice) ln(1) nrep(100) group(obsID) id(id) cluster(id)
Mixed logit model Number of obs = 9,648
Wald chi2(5) = 268.31
Log likelihood = -1251.3074 Prob > chi2 = 0.0000
(Std. err. adjusted for 100 clusters in id)
------------------------------------------------------------------------------
| Robust
choice | Coefficient std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
Mean |
feat | 1.239227 .3501429 3.54 0.000 .5529592 1.925494
brandum2 | -5.395541 .5076889 -10.63 0.000 -6.390593 -4.400489
brandum3 | -1.407874 .4732163 -2.98 0.003 -2.335361 -.4803867
brandum4 | .0078784 .2534039 0.03 0.975 -.4887841 .5045408
mprice | -.9232874 .2344312 -3.94 0.000 -1.382764 -.4638106
-------------+----------------------------------------------------------------
SD |
feat | 1.418924 .4383523 3.24 0.001 .5597698 2.278079
brandum2 | 2.234836 .2612979 8.55 0.000 1.722702 2.746971
brandum3 | 6.909907 1.104808 6.25 0.000 4.744522 9.075291
brandum4 | 3.314474 .303403 10.92 0.000 2.719815 3.909132
mprice | .4541621 .1691712 2.68 0.007 .1225927 .7857315
------------------------------------------------------------------------------
For reference, here is the estimation with logitr using 100 multistarts and with panelID specified. It still appears that the coefficients are different and that the standard errors on price are unreasonably large.
mxl_pref_yog_ln <- logitr(
  data = yogurt,
  outcome = "choice",
  obsID = "obsID",
  panelID = "id",
  clusterID = "id",
  pars = c("price", "feat", "brand"),
  randPars = c(price = "ln", feat = "n", brand = "n"),
  modelSpace = "pref",
  numMultiStarts = 100
)
summary(mxl_pref_yog_ln)
Model Coefficients:
Estimate Std. Error z-value Pr(>|z|)
price_mu -0.577539 1288.475860 -0.0004 0.9996424
feat_mu 1.398752 0.314016 4.4544 8.413e-06 ***
brandhiland_mu -4.417341 0.699108 -6.3185 2.640e-10 ***
brandweight_mu -3.426075 0.595650 -5.7518 8.828e-09 ***
brandyoplait_mu 1.375238 0.401803 3.4227 0.0006201 ***
price_sigma -0.011086 1777.104471 0.0000 0.9999950
feat_sigma 0.213427 0.201465 1.0594 0.2894295
brandhiland_sigma -2.667753 0.462004 -5.7743 7.727e-09 ***
brandweight_sigma 4.190470 0.740094 5.6621 1.496e-08 ***
brandyoplait_sigma 4.153300 1.058661 3.9232 8.739e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log-Likelihood: -1336.2405915
Null Log-Likelihood: -3301.3547078
AIC: 2692.4811829
BIC: 2750.3633000
McFadden R2: 0.5952448
Adj McFadden R2: 0.5922157
Number of Observations: 2412.0000000
Number of Clusters 100.0000000
I have also estimated a model using logitr after multiplying the price variable by -1.
yogurt$mprice <- -1 * yogurt$price
mxl_pref_yog_ln <- logitr(
  data = yogurt,
  outcome = "choice",
  obsID = "obsID",
  panelID = "id",
  clusterID = "id",
  pars = c("mprice", "feat", "brand"),
  randPars = c(mprice = "ln", feat = "n", brand = "n"),
  modelSpace = "pref",
  numMultiStarts = 100
)
summary(mxl_pref_yog_ln)
Model Coefficients:
Estimate Std. Error z-value Pr(>|z|)
mprice_mu 0.111951 0.011265 9.9378 < 2.2e-16 ***
feat_mu 0.852527 0.297664 2.8641 0.004183 **
brandhiland_mu -5.306878 0.698951 -7.5926 3.131e-14 ***
brandweight_mu -4.016736 0.630501 -6.3707 1.882e-10 ***
brandyoplait_mu 2.482373 4.589153 0.5409 0.588561
mprice_sigma -0.037894 0.022926 -1.6529 0.098354 .
feat_sigma 0.120210 0.531940 0.2260 0.821213
brandhiland_sigma -1.948122 0.443001 -4.3976 1.095e-05 ***
brandweight_sigma 4.855393 0.671178 7.2341 4.685e-13 ***
brandyoplait_sigma 4.571326 10.144816 0.4506 0.652273
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log-Likelihood: -1217.7184021
Null Log-Likelihood: -3416.9436545
AIC: 2455.4368041
BIC: 2513.3189000
McFadden R2: 0.6436235
Adj McFadden R2: 0.6406969
Number of Observations: 2412.0000000
Number of Clusters 100.0000000
Now the standard errors appear more reasonable, but the point estimates are still different. I'm not sure what is going on.
I hope this time I estimated the two models correctly!
Okay, this is all looking a lot more reasonable. I completely forgot about the -1*price issue. This makes total sense of why you were getting rather nonsensical results when modeling price as log-normal: the price parameter should be negative, so if you're forcing positivity with a log-normal distribution you have to use the negative of price. That's also why your first run with a log-normal price had trouble converging.
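The positivity point is easy to demonstrate: a log-normal coefficient exp(b + s*z) is positive for every draw of z, whatever b and s are. A tiny sketch (in Python; b and s are arbitrary illustrative values, not estimates from this thread):

```python
import math
import random

# A log-normal coefficient exp(b + s*z) is positive for every draw of z,
# so a price coefficient that should be negative can never be recovered
# unless price is first multiplied by -1. (b and s are made-up values.)
random.seed(1)
b, s = -0.5, 0.4
draws = [math.exp(b + s * random.gauss(0, 1)) for _ in range(1000)]

print(min(draws) > 0)  # True: every simulated coefficient is positive
```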
The last logitr model you ran looks correct to me, and I believe that is a correct comparison to Stata. I'm a bit confused why the price parameter in Stata is a negative number though. If you've set price to the negative of price, then the mean parameter should be positive (as it is in the logitr solution). This makes me wonder if Stata found a local minimum.
Otherwise, the results are much more similar. The brand coefficients are close, and the order of magnitude of the other parameters is about the same. It's still not a perfect match, though, as the algorithms reached different solutions: the logitr LL is -1217 while the Stata LL is -1251. So logitr found a "better" solution in terms of fit.
You could also just try to compare the results with only one run in logitr (no multistart). The first run starts from parameters set to 0, which I believe is what Stata does, so that might get you closer to a similar solution.
At this point, I'm pretty sure the differences are just due to reaching different local minima.
Dear @jhelvy, thanks a lot for all of your support on this issue, and thanks again for working so hard on such a splendid package. I think I know why the price parameter is negative here. According to Hole (2007): "The estimated price parameters in the above model are the mean (b_p) and standard deviation (s_p) of the natural logarithm of the price coefficient. The median, mean, and standard deviation of the coefficient itself are given by exp(b_p), exp(b_p + s_p^2/2), and exp(b_p + s_p^2/2) × sqrt(exp(s_p^2) − 1), respectively (Train 2009)". Hence the price coefficient is
------------------------------------------------------------------------------
choice | Coefficient Std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
mean_price | -.440363 .0774683 -5.68 0.000 -.5921981 -.2885279
med_price | -.3972111 .0931187 -4.27 0.000 -.5797204 -.2147018
sd_price | .2107663 .0678802 3.10 0.002 .0777235 .3438092
------------------------------------------------------------------------------
hence the coefficient on price is positive. I guess logitr already applies the transformation!
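Hole's formulas can be checked directly against the Stata output above. Plugging in the mprice estimates from the clustered log-normal model (mean -0.9232874, SD 0.4541621), a quick sketch (in Python, purely for the arithmetic) reproduces the transformed table:

```python
import math

# b_p and s_p: mean and sd of log(-beta_price), taken from the mprice
# row of the Stata mixlogit output earlier in the thread.
b_p = -0.9232874
s_p = 0.4541621

median = math.exp(b_p)                        # median of -beta_price
mean = math.exp(b_p + s_p**2 / 2)             # mean of -beta_price
sd = mean * math.sqrt(math.exp(s_p**2) - 1)   # sd of -beta_price

# Flipping the sign back reproduces Stata's transformed table:
# med_price ~ -0.3972, mean_price ~ -0.4404, sd_price ~ 0.2108
print(round(-median, 4), round(-mean, 4), round(sd, 4))
```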
The thing that remains unclear to me is why the results remain so different between the two programs. Regardless of the number of multistarts I set, the results still appear to be significantly different, especially for the price variable. As you can imagine, this is essential for WTP estimation. For comparison, I am posting the logitr estimates and the Stata estimates, both calculated with 1000 multistarts. I do not know, but it could be a Stata issue...
# R
Model Coefficients:
Estimate Std. Error z-value Pr(>|z|)
mprice_mu 0.111954 0.011272 9.9324 < 2.2e-16 ***
feat_mu 0.852398 0.297662 2.8636 0.004188 **
brandhiland_mu -5.307112 0.699001 -7.5924 3.131e-14 ***
brandweight_mu -4.016775 0.630861 -6.3671 1.926e-10 ***
brandyoplait_mu 2.482547 4.588990 0.5410 0.588522
mprice_sigma -0.037894 0.022916 -1.6536 0.098205 .
feat_sigma 0.120193 0.531863 0.2260 0.821213
brandhiland_sigma -1.948202 0.442903 -4.3987 1.089e-05 ***
brandweight_sigma 4.855480 0.671721 7.2284 4.887e-13 ***
brandyoplait_sigma 4.571520 10.144256 0.4507 0.652241
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Log-Likelihood: -1217.7184020
Null Log-Likelihood: -3416.9436545
AIC: 2455.4368040
BIC: 2513.3189000
McFadden R2: 0.6436235
Adj McFadden R2: 0.6406969
Number of Observations: 2412.0000000
Number of Clusters 100.0000000
* STATA
Mixed logit model Number of obs = 9,648
Wald chi2(5) = 371.13
Log likelihood = -1252.5251 Prob > chi2 = 0.0000
(Std. err. adjusted for 100 clusters in id)
------------------------------------------------------------------------------
| Robust
choice | Coefficient std. err. z P>|z| [95% conf. interval]
-------------+----------------------------------------------------------------
Mean |
feat | 1.060867 .2265291 4.68 0.000 .6168781 1.504856
brandum2 | -4.997659 .5039539 -9.92 0.000 -5.985391 -4.009928
brandum3 | -1.169501 .233826 -5.00 0.000 -1.627791 -.7112102
brandum4 | -.0163061 .4169114 -0.04 0.969 -.8334375 .8008253
mprice | -.8676371 .1859296 -4.67 0.000 -1.232053 -.5032217
-------------+----------------------------------------------------------------
SD |
feat | 1.149806 .2542867 4.52 0.000 .6514134 1.648199
brandum2 | -2.65721 .5290221 -5.02 0.000 -3.694075 -1.620346
brandum3 | 8.547343 1.065777 8.02 0.000 6.458458 10.63623
brandum4 | 3.25122 .5163507 6.30 0.000 2.239191 4.263248
mprice | .5854347 .1059269 5.53 0.000 .3778218 .7930476
------------------------------------------------------------------------------
I think the differences we're seeing here are due to reaching different solutions, which can result from a few different factors, the starting points being one. Another is the number of draws used in the simulation. By default logitr uses just 50 Halton draws (partly for speed). But even if I increase the draws to 1000, I still get a similar solution:
Model Coefficients:
Estimate Std. Error z-value Pr(>|z|)
mprice_mu 0.1073197 0.0044918 23.8926 < 2.2e-16 ***
feat_mu 1.0189873 0.2656793 3.8354 0.0001254 ***
brandhiland_mu -5.9172651 0.6970628 -8.4889 < 2.2e-16 ***
brandweight_mu -2.5357505 0.1733728 -14.6260 < 2.2e-16 ***
brandyoplait_mu 0.5006838 0.1952373 2.5645 0.0103328 *
mprice_sigma -0.0445569 0.0038197 -11.6650 < 2.2e-16 ***
feat_sigma -1.1259761 0.3518291 -3.2003 0.0013726 **
brandhiland_sigma -3.1527737 0.4953801 -6.3644 1.961e-10 ***
brandweight_sigma 6.0014793 0.4825431 12.4372 < 2.2e-16 ***
brandyoplait_sigma -4.2985297 0.9001629 -4.7753 1.795e-06 ***
Log-Likelihood: -1225.2775566
At this point, I can't explain the differences, but it doesn't look like the result of an error somewhere (e.g., unreasonably huge standard errors). I think it's just the algorithms reaching different local minima.
Perhaps it's worth trying to estimate the same models using the mlogit and/or gmnl packages for comparison? There's also apollo. I primarily wrote logitr to be fast and to support WTP space models, but there are certainly other packages that can be used for validation.
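As an aside, the Halton draws mentioned above are a deterministic low-discrepancy sequence rather than pseudo-random numbers, which is why results are reproducible for a fixed number of draws. A minimal generator (a generic sketch in Python, not logitr's implementation):

```python
def halton(i, base=2):
    """Return the i-th element (i >= 1) of the Halton sequence in the given base."""
    f, r = 1.0, 0.0
    while i > 0:
        f /= base
        r += f * (i % base)
        i //= base
    return r

# The first few base-2 values fill (0, 1) evenly: 1/2, 1/4, 3/4, 1/8, ...
seq = [halton(i) for i in range(1, 5)]
print(seq)  # [0.5, 0.25, 0.75, 0.125]
```

In mixed logit estimation these points are typically pushed through the inverse normal CDF to simulate draws from the mixing distribution.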
Dear @jhelvy, thanks a lot for all your clarifications and helpfulness throughout this process!
Your package is great, fast, and accessible! I hope it will soon allow for correlation across parameters in the mixed logit!
Correlation is in the works! I appreciate the feedback. Hearing from users is the best way to improve / fix issues with the package.
@Ales-G just wanted to let you know that I indeed did have a bug in the previous examples here where the negative of price follows a log-normal distribution. It once again had to do with re-scaling the parameters post-estimation (the data are internally scaled prior to estimation to make it easier to converge). I'm still working to finish up final edits to a new branch, which will eventually become v0.6.0, but you can install it here:
devtools::install_github("jhelvy/logitr@correlation")
If you don't mind, it'd be helpful to see if you're getting similar results with Stata now using log-normal parameters. Note as I mentioned before that there is a good chance that logitr may reach a "better" solution in terms of the log-likelihood compared to Stata as it depends on the starting parameters. Using a multistart with logitr will usually find a better log-likelihood. So checking the log-likelihood value at convergence is helpful for making sure they indeed reached the same solutions. But even with slightly different solutions, the parameters themselves should be relatively similar in terms of the sign and magnitude.
This should now be resolved as of v0.7.0.
Hello, first of all, thank you very much for developing this package; it is really amazing and of great help.
I have a question regarding the clustering of standard errors. I am comparing the results of a multinomial logit model in preference space with those I get in Stata, and something appears to go wrong once I cluster standard errors. If I run a basic MNL model on the yogurt data and cluster standard errors by respondent, the standard errors appear to explode (all p-values > .9).
However, if I run the same model in Stata using the `clogit` command, the effect of clustering standard errors is less extreme. Note that the coefficients are the same, and that when I do not cluster standard errors, they are almost identical regardless of whether I use logitr or clogit.
Is there something going on with the clustering, or am I misspecifying something?
Thanks a lot in advance for clarifying this!
Your work is truly helpful!