data-edu / tidyLPA

Easily carry out Latent Profile Analysis (LPA) using open-source or commercial software
https://data-edu.github.io/tidyLPA/

discrepancy between calc_lrt and Mplus LRT test #187

Open albertostefanelli opened 2 years ago

albertostefanelli commented 2 years ago

As per title, I noticed a discrepancy between the Mplus and the calc_lrt() calculations of the adjusted likelihood ratio test (LRT), as in Formula 15 of Lo, Mendell, & Rubin (2001).
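For reference, the adjustment in question (Formula 15's correction term, as implemented in the code further down this thread) is

$$\mathrm{LR}_{\mathrm{adj}} = \frac{-2\left(\ell_{k-1} - \ell_{k}\right)}{1 + \left[(p - q)\ln n\right]^{-1}},$$

where $\ell_{k-1}$ and $\ell_k$ are the log-likelihoods of the (k-1)- and k-class models, $p$ and $q$ their numbers of free parameters, and $n$ the sample size.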

This is a set of models fitted using Mplus. The calc_lrt p.value column is calculated with calc_lrt(); the T11_LMR_PValue column is the p-value extracted from the Mplus TECH11 output.

| Classes | N    | LL        | Parameters | calc_lrt p.value | T11_LMR_PValue |
|---------|------|-----------|------------|------------------|----------------|
| 1       | 1121 | -12702.19 | 16         | NA               | NA             |
| 2       | 1121 | -12034.91 | 20         | 0                | 0.0000         |
| 3       | 1121 | -11829.14 | 24         | 0                | 0.0106         |
| 4       | 1121 | -11682.76 | 28         | 0                | 0.0389         |
| 5       | 1121 | -11599.70 | 32         | 0                | 0.1551         |
| 6       | 1121 | -11542.82 | 36         | 0                | 0.0746         |
| 7       | 1121 | -10958.63 | 40         | 0                | 0.4800         |
| 8       | 1121 | -10921.04 | 44         | 0                | 0.4256         |
| 9       | 1121 | -10877.16 | 48         | 0                | 0.4591         |

calc_lrt() is used as follows, where i indexes the rows of the table reported above.

    summarymix[i + 1, "calc_lrt p.value"] <- round(
      tidyLPA::calc_lrt(summarymix[i, "N"],
                        summarymix[i, "LL"],
                        summarymix[i, "Parameters"],
                        summarymix[i, "Classes"],
                        summarymix[i + 1, "LL"],
                        summarymix[i + 1, "Parameters"],
                        summarymix[i + 1, "Classes"])["lmr_p"],
      2
    )

For instance, for the 4-class (H0) versus 5-class comparison, the call looks like this:

    calc_lrt(1121,
             -11682.76,
             28,
             4,
             -11599.70,
             32,
             5)["lmr_p"]

0.000000000000000000000000000000002930578 

    calc_lrt(1121,
             -11682.76,
             28,
             4,
             -11599.70,
             32,
             5)

LR = 166.120, LMR LR (df = 4) = 158.592, p < 0.001

The p-value returned by calc_lrt() is close to 0, while Mplus calculates it as 0.1551. The Mplus output is the following:

     VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 4 (H0) VERSUS 5 CLASSES

          H0 Loglikelihood Value                       -11682.756
          2 Times the Loglikelihood Difference            166.109
          Difference in the Number of Parameters                4
          Mean                                             11.541
          Standard Deviation                              146.702
          P-Value                                          0.1460

     LO-MENDELL-RUBIN ADJUSTED LRT TEST

          Value                                           160.398
          P-Value                                          0.1551

This shows a difference in both the LRT value and the p-value for the LMR LR.
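Incidentally, both Mplus p-values can be reproduced from the quantities it reports, if one treats the reported Mean and Standard Deviation as the location and scale of a normal reference distribution (an assumption, but the numbers line up):

    # Mplus p-values as normal tail probabilities of the reported values
    pnorm((166.109 - 11.541) / 146.702, lower.tail = FALSE)  # 0.1460 (VLMR)
    pnorm((160.398 - 11.541) / 146.702, lower.tail = FALSE)  # 0.1551 (adjusted LMR)

    # calc_lrt() instead refers the adjusted statistic to a chi-square
    # with df = 4, which is where the near-zero p-value comes from:
    pchisq(158.592, df = 4, lower.tail = FALSE)              # 2.93e-33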

cjvanlissa commented 2 years ago

@albertostefanelli I know, the results are discrepant - and I don't think calc_lrt() should be used; it was included at the adamant request of one user. But do you know what is different about calc_lrt() compared to Mplus? AFAIK, it just implements the method from the LMR paper.

cjvanlissa commented 2 years ago

For reference, this is the implementation:

    lmr <- function(ll_km1, par_km1, class_km1, ll_k, par_k, class_k, n){
      # Free-parameter counts implied by the class numbers alone
      # (3k - 1: e.g. k means, k variances, k - 1 mixing proportions)
      p <- (3 * class_k - 1)
      q <- (3 * class_km1 - 1)
      # Ordinary likelihood ratio statistic
      lr <- -2 * (ll_km1 - ll_k)
      # LMR (2001) adjustment: divide by 1 + ((p - q) * ln(n))^-1
      lr_adj <- lr / (1 + ((p - q) * log(n))^-1)
      # Refer the adjusted statistic to a chi-square with df equal to
      # the difference in estimated parameters between the two models
      df <- par_k - par_km1
      lmrt_p <- pchisq(lr_adj, df, lower.tail = FALSE)
      c(chi2 = lr_adj, df = df, p = lmrt_p)
    }
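Plugging the 4-class (H0) versus 5-class numbers from the table into this function reproduces the calc_lrt() output above. Note that here p - q = 3 * (5 - 4) = 3; swapping in Mplus's reported parameter difference of 4 reproduces its adjusted value instead (what Mplus actually uses for p - q is an assumption on my part):

    # 4-class (H0) vs 5-class comparison from the table above
    lmr(ll_km1 = -11682.76, par_km1 = 28, class_km1 = 4,
        ll_k   = -11599.70, par_k   = 32, class_k   = 5, n = 1121)
    # chi2 = 158.592, df = 4, p = 2.930578e-33 -- matches calc_lrt()

    # With p - q = 4 instead of 3, the same adjustment
    # yields the value Mplus reports:
    166.109 / (1 + (4 * log(1121))^-1)  # 160.398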
albertostefanelli commented 2 years ago

This is very odd. The formula seems correct, and Mplus clearly states that it implements the method from the 2001 paper.

In addition, I noticed that the LL of the H0 model changes substantially.

For instance, in the example above the best LL for the 6-class model with STARTS = 1000 250 is -11542.82. The best LL is replicated more than 30 times, so we are quite confident the model has reached a stable solution.

When fitting a 7-class model, the LL for k-1 with the exact same starts (K-1STARTS = 1000 250) is very different from the one estimated previously: the H0 value is now -11031.440. I report the full Mplus output below.


TECHNICAL 11 OUTPUT

     Random Starts Specifications for the k-1 Class Analysis Model
        Number of initial stage random starts                1000
        Number of final stage optimizations                   250

     VUONG-LO-MENDELL-RUBIN LIKELIHOOD RATIO TEST FOR 6 (H0) VERSUS 7 CLASSES

          H0 Loglikelihood Value                       -11031.440
          2 Times the Loglikelihood Difference            145.624
          Difference in the Number of Parameters                4
          Mean                                             79.929
          Standard Deviation                             1208.592
          P-Value                                          0.4783

     LO-MENDELL-RUBIN ADJUSTED LRT TEST

          Value                                           140.617
          P-Value                                          0.4800
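A quick arithmetic check confirms that the reported difference is computed against this internal H0 fit, not against the replicated 6-class solution from the table:

    # TECH11's "2 Times the Loglikelihood Difference" follows from the
    # internal k-1 LL, not from the table's replicated 6-class LL:
    -2 * (-11031.440 - (-10958.63))  # 145.62, matches 145.624 up to rounding
    -2 * (-11542.82 - (-10958.63))   # 1168.38, implied by the table's 6-class LL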

I think we should write to M&M (Muthén & Muthén) about both issues. @cjvanlissa, do you have a licence?

cjvanlissa commented 2 years ago

There are two possible explanations. One, I am still not perfectly confident that my implementation of LMR is correct. Two, Mplus often uses some "secret recipes" that they are not fully transparent about, so maybe they deviate from the paper. I am not strongly motivated to take the issue up with them.

albertostefanelli commented 2 years ago

I quickly checked your implementation of lmr() and it does not raise any flags for me. However, I am going to investigate it a bit further in the next couple of days.

Yeah, I thought of the "secret recipes" option as well. It might be that the k-1 model is estimated with a 'better' set of random starts (in Mplus, STSEED =) that are taken from the k model with the best LL. However, in my opinion, this would defeat the entire purpose of the LMR test.

cjvanlissa commented 2 years ago

Thank you for your perspective on this!