Add pseudo-class technique as an alternative to BCH for mixture models

Gootjes commented 8 months ago

Description of the technique quoted from this paper:

"A method developed by Wang, Bandeen-Roche, and Hendricks Brown (Citation2005), first estimates the latent class model, and then the latent class variable is handled through MI using the posterior distribution obtained by the model. This process is followed by analyzing the imputed class variables together with the covariates using the MI technique developed by Rubin (Citation1987; see also Asparouhov & Muthén, Citation2007)."

References:

Paper recommending, among others, the pseudo-class technique. The GRoLTS-Checklist: Guidelines for Reporting on Latent Trajectory Studies. https://doi.org/10.1080/10705511.2016.1247646
Paper describing the pseudo-class technique. Residual Diagnostics for Growth Mixture Models. https://doi.org/10.1198/016214505000000501
Paper describing pooling results. Applied missing data analysis with SPSS and (R) Studio. Chapter Rubin's Rules. https://bookdown.org/mwheymans/bookmi/rubins-rules.html
Source for the magic number of 20 draws from the posterior distribution. Wald Test of Mean Equality for Potential Latent Class Predictors in Mixture Modeling. https://www.statmodel.com/download/MeanTest1.pdf

cjvanlissa commented 8 months ago

@Gootjes this looks very good already; my only limitation now is that I don't have time to get into the code until after March 26th! Few things I do see at first glance are:

1) I'll need to check whether the user-facing interface is consistent with other functions of the package; e.g., is it similar to BCH(), is it amenable to the tidy* style syntax 2) Not sure whether importing rlang::eval_tidy is necessary 3) Not sure whether adding a dependency on mice is desireable; is it possible to either implement Rubin's pooling rules directly, or to make this a suggest? 4) In addition to the documentation, we should add an example in one of the package vignettes.

Gootjes commented 8 months ago

I took BCH's arguments as the starting point. Instead of BCH's (x, model, data, ...), I use those in the same order but named them (fit, analysis, x). I am fine with renaming these of course. Since broom::tidy() and broom::glance() is used by mice::pool, but those two don't have an implementation for OpenMx objects nor lavaan objects, supporting tidy_sem models would require implementing pooling myself.
Removed it
Added it to suggests. Given the situation of 1, might need to implement it directly,
Would be nice indeed!

maugavilla commented 8 months ago

Hi

This is an interesting approach, similar to "plausible values" in SEM. But a disadvantage in those moethods is that is not clear if they retain the non linear relations and other characterictics. Is something we could test further

About the code, I dont have the time to look at it right now. But could also use the general runction mi.meld from Amelia, which takes the parameters estimates and se from each MI analysis and combine them. This way its independent of MI and analysis methods

Gootjes commented 8 months ago

Hi

This is an interesting approach, similar to "plausible values" in SEM. But a disadvantage in those moethods is that is not clear if they retain the non linear relations and other characterictics. Is something we could test further

If you know about a dataset and hypothesis to test this, let me know!

About the code, I dont have the time to look at it right now. But could also use the general runction mi.meld from Amelia, which takes the parameters estimates and se from each MI analysis and combine them. This way its independent of MI and analysis methods

I guess mice and Amelia have very similar implementations. I like mice because it gives diagnostics about the imputation performance too, but I guess this is less relevant in this special imputation scenario.

I just pushed a pool function for MxModel and lavaan, which is largely based on mice's function. How do I credit them correctly?

cjvanlissa commented 8 months ago

Thank you for your excellent work on this, Frank!

I just pushed a pool function for MxModel and lavaan, which is largely based on mice's function. How do I credit them correctly?

If you're literally copy-pasting code from mice, as per CRAN policies, the mice authors need to be credited as contributors. Since they have dozens (hundreds?) of authors, I feel like that would wash out the smaller team working on tidySEM, including yourself. Maybe keep that in mind when weighting the different possible approaches. You can also include mice as a depends or suggests, although that does increase the number of dependencies of tidySEM, which is not great but also not the end of the world. Furthermore, if it helps, you can import internal functions from packages using getFromNamespace(). I've used this in the past to write a wrapper around existing functions from other packages, rather than copy-pasting their code and thus incurring co-authors.

cjvanlissa commented 8 months ago

@Gootjes one more question: Do you know how to write integration tests? We require them to ensure that the package functions as expected and remains functioning after future changes are implemented in our package or dependencies. They live in the tests/testthat directory. If you don't know how to do this, then just point me to a reproducible example of the pseudo-class method, and I'll write the tests.

Gootjes commented 8 months ago

@Gootjes one more question: Do you know how to write integration tests? We require them to ensure that the package functions as expected and remains functioning after future changes are implemented in our package or dependencies. They live in the tests/testthat directory. If you don't know how to do this, then just point me to a reproducible example of the pseudo-class method, and I'll write the tests.

Done!

I would love to include a test that compares the results according to known results, but can't find any at the moment (with open data).

cjvanlissa commented 8 months ago

@Gootjes I'm reviewing your code now and making a few interface changes to align it more closely with the tidySEM philosophy; I see you're also working on it now - could you hang on a minute and then work from my commit instead?

Gootjes commented 8 months ago

I also altered the interface slightly, there are no commits from me after https://github.com/cjvanlissa/tidySEM/pull/85/commits/af70cd60113d854739e1ff6b9c47b2e8cd860715

Gootjes commented 8 months ago

I just noticed the errors in the CI, I wrote the fixes here: https://github.com/cjvanlissa/tidySEM/compare/master...Gootjes:tidySEM:pseudo-class-analysis?expand=1

Didn't expect you to merge!

@cjvanlissa

cjvanlissa / tidySEM

Add pseudo-class technique as an alternative to BCH for mixture models #85