Labo-Lacourse / stepmix

A Python package following the scikit-learn API for model-based clustering and generalized mixture modeling (latent class/profile analysis) of continuous and categorical data. StepMix handles missing values through Full Information Maximum Likelihood (FIML) and provides multiple stepwise Expectation-Maximization (EM) estimation methods.
https://stepmix.readthedocs.io/en/latest/index.html
MIT License

Predicted Probabilities and Marginal Effects from structural models? #67

Open huentelb opened 1 month ago

huentelb commented 1 month ago

Hi all,

thanks for providing this great package and making the application of bias-adjusted LCA/LPA available as open source!

I was wondering whether it is possible to calculate predicted probabilities or (average) marginal effects from the structural model, and whether this can also be done with the bootstrapped models, so that there is information about the uncertainty of the estimates. For background, I would like to obtain predicted probabilities for each structural variable in each class, as can easily be obtained with 'margins' or 'marginaleffects' in R or Stata, for instance, after a multinomial logistic regression.

I appreciate any support!

Thanks Bettina

FelixLaliberte commented 1 month ago

Hello Bettina,

It depends on the distribution of your external variables. For models with binary or categorical distal outcomes, the structural parameters are already predicted probabilities (pis). These probabilities can be bootstrapped.
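To illustrate what "bootstrapping the pis" gives you, here is a minimal sketch of a percentile confidence interval. It assumes you already have the bootstrapped probabilities as an array with one row per bootstrap repetition; the `boot_pis` array below is simulated stand-in data and `percentile_ci` is a hypothetical helper, not part of StepMix.

```python
import numpy as np

def percentile_ci(boot_samples, level=0.95):
    """Percentile bootstrap CI along axis 0 (one row per bootstrap repetition)."""
    boot_samples = np.asarray(boot_samples, dtype=float)
    alpha = 1.0 - level
    lower = np.percentile(boot_samples, 100 * alpha / 2, axis=0)
    upper = np.percentile(boot_samples, 100 * (1 - alpha / 2), axis=0)
    return boot_samples.mean(axis=0), lower, upper

# Simulated stand-in for bootstrapped pis (NOT real StepMix output):
# 500 repetitions of one binary outcome's probability in each of 3 classes.
rng = np.random.default_rng(0)
boot_pis = rng.beta(a=[20, 50, 80], b=[80, 50, 20], size=(500, 3))

mean, lower, upper = percentile_ci(boot_pis)
for k in range(boot_pis.shape[1]):
    print(f"class {k}: pi = {mean[k]:.3f}, 95% CI [{lower[k]:.3f}, {upper[k]:.3f}]")
```

The percentile interval is only one of several bootstrap interval types; it is used here because it needs nothing beyond the bootstrapped values themselves.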

Here is a short tutorial showing how to bootstrap structural parameters, obtain multinomial regression coefficients (for models with covariates), and derive p-values from bootstrapped parameters: https://colab.research.google.com/drive/1oaofJ68eHjahSPNty75npBzws3JuAjPO?usp=sharing

Felix

huentelb commented 1 month ago

Hi Felix,

thanks for your fast response! And amazing, I didn’t understand from the documentation that the pis are predicted probabilities. Bootstrapping etc. worked fine already.

Another quick question (I hope it’s OK to just continue this issue like this): do you know if a positive log-likelihood (LL) is something to worry about? From what I read in various forums, it seems to be fine, but I’m still not quite sure.

Thanks again and best Bettina

huentelb commented 1 month ago

Hi Felix,

another question regarding the pis of categorical variables: say 'Cov' has four categories coded as binary indicators ('Cov1', 'Cov2', 'Cov3', 'Cov4') in the structural model. When I want to calculate 'classic logits', I leave out the reference category (e.g. Cov1) and calculate the pis for the remaining ones (Cov2-Cov4), as documented in your examples. For predicted probabilities, I assume I would have to include binary variables for all categories of Cov (Cov1-Cov4), correct?

Many thanks Bettina

FelixLaliberte commented 1 month ago

Hello,

For your first question, yes, the LL may be positive.
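One common reason, sketched here under the assumption that the model includes continuous variables: the likelihood is then built from densities rather than probabilities, and a density can exceed 1 when the variance is small, which makes its log positive. The `gaussian_loglik` helper and the data are purely illustrative and independent of StepMix.

```python
import numpy as np

def gaussian_loglik(x, mu, sigma):
    """Sum of Gaussian log-densities for the observations in x."""
    x = np.asarray(x, dtype=float)
    log_norm = -0.5 * np.log(2 * np.pi * sigma**2)
    return np.sum(log_norm - (x - mu) ** 2 / (2 * sigma**2))

# Three observations tightly clustered around zero (illustrative values).
x = np.array([0.0, 0.05, -0.05])

ll_narrow = gaussian_loglik(x, mu=0.0, sigma=0.1)  # density > 1 near the mean
ll_wide = gaussian_loglik(x, mu=0.0, sigma=1.0)    # density < 1 everywhere

print(f"LL with sigma=0.1: {ll_narrow:.3f}")
print(f"LL with sigma=1.0: {ll_wide:.3f}")
```

With purely categorical variables every term is the log of a probability, so the LL cannot exceed zero. In either case, only differences in LL between models fitted to the same data are meaningful, not the sign.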

For your second question: in the tutorial, we show how to obtain normalized regression coefficients (BNs) for models with covariates. StepMix does not directly provide predicted probabilities, but they can be calculated manually. Perhaps an existing package could also be used to obtain them?
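As a rough sketch of the manual calculation, assuming a standard multinomial logit parameterization in which the reference class has all coefficients fixed at zero: predicted class probabilities are the softmax of the linear predictor. The coefficient values, the class ordering, and the dummy coding of 'Cov' (Cov1 as reference category) below are all made up for illustration and are not StepMix output.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical coefficients: rows = [intercept, Cov2, Cov3, Cov4],
# columns = latent classes. Class 0 is the reference class (all zeros).
coefs = np.array([
    [0.0,  0.4, -0.2],
    [0.0,  0.8,  0.1],
    [0.0, -0.5,  0.6],
    [0.0,  0.3,  1.2],
])

# One covariate profile per category of Cov; Cov1 is the reference
# category, so it is represented by all dummies being zero.
profiles = np.array([
    [1, 0, 0, 0],  # Cov1
    [1, 1, 0, 0],  # Cov2
    [1, 0, 1, 0],  # Cov3
    [1, 0, 0, 1],  # Cov4
], dtype=float)

# Predicted class membership probabilities, shape (4 categories, 3 classes).
probs = softmax(profiles @ coefs)

# Discrete-change effects: difference in predicted probability
# for Cov2-Cov4 relative to Cov1.
effects = probs[1:] - probs[0]

print(np.round(probs, 3))
print(np.round(effects, 3))
```

Note that the reference category still gets a predicted probability (the intercept-only row), so the dummy for Cov1 does not need to be added to the model. Repeating the same calculation on each set of bootstrapped coefficients would give a distribution of predicted probabilities from which intervals can be read off.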