Labo-Lacourse / stepmix

A Python package following the scikit-learn API for model-based clustering and generalized mixture modeling (latent class/profile analysis) of continuous and categorical data. StepMix handles missing values through Full Information Maximum Likelihood (FIML) and provides multiple stepwise Expectation-Maximization (EM) estimation methods.
https://stepmix.readthedocs.io/en/latest/index.html
MIT License

Optimization of categorical code #38

Open sachaMorin opened 1 year ago

sachaMorin commented 1 year ago

The current categorical model one-hot encodes the integer data on every call to the E or M step in the optimisation loop. Since the input data does not change between iterations, we could probably obtain a meaningful speedup by computing the one-hot encoding once and caching it.
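A rough sketch of what the caching could look like, using plain NumPy. The `one_hot` function and the `CachedOneHot` helper are hypothetical names made up for illustration and are not part of the StepMix code base; the sketch also assumes every feature shares the same number of categories and that the same array object is passed in at every iteration.

```python
import numpy as np


def one_hot(X, n_categories):
    """One-hot encode an (n_samples, n_features) integer array.

    Returns an array of shape (n_samples, n_features * n_categories).
    """
    X = np.asarray(X, dtype=int)
    n_samples, n_features = X.shape
    encoded = np.zeros((n_samples, n_features, n_categories))
    rows = np.arange(n_samples)[:, None]
    cols = np.arange(n_features)[None, :]
    # Set a 1 at the category index of every (sample, feature) pair.
    encoded[rows, cols, X] = 1.0
    return encoded.reshape(n_samples, n_features * n_categories)


class CachedOneHot:
    """Hypothetical helper: re-encode only when a different array is passed."""

    def __init__(self, n_categories):
        self.n_categories = n_categories
        self._source = None
        self._encoded = None
        self.n_encodings = 0  # number of real encoding passes, for illustration

    def __call__(self, X):
        # Identity check: cheap, but it will not notice in-place edits to X.
        if self._source is not X:
            self._encoded = one_hot(X, self.n_categories)
            self._source = X
            self.n_encodings += 1
        return self._encoded


# Toy loop standing in for EM: both steps ask for the encoding at each iteration.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(100, 4))
encoder = CachedOneHot(n_categories=3)
for _ in range(10):
    X_e = encoder(X)  # would be used by the E step
    X_m = encoder(X)  # would be used by the M step
```

With this, the encoding is computed a single time instead of twenty. The identity check keeps the lookup cheap, at the cost of not detecting in-place modifications of the input; if that matters, the cache would need to be cleared explicitly at the start of each fit.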