Labo-Lacourse / stepmix

A Python package following the scikit-learn API for model-based clustering and generalized mixture modeling (latent class/profile analysis) of continuous and categorical data. StepMix handles missing values through Full Information Maximum Likelihood (FIML) and provides multiple stepwise Expectation-Maximization (EM) estimation methods.
https://stepmix.readthedocs.io/en/latest/index.html
MIT License

Fix dimension mismatch in Multinomial model #18

Closed MostafaAbdelrashied closed 1 year ago

MostafaAbdelrashied commented 1 year ago

Description: This pull request fixes a dimension mismatch bug in the Multinomial class. The bug occurs when the model is fitted on one dataset and then asked to predict on another dataset with different dimensions. Instead of returning predictions for the new dataset, the model raises a ValueError.

Fixes #17

To fix the bug, I made the following changes:

To validate the fix, I called fit on a training dataset and predict on a separate testing dataset with the modified model class, and confirmed that the model now works as expected.

Please let me know if there are any further changes I can make to improve the fix.

sachaMorin commented 1 year ago

Good catch. The aim of the one-hot cache was to save computations during EM. It's not essential, however, and deactivating it is the way to go for now.
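
For readers following the thread, here is a minimal sketch of how a one-hot cache of this kind can go stale. It is not StepMix code: `ToyMultinomial`, `one_hot`, `use_cache`, and the array shapes are all hypothetical names and assumptions chosen for illustration. The idea is that a cache filled on the first call keeps returning the training-set encoding, so any later call on data with a different number of rows works with arrays of the wrong shape.

```python
import numpy as np


def one_hot(X, n_categories):
    """Encode integer codes (n_samples, n_features) as (n_samples, n_features, n_categories)."""
    return np.eye(n_categories)[X]


class ToyMultinomial:
    """Hypothetical stand-in for a categorical emission model (not the StepMix class)."""

    def __init__(self, log_probs, use_cache):
        self.log_probs = log_probs  # shape (n_components, n_features, n_categories)
        self.use_cache = use_cache
        self._cached = None

    def _encode(self, X):
        n_categories = self.log_probs.shape[2]
        if not self.use_cache:
            return one_hot(X, n_categories)
        if self._cached is None:
            # The first input seen is cached; later inputs are silently ignored.
            self._cached = one_hot(X, n_categories)
        return self._cached

    def log_likelihood(self, X):
        # For each sample and component, sum log-probabilities of the observed categories.
        return np.einsum("nfc,kfc->nk", self._encode(X), self.log_probs)


rng = np.random.default_rng(0)
# Assumed toy setup: 2 components, 4 features, 3 categories per feature.
log_probs = np.log(rng.dirichlet(np.ones(3), size=(2, 4)))
X_train = rng.integers(0, 3, size=(10, 4))
X_test = rng.integers(0, 3, size=(6, 4))

cached = ToyMultinomial(log_probs, use_cache=True)
cached.log_likelihood(X_train)                     # fills the cache with 10 rows
stale_shape = cached.log_likelihood(X_test).shape  # still 10 rows, not 6

uncached = ToyMultinomial(log_probs, use_cache=False)
uncached.log_likelihood(X_train)
fresh_shape = uncached.log_likelihood(X_test).shape  # 6 rows, as expected
```

In this toy version the stale cache only yields the wrong number of rows. In a fuller pipeline, combining that result with arrays sized for the new data is where a shape error such as a ValueError would be expected to surface. Turning the cache off trades some repeated encoding work during EM for results that always match the input.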