python-qds / qdscreen

Quasi-determinism screening for fast Bayesian Network Structure Learning (from T.Rahier's PhD thesis, 2018)
https://python-qds.github.io/qdscreen/
BSD 3-Clause "New" or "Revised" License

IndexingError: Unalignable boolean Series provided as indexer (index of the boolean Series and of the indexed object do not match). #37

Closed smarie closed 1 year ago

smarie commented 1 year ago

Linked with #36

This bug arises when some columns in the dataframe are not categorical and are therefore removed by the model. If the same columns are provided again later, for example to fit_selector_model, the error is raised:

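import pandas as pd
from qdscreen import qd_screen

# "nb" is numeric (not categorical); "name" is categorical
df = pd.DataFrame({
    "nb": [1, 2],
    "name": ["A", "B"]
})
qd_forest = qd_screen(df, categorical_mode="convert")
# Passing the same frame (still containing "nb") back in triggers the IndexingError
feat_selector = qd_forest.fit_selector_model(df)
only_important_features_df = feat_selector.remove_qd(df)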

It would be a good idea to protect our methods against invalid inputs (unexpected column names or data).
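
As a rough illustration of what such a guard could look like, here is a minimal sketch. The helper name `check_columns` and its error message are hypothetical and not part of qdscreen; it only compares the column names seen at fit time with those received later, and fails early with an explicit message instead of a pandas IndexingError.

```python
def check_columns(expected, actual):
    """Hypothetical helper (not part of qdscreen): raise a ValueError
    if the `actual` column names differ from the `expected` ones."""
    expected, actual = list(expected), list(actual)
    # Columns the model knows about but that the input lacks
    missing = [c for c in expected if c not in actual]
    # Columns present in the input but unknown to the model
    unexpected = [c for c in actual if c not in expected]
    if missing or unexpected:
        raise ValueError(
            f"Invalid input columns: missing={missing}, unexpected={unexpected}"
        )
```

Under these assumptions, a method such as fit_selector_model could call `check_columns(kept_columns, df.columns)` first, where `kept_columns` stands for whatever list of column names the model retained. Whether to reject unexpected columns or silently drop them is a design choice left open here.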