Closed by GabrielAzevedoFerreiraQB 3 years ago
Raised an issue on pgmpy as well
This appears to have been fixed in pgmpy=0.1.9
I'll investigate bumping the version and will update soon. For now, though, you may be able to upgrade your pgmpy version.
Just a note: should we update the requirement in causalnex to pgmpy=0.1.9?
I saw pgmpy>=0.1.12, <0.2.0 in requirements.txt now. Does this resolve the issue?
Yes, the actual result now matches the expected result. Please see my notebook below.
Description
If a node has only one parent (e.g. A->B), that node is always assigned a flat distribution when we fit the probabilities.
I dug in and found that the problem comes from pgmpy. I will raise the same issue there too, but I am not sure how we want to handle it in CausalNex in the meantime.
Steps to Reproduce
Expected Result
Actual Result
Your Environment
CausalNex version used (pip show causalnex):
Python version used (python -V): Python 3.6.10 :: Anaconda, Inc.
Operating system and version: macOS
pandas version: 0.24
CAUSE:
This comes from pgmpy, specifically the file pgmpy/estimators/base.py, around line 127.
If the node has more than one parent, the columns of state_count_data are a MultiIndex from the start, so state_count_data.reindex(..., columns=column_index) causes no problem.
If the node has a single parent, however, the columns of state_count_data are not a MultiIndex but a "normal" flat index. In that case, state_count_data.reindex(..., columns=column_index) returns a dataframe full of NAs.
Dirty solution: convert state_count_data.columns to a MultiIndex before reindexing.
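The workaround can be sketched in plain pandas, without pgmpy. This is only an illustration under assumptions: the toy data, the node names A and B, and the groupby/unstack step used to build state_count_data are made up for the example and are not copied from pgmpy; only the names state_count_data and column_index follow the description above.

```python
import pandas as pd

# Hypothetical toy data for a single-parent structure A -> B.
data = pd.DataFrame(
    {
        "A": ["a0", "a0", "a1", "a1", "a1"],
        "B": ["b0", "b1", "b0", "b0", "b1"],
    }
)
variable, parents = "B", ["A"]

# Assumed way of building the counts: rows are states of B,
# columns are states of the parent(s).
state_count_data = data.groupby([variable] + parents).size().unstack(parents)

# With a single parent the columns are a flat Index, not a MultiIndex.
print(type(state_count_data.columns).__name__)

# Target layout: a MultiIndex over the parent states, even for one parent.
row_index = ["b0", "b1"]
column_index = pd.MultiIndex.from_product([["a0", "a1"]], names=parents)

# Workaround from the issue: convert flat columns to a (one-level)
# MultiIndex before reindexing, so the labels line up with column_index.
fixed_columns = state_count_data.copy()
if not isinstance(fixed_columns.columns, pd.MultiIndex):
    fixed_columns.columns = pd.MultiIndex.from_arrays([fixed_columns.columns])

state_counts = fixed_columns.reindex(index=row_index, columns=column_index).fillna(0)
print(state_counts)
```

Guarding the conversion with isinstance keeps the multi-parent case untouched, since its columns are already a MultiIndex.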