Thanks for looking carefully! I could not find any obvious code differences in the bnlearn code that could explain the difference. All the input parameters are the same, and there are also no differences in the example data. I did, however, see a difference deeper in the code of pgmpy: in hillclimbsearch.py, line 318, the edges are created differently for some reason.
I stripped the entire code down to these few lines that use pgmpy directly.
import bnlearn as bn
from pgmpy.estimators import BicScore
from pgmpy.estimators import HillClimbSearch

print(bn.__version__)

# Load the sprinkler example dataset that ships with bnlearn
df = bn.import_example('sprinkler')

# Hill-climbing structure search scored with BIC
scoring_method = BicScore(df)
model = HillClimbSearch(df)
best_model = model.estimate(scoring_method=scoring_method, tabu_length=100, epsilon=0.0001, max_iter=1000000, fixed_edges=set(), show_progress=False)
print(best_model.edges())
Thus I'm not sure why the differences occur; I can only see that they seem to happen deeper in the code of pgmpy.
It took a while to figure this out.
HillClimbSearch iteratively applies an operation to the DAG and updates the score. First, a set is created from potential_new_edges in _legal_operations(), and each operation is returned together with its score_delta. In the next step, the estimate() function compares the score_delta values to determine the best_operation.
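Below is a minimal sketch of the loop just described; it is not pgmpy's actual code, and legal_operations and apply_operation are placeholder callables. It only shows the shape of the search: gather the candidate operations with their score_delta, keep the best one, apply it, and repeat until no operation improves the score by more than epsilon.

def hill_climb(dag, legal_operations, apply_operation, epsilon=1e-4, max_iter=10**6):
    # Repeatedly pick the operation with the highest score improvement.
    for _ in range(max_iter):
        candidates = list(legal_operations(dag))   # (operation, score_delta) pairs
        if not candidates:
            break
        operation, score_delta = max(candidates, key=lambda item: item[1])
        if score_delta < epsilon:                  # no meaningful improvement left
            break
        dag = apply_operation(dag, operation)
    return dag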
Here comes the reason for this issue: (1) _legal_operations() uses a set, whose iteration order is not pre-defined and can change for each Python run (see this note: https://docs.python.org/3/reference/datamodel.html#object.__hash__), and (2) multiple edges exist with exactly the same (maximum) score_delta. Because of points 1 and 2 together, each time you run the code, different operations can be chosen to iteratively build the DAG, and thus with different outcomes.
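As a hypothetical illustration of points 1 and 2 (the edges and the score_delta value below are made up), when several operations share the maximum score_delta, the winner depends on the iteration order of the set they are stored in, and for sets of strings or tuples of strings that order can differ between Python processes because of hash randomization.

operations = {
    ('+', ('Cloudy', 'Sprinkler')),   # add edge Cloudy -> Sprinkler
    ('+', ('Sprinkler', 'Cloudy')),   # add edge Sprinkler -> Cloudy
}
score_deltas = {op: 7.42 for op in operations}   # identical (maximum) score_delta

best_operation = max(operations, key=lambda op: score_deltas[op])
print(best_operation)   # either edge can win; a new Python run may print the other one

Running this small script in separate Python processes (for example with different PYTHONHASHSEED values) can already flip the printed edge, which mirrors what can happen inside estimate().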
If the sprinkler example had contained a larger DAG (more nodes or states), HillClimbSearch might not have produced exactly the same maximum score_delta for different edges. However, this remains one of the two points that create the difference.
The bottom line is: this issue is not a bug. The search simply finds exactly the same scores for different edges, and the Python set that stores them does not guarantee an ordering. This also explains why different systems, notebooks, versions, or simply repeated runs can produce different results.
If there are any other questions regarding this issue, let me know.
Currently, when I run bnlearn.ipynb in Colab with bnlearn==0.5.1 and pgmpy==0.1.17 (https://colab.research.google.com/github/erdogant/bnlearn/blob/master/notebooks/bnlearn.ipynb), the "Sprinkler"-"Cloudy" edge and the "Cloudy"-"Rain" edge are reversed compared to the tutorial results.
Furthermore, I ran bnlearn.ipynb in Colab with bnlearn==0.4.11 and pgmpy==0.1.17 as well, and the results matched the tutorial results.
Why do the results change after upgrading bnlearn?