ncullen93 / pyBN

Bayesian Networks in Python
MIT License

Chow Liu Tree code wrong #21

Open eelxpeng opened 6 years ago

eelxpeng commented 6 years ago

In pyBN.learning.structure.tree.chow_liu:


for i,j,w in edge_list:
    if i in vertex_cache and j not in vertex_cache:
        mst[i].append(j)
        vertex_cache.add(j)
    elif i not in vertex_cache and j in vertex_cache:
        mst[j].append(i)
        vertex_cache.add(i)

This code is wrong. If an edge i->j is visited while neither i nor j is in vertex_cache, it is skipped and never considered again. Even if one endpoint, say i, is added to vertex_cache later, at which point i->j would be a safe edge, the loop has already moved past it, so the result is a worse spanning tree.
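A small reproduction may make this concrete. This is only a sketch: the wrapper function `single_pass_tree`, the starting set `{root}`, and the assumption that `edge_list` is already sorted by descending weight are mine, not copied from the pyBN source. Only the loop body is taken from the snippet above.

```python
def single_pass_tree(edge_list, n_nodes, root=0):
    """Hypothetical wrapper around the loop quoted in this issue."""
    vertex_cache = {root}                      # assumed starting state
    mst = {v: [] for v in range(n_nodes)}
    for i, j, w in edge_list:                  # single pass, no revisiting
        if i in vertex_cache and j not in vertex_cache:
            mst[i].append(j)
            vertex_cache.add(j)
        elif i not in vertex_cache and j in vertex_cache:
            mst[j].append(i)
            vertex_cache.add(i)
    return mst


# Edges as (i, j, weight), assumed sorted by descending weight.
demo_edges = [(1, 2, 0.9), (0, 1, 0.5), (0, 2, 0.1)]
demo_result = single_pass_tree(demo_edges, n_nodes=3)
print(demo_result)  # {0: [1, 2], 1: [], 2: []}
```

The heaviest edge (1, 2, 0.9) is seen first, when neither endpoint is in `vertex_cache`, so it is dropped. The tree ends up with edges 0-1 and 0-2 (total weight 0.6), while the maximum spanning tree is 1-2 plus 0-1 (total weight 1.4).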

If you are implementing Kruskal's algorithm, then you should track the connected components of the current forest and include an edge only when FIND-SET(i) != FIND-SET(j).
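A minimal sketch of the Kruskal variant, assuming edges are `(i, j, weight)` tuples over nodes `0..n_nodes-1`. The function name `kruskal_max_spanning_tree` and the inline union-find are illustrative, not part of pyBN.

```python
def kruskal_max_spanning_tree(edge_list, n_nodes):
    """Return the edges of a maximum-weight spanning forest (sketch)."""
    parent = list(range(n_nodes))   # union-find: each node starts as its own set

    def find(x):
        # FIND-SET with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    # Heaviest edges first, since Chow-Liu wants a *maximum* spanning tree.
    for i, j, w in sorted(edge_list, key=lambda e: e[2], reverse=True):
        ri, rj = find(i), find(j)
        if ri != rj:                # endpoints lie in different components
            parent[ri] = rj         # UNION
            tree.append((i, j, w))
    return tree


kruskal_edges = [(1, 2, 0.9), (0, 1, 0.5), (0, 2, 0.1)]
kruskal_tree = kruskal_max_spanning_tree(kruskal_edges, n_nodes=3)
print(kruskal_tree)  # [(1, 2, 0.9), (0, 1, 0.5)]
```

Note that this returns undirected edges. To get the parent-to-child dict that `mst` holds in the snippet above, the tree would still need to be oriented away from a chosen root.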

If you are implementing Prim's algorithm, then you should probably use a while loop and, inside it, a priority queue (max-heap) to pick the maximum-weight edge that connects a new node to the MST.
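A minimal sketch of the Prim variant, under the same assumptions about the edge format. The name `prim_max_spanning_tree` and the `root` parameter are illustrative, not pyBN API. Python's `heapq` is a min-heap, so weights are negated to get max-heap behaviour.

```python
import heapq


def prim_max_spanning_tree(edge_list, n_nodes, root=0):
    """Return {parent: [children]} for a maximum spanning tree (sketch)."""
    # Build an undirected adjacency list from the (i, j, weight) tuples.
    adj = {v: [] for v in range(n_nodes)}
    for i, j, w in edge_list:
        adj[i].append((w, j))
        adj[j].append((w, i))

    visited = {root}
    mst = {v: [] for v in range(n_nodes)}
    # Heap entries are (-weight, tree_node, outside_node).
    heap = [(-w, root, v) for w, v in adj[root]]
    heapq.heapify(heap)

    while heap and len(visited) < n_nodes:
        _, u, v = heapq.heappop(heap)   # heaviest edge leaving the tree
        if v in visited:
            continue                    # stale entry: v was added meanwhile
        visited.add(v)
        mst[u].append(v)
        for w, nxt in adj[v]:
            if nxt not in visited:
                heapq.heappush(heap, (-w, v, nxt))
    return mst


prim_edges = [(1, 2, 0.9), (0, 1, 0.5), (0, 2, 0.1)]
prim_tree = prim_max_spanning_tree(prim_edges, n_nodes=3)
print(prim_tree)  # {0: [1], 1: [2], 2: []}
```

This keeps the same output shape as `mst` in the original loop, which is why it may be the easier drop-in of the two options.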