for i, j, w in edge_list:
    if i in vertex_cache and j not in vertex_cache:
        mst[i].append(j)
        vertex_cache.add(j)
    elif i not in vertex_cache and j in vertex_cache:
        mst[j].append(i)
        vertex_cache.add(i)
This code is wrong. If an edge (i, j) is visited while neither i nor j is in vertex_cache, it is skipped and never considered again. Even if one endpoint, say i, is added to vertex_cache later, at which point (i, j) would be a safe edge, the loop does not revisit it. The result can be a worse spanning tree.
If you are implementing Kruskal's algorithm, you should track the separate components of the current forest and include an edge whenever FIND-SET(i) != FIND-SET(j).
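A minimal sketch of the Kruskal variant, assuming edge_list holds undirected (i, j, w) tuples and that a maximum-weight tree is wanted. The function name max_spanning_tree_kruskal and the find_set helper are made up for illustration and are not part of pyBN. It returns the chosen edges instead of filling the mst dict.

```python
def max_spanning_tree_kruskal(edge_list):
    """Return the edges of a maximum spanning forest (hypothetical helper)."""
    parent = {}  # union-find forest: vertex -> parent vertex

    def find_set(x):
        # Register unseen vertices as their own singleton component.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    chosen = []
    # Heaviest edges first; sorted() is stable, so ties keep input order.
    for i, j, w in sorted(edge_list, key=lambda e: e[2], reverse=True):
        root_i, root_j = find_set(i), find_set(j)
        if root_i != root_j:
            # The edge joins two different components, so it is safe to add.
            parent[root_i] = root_j
            chosen.append((i, j, w))
    return chosen


edges = [(0, 1, 5), (2, 3, 4), (1, 2, 3), (0, 3, 1), (0, 2, 2)]
tree = max_spanning_tree_kruskal(edges)
```

On this small graph the edge (2, 3, 4) is accepted even though neither endpoint has been connected to vertex 0 yet, which is exactly the case the single-pass loop drops.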
If you are implementing Prim's algorithm, you should probably use a while loop. Inside it, use a priority queue (max-heap) to repeatedly pick the maximum-weight edge leaving the current tree and add its new vertex to the mst.
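A rough sketch of the Prim variant, under the same assumptions about edge_list. Python's heapq is a min-heap, so weights are negated to get max-heap behaviour. The function name max_spanning_tree_prim and the root argument are hypothetical, and vertices are assumed to be comparable (for example ints) so that heap ties can be broken.

```python
import heapq
from collections import defaultdict


def max_spanning_tree_prim(edge_list, root):
    """Return {parent: [children]} for a maximum spanning tree grown from root."""
    # Build an undirected adjacency list from the (i, j, w) tuples.
    adj = defaultdict(list)
    for i, j, w in edge_list:
        adj[i].append((w, j))
        adj[j].append((w, i))

    mst = defaultdict(list)
    visited = {root}
    # Heap entries are (-weight, tree_vertex, outside_vertex).
    heap = [(-w, root, v) for w, v in adj[root]]
    heapq.heapify(heap)

    while heap:
        _, u, v = heapq.heappop(heap)
        if v in visited:
            continue  # stale entry: v joined the tree through a heavier edge
        visited.add(v)
        mst[u].append(v)
        # Edges from the newly added vertex become candidates.
        for w, x in adj[v]:
            if x not in visited:
                heapq.heappush(heap, (-w, v, x))
    return dict(mst)


edges = [(0, 1, 5), (2, 3, 4), (1, 2, 3), (0, 3, 1), (0, 2, 2)]
tree = max_spanning_tree_prim(edges, root=0)
```

Because candidate edges stay in the heap until both endpoints are in the tree, an edge that was unusable early on is still available once one of its endpoints is reached.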
The code is in pyBN.learning.structure.tree.chow_liu.