py-why / causal-learn

Causal Discovery in Python. It also includes (conditional) independence tests and score functions.
https://causal-learn.readthedocs.io/en/latest/
MIT License

Passing required domain knowledge using add_required_by_node #116

Closed sirineBS closed 1 year ago

sirineBS commented 1 year ago

Hello, I'm new to causal discovery, and I'm not sure whether this is an issue in the causal-learn package or whether I'm using the add_required_by_node function incorrectly. I first built a graph from data alone (without any background knowledge); call it G1. I then added required edges based on domain expertise to create a new graph, G2. Expected result: G2 contains every directed edge defined by domain knowledge. Observed result: G2 contains a required directed edge only if that edge was already present in G1; otherwise the edge is not created.


Details below:

from causallearn.search.ConstraintBased.PC import pc

# `data` is assumed to be a pandas DataFrame holding the observed variables
data_np = data.to_numpy()
cg = pc(data_np)

Result below: G1

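from causallearn.search.ConstraintBased.PC import pc
from causallearn.utils.PCUtils.BackgroundKnowledge import BackgroundKnowledge

forbidden_edges = []
# Index pairs (cause, effect); column order is {"a": 0, "b": 1, "c": 2, "d": 3}
required_edges = [(0, 1), (2, 3), (1, 2)]
data_np = data.to_numpy()
cg = pc(data_np)  # first run, used only to obtain the node objects
nodes = cg.G.get_nodes()

bk = BackgroundKnowledge()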
for (node_1_idx, node_2_idx) in forbidden_edges:
    bk.add_forbidden_by_node(nodes[node_1_idx], nodes[node_2_idx])

for (node_1_idx, node_2_idx) in required_edges:
    bk.add_required_by_node(nodes[node_1_idx], nodes[node_2_idx])

cg_with_knowledge = pc(data_np, background_knowledge=bk)

Result below: G2

I was expecting something more like this (result from another package): G3

zhi-yi-huang commented 1 year ago

Hi @sirineBS! When orienting based on background knowledge, the algorithm only orients undirected edges that already exist in the causal graph; it does not add new edges. For example, in G2 it orients a-b to a->b but does nothing for b-c and c-d, because those undirected edges were removed by the skeleton search in the PC algorithm.
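The behaviour described above can be illustrated with a small toy model. This is only a sketch in plain Python, not causal-learn's actual implementation: the function `orient_with_knowledge` and the example skeleton are hypothetical, and the skeleton assumes that a-b is the only one of the three required adjacencies that survived the skeleton search.

```python
def orient_with_knowledge(skeleton, required):
    """Toy model: orient a required edge only if its adjacency is still in the skeleton."""
    oriented, skipped = [], []
    for cause, effect in required:
        if frozenset((cause, effect)) in skeleton:
            oriented.append((cause, effect))  # undirected edge exists, so it can be oriented
        else:
            skipped.append((cause, effect))   # adjacency was removed earlier, nothing to orient
    return oriented, skipped


# Hypothetical skeleton: of the required adjacencies, only a-b is present.
skeleton = {frozenset(("a", "b"))}
required = [("a", "b"), ("c", "d"), ("b", "c")]

oriented, skipped = orient_with_knowledge(skeleton, required)
print(oriented)  # [('a', 'b')]
print(skipped)   # [('c', 'd'), ('b', 'c')]
```

Under this model, required edges whose adjacency is missing are silently ignored, which matches the observed G2.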

sirineBS commented 1 year ago

Hi @zhi-yi-huang, thank you very much for your answer! Do you know whether there is a way to get the result I want with the causal-learn package? Basically, instead of running the PC search and then completing some edge directions with background knowledge, I would like to first create edges from domain knowledge and then complete the discovery process with the PC algorithm.

zhi-yi-huang commented 1 year ago

Perhaps you could merge the causal edges from domain knowledge into the result of the PC algorithm. However, this may introduce cycles (loops) into the causal graph, and I'm not sure whether it is theoretically sound.
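A minimal sketch of that merge idea, again in plain Python rather than causal-learn's own graph classes. The helpers `merge_required_edges` and `has_cycle` and the example PC edges are hypothetical. The sketch assumes the PC output has already been reduced to a set of directed (cause, effect) pairs, and it lets a required edge override an opposite orientation. It only shows how a cycle could be detected after merging; it says nothing about whether the merged graph is statistically justified.

```python
def merge_required_edges(directed_edges, required):
    """Return the union of directed edges and required edges (required edges win conflicts)."""
    merged = set(directed_edges)
    for cause, effect in required:
        merged.discard((effect, cause))  # drop the opposite orientation, if present
        merged.add((cause, effect))
    return merged


def has_cycle(edges):
    """Detect a directed cycle with a depth-first search over (cause, effect) pairs."""
    graph = {}
    for cause, effect in edges:
        graph.setdefault(cause, []).append(effect)
        graph.setdefault(effect, [])
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GRAY
        for child in graph[node]:
            if color[child] == GRAY:  # back edge to the current path: cycle
                return True
            if color[child] == WHITE and visit(child):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)


required = [("a", "b"), ("c", "d"), ("b", "c")]
pc_edges = {("a", "b"), ("d", "a")}  # hypothetical PC output containing d -> a

merged = merge_required_edges(pc_edges, required)
print(has_cycle(merged))  # True: a -> b -> c -> d -> a
```

If the check reports a cycle, the merged graph is not a DAG, so one of the conflicting edges would have to be revisited before using the graph downstream.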