Passing required domain knowledge using add_required_by_node

sirineBS commented 1 year ago

Hello, I'm new to causal discovery, and I'm not sure whether this is an issue in the causal-learn package or whether I'm using the add_required_by_node function incorrectly. I first built a graph from the data alone (without any background knowledge); let's call it G1. I then wanted to add required edges based on domain expertise to create a new graph, G2. Expected result: G2 contains all directed edges defined by the domain knowledge. Observed result: G2 contains a required directed edge only if that edge was already present in G1; otherwise the directed edge is not created.

Details below:

data:
First, I created a graph without any background knowledge (G1), using this code:

from causallearn.search.ConstraintBased.PC import pc

data_np = data.to_numpy()  # convert the input data to a NumPy array
cg = pc(data_np)           # run PC without any background knowledge (G1)

The result is below:

Second, I wanted to add domain-knowledge rules:
```
rules = [
    ("a", "b", "required"),  # a->b
    ("c", "d", "required"),  # c->d
    ("b", "c", "required"),  # b->c
]
```
I used the add_required_by_node and add_forbidden_by_node methods to add required or forbidden edges and to create the graph G2, using this code:

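from causallearn.search.ConstraintBased.PC import pc
from causallearn.utils.PCUtils.BackgroundKnowledge import BackgroundKnowledge

forbidden_edges = []
required_edges = [(0, 1), (2, 3), (1, 2)]  # {"a": 0, "b": 1, "c": 2, "d": 3}
data_np = data.to_numpy()
cg = pc(data_np)           # first run, used only to obtain the node objects
nodes = cg.G.get_nodes()

# Register the forbidden and required edges as background knowledge
bk = BackgroundKnowledge()
for (node_1_idx, node_2_idx) in forbidden_edges:
    bk.add_forbidden_by_node(nodes[node_1_idx], nodes[node_2_idx])

for (node_1_idx, node_2_idx) in required_edges:
    bk.add_required_by_node(nodes[node_1_idx], nodes[node_2_idx])

# Second run, this time with the background knowledge (G2)
cg_with_knowledge = pc(data_np, background_knowledge=bk)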

The result is below:

I was expecting something more like this (results from another package):

zhi-yi-huang commented 1 year ago

Hi @sirineBS! When orienting based on background knowledge, the algorithm only orients undirected edges that already exist in the causal graph; it does not add new edges. In your example, it orients a-b to a->b in G2 but does nothing for b-c and c-d, because those edges were removed during the skeleton search of the PC algorithm.
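To see which required edges fall into that second case, one option is to compare the required pairs against the adjacency matrix of the graph found without background knowledge. The sketch below is only an illustration: it assumes causal-learn's documented encoding for `cg.G.graph` (`graph[i][j] == -1` and `graph[j][i] == 1` for i -> j, both `-1` for i - j, both `0` for no edge), and the helper name `missing_required_edges` and the toy matrix `g1` are hypothetical, not part of the package or of the data in this issue.

```python
def missing_required_edges(graph, required):
    """Return the required (i, j) pairs that have no edge at all in `graph`.

    `graph` is a square adjacency matrix (nested lists or a NumPy array)
    in which 0 in both directions is assumed to mean "no edge".
    """
    return [(i, j) for i, j in required
            if graph[i][j] == 0 and graph[j][i] == 0]


# Hypothetical toy G1 over a, b, c, d: only the undirected edge a - b survived.
g1 = [
    [0, -1, 0, 0],
    [-1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
required_edges = [(0, 1), (2, 3), (1, 2)]  # a->b, c->d, b->c

missing = missing_required_edges(g1, required_edges)
print(missing)
```

With a real run, the same helper could be called as `missing_required_edges(cg.G.graph, required_edges)`; any pair it returns is one that background knowledge alone cannot orient, because there is no edge left to orient.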

sirineBS commented 1 year ago

Hi @zhi-yi-huang, thank you very much for your answer! Do you know whether there is a way to get the result I want with the causal-learn package? Basically, instead of running the PC search and then completing some edge directions from background knowledge, I would like to first create the edges required by domain knowledge and then complete the discovery process with the PC algorithm.

zhi-yi-huang commented 1 year ago

Perhaps you could merge the causal edges from domain knowledge into the result of the PC algorithm. However, this may introduce loops (directed cycles) into the causal graph, and I'm not sure whether it is theoretically sound.
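A minimal sketch of that idea, with a check for the loop problem, might look like the following. It works on a plain adjacency matrix and assumes causal-learn's documented encoding (`graph[i][j] == -1` and `graph[j][i] == 1` for i -> j). The function names `merge_required` and `has_directed_cycle` and the toy matrix `g2` are hypothetical, introduced here for illustration; this is a post-processing step, not something the PC implementation does, and it says nothing about whether the merged graph is statistically justified.

```python
def merge_required(graph, required):
    """Return a copy of `graph` with every required edge i -> j forced in."""
    merged = [list(row) for row in graph]  # copy, so the input is untouched
    for i, j in required:
        merged[i][j] = -1  # tail at i
        merged[j][i] = 1   # arrowhead at j
    return merged


def has_directed_cycle(graph):
    """Depth-first search over directed edges only (undirected ones are ignored)."""
    n = len(graph)
    children = {
        i: [j for j in range(n) if graph[i][j] == -1 and graph[j][i] == 1]
        for i in range(n)
    }
    state = [0] * n  # 0 = unvisited, 1 = on the current path, 2 = finished

    def visit(u):
        state[u] = 1
        for v in children[u]:
            if state[v] == 1:  # back edge: v is already on the current path
                return True
            if state[v] == 0 and visit(v):
                return True
        state[u] = 2
        return False

    return any(state[u] == 0 and visit(u) for u in range(n))


# Hypothetical toy G2 over a, b, c, d: only a -> b is present.
g2 = [
    [0, -1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
merged = merge_required(g2, [(2, 3), (1, 2)])  # add c->d and b->c
print(has_directed_cycle(merged))
```

Running the cycle check after the merge at least flags the case where the forced edges contradict the orientations PC already found, so the conflict can be resolved by hand rather than silently producing a cyclic graph.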

py-why / causal-learn

Passing required domain knowledge using add_required_by_node #116