Open djinnome opened 3 years ago
From Figure 1(a) of "On the Testable Implications of Causal Models with Hidden Variables", the ADMG is represented in causaleffect as:
library(causaleffect)
library(igraph)
library(ggm)
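# Directed edges a -> b -> c -> d; the pair b -> d, d -> b encodes the bidirected edge b <-> d
g <- graph.formula(a -+ b, b -+ c, c -+ d, b -+ d, d -+ b, simplify = FALSE)
# Mark edges 4 and 5 (b -> d and d -> b) as unobserved ("U")
g <- set.edge.attribute(graph = g, name = "description", index = c(4,5), value = "U")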
and the Verma constraint is:
verma.constraints(g)
[[1]]
[[1]]$rhs.cfactor
[1] "Q[\\{d\\}](c,d)"
[[1]]$rhs.expr
[1] "\\sum_{u_{1},c}P(d|u_{1},c)P(c)P(u_{1})"
[[1]]$lhs.cfactor
[1] "\\sum_{b}Q[\\{b,d\\}](a,b,c,d)"
[[1]]$lhs.expr
[1] "\\sum_{b}P(d|a,b,c)P(b|a)"
[[1]]$vars
[1] "a"
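Read together, $lhs.expr and $vars say that \sum_{b}P(d|a,b,c)P(b|a) does not depend on a. As a sanity check, here is a minimal Python sketch that verifies this numerically on one randomly parameterized binary model. The model itself is an assumption made for illustration: a single hidden variable u stands in for the bidirected edge b <-> d, and all names (p_u, p_b, verma_lhs, ...) are hypothetical, not part of causaleffect or y0.

```python
import itertools
import random

random.seed(0)


def rand_prob():
    """Random probability kept away from 0 and 1 so no conditional is undefined."""
    return random.uniform(0.1, 0.9)


def pr(p_one, value):
    """Probability that a binary variable equals `value`, given P(value = 1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


# Assumed model: a -> b -> c -> d, with hidden u -> b and u -> d (the b <-> d edge).
p_u = rand_prob()
p_a = rand_prob()
p_b = {(a, u): rand_prob() for a in (0, 1) for u in (0, 1)}
p_c = {b: rand_prob() for b in (0, 1)}
p_d = {(c, u): rand_prob() for c in (0, 1) for u in (0, 1)}


def joint(a, b, c, d):
    """Observed joint P(a, b, c, d), with the hidden u summed out."""
    return sum(
        pr(p_u, u) * pr(p_a, a) * pr(p_b[a, u], b) * pr(p_c[b], c) * pr(p_d[c, u], d)
        for u in (0, 1)
    )


def marginal(**fixed):
    """Marginal probability of the observed variables named in `fixed`."""
    total = 0.0
    for a, b, c, d in itertools.product((0, 1), repeat=4):
        values = {"a": a, "b": b, "c": c, "d": d}
        if all(values[name] == value for name, value in fixed.items()):
            total += joint(a, b, c, d)
    return total


def verma_lhs(a, c, d):
    """Evaluate sum_b P(d | a, b, c) P(b | a) from the observed joint only."""
    total = 0.0
    for b in (0, 1):
        p_d_given_abc = marginal(a=a, b=b, c=c, d=d) / marginal(a=a, b=b, c=c)
        p_b_given_a = marginal(a=a, b=b) / marginal(a=a)
        total += p_d_given_abc * p_b_given_a
    return total


# Largest difference between a = 0 and a = 1 over all values of (c, d).
max_gap = max(
    abs(verma_lhs(0, c, d) - verma_lhs(1, c, d))
    for c, d in itertools.product((0, 1), repeat=2)
)
assert max_gap < 1e-9
```

The check only uses the observed joint, so it is the kind of assertion that could be run against data rather than against the hidden-variable model.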
So for the Q construct, do we need to introduce a new kind of DSL element, or would it be enough to just have another named tuple representing the contents of this kind of expression for this algorithm?
Every probabilistic expression has an implicit Q type associated with it. It would be really cool if each probabilistic expression knew its Q type, and if there were a predicate that could answer whether an expression is of a particular Q type.
Sincerely,
Jeremy
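To make the named-tuple option concrete, here is a rough Python sketch that mirrors the five slots printed by verma.constraints(). The class name, field names, and the is_independent_of predicate are all hypothetical; this is not the structure that already exists in y0, and a real version would presumably hold DSL expressions rather than LaTeX strings.

```python
from typing import NamedTuple, Tuple


class VermaConstraintSketch(NamedTuple):
    """Hypothetical record mirroring the slots printed by verma.constraints()."""

    rhs_cfactor: str
    rhs_expr: str
    lhs_cfactor: str
    lhs_expr: str
    variables: Tuple[str, ...]

    def is_independent_of(self, node: str) -> bool:
        """Whether this constraint states independence from the given node."""
        return node in self.variables


# The single constraint reported for Figure 1(a), copied from the R output.
first = VermaConstraintSketch(
    rhs_cfactor=r"Q[\{d\}](c,d)",
    rhs_expr=r"\sum_{u_{1},c}P(d|u_{1},c)P(c)P(u_{1})",
    lhs_cfactor=r"\sum_{b}Q[\{b,d\}](a,b,c,d)",
    lhs_expr=r"\sum_{b}P(d|a,b,c)P(b|a)",
    variables=("a",),
)

assert first.is_independent_of("a")
assert not first.is_independent_of("b")
```

A plain record like this keeps the algorithm output separate from the DSL; whether the Q construct should instead become its own DSL element is still the open question.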
Here is an example of a graph with multiple Verma constraints, where the first Verma constraint contains multiple variables in the $vars slot.
This means that $Q[\{e\}](d,e) \perp\!\!\!\perp b,c$, where
$$Q[\{e\}](d,e) = \frac{Q[\{c,e\}](b,c,d,e)}{\sum_{e}Q[\{c,e\}](b,c,d,e)} = \frac{\sum_{a}P(e|a,b,c,d)P(c|a,b)P(a)}{\sum_{a,e}P(e|a,b,c,d)P(c|a,b)P(a)}$$
(I wish github comments could include latex rendering)
Interestingly, the verma constraint for the same graph has a different denominator in CausalFusion.net!
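# Directed edges a -> b -> c -> d -> e; the pairs (a -> c, c -> a) and (a -> e, e -> a) encode a <-> c and a <-> e
g <- graph.formula(a -+ b, b -+ c, c -+ d, d -+ e, a -+ c, c -+ a, a -+ e, e -+ a, simplify = FALSE)
# Mark edges 5-6 (a <-> c) and 7-8 (a <-> e) as unobserved ("U")
g <- set.edge.attribute(graph = g, name = "description", index = c(5,6), value = "U")
g <- set.edge.attribute(graph = g, name = "description", index = c(7,8), value = "U")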
verma.constraints(g)
[[1]]
[[1]]$rhs.cfactor
[1] "Q[\\{e\\}](d,e)"
[[1]]$rhs.expr
[1] "\\sum_{u_{2},d}P(e|u_{2},d)P(d)P(u_{2})"
[[1]]$lhs.cfactor
[1] "\\frac{Q[\\{c,e\\}](b,c,d,e)}{\\sum_{e}Q[\\{c,e\\}](b,c,d,e)}"
[[1]]$lhs.expr
[1] "\\frac{\\sum_{a}P(e|a,b,c,d)P(c|a,b)P(a)}{\\sum_{a,e}P(e|a,b,c,d)P(c|a,b)P(a)}"
[[1]]$vars
[1] "b" "c"
[[2]]
[[2]]$rhs.cfactor
[1] "Q[\\{a,e\\}](a,d,e)"
[[2]]$rhs.expr
[1] "\\sum_{u_{2},d,u_{1}}P(e|u_{2},d)P(a|u_{1},u_{2})P(u_{1})P(d)P(u_{2})"
[[2]]$lhs.cfactor
[1] "\\sum_{c}Q[\\{a,c,e\\}](a,b,c,d,e)"
[[2]]$lhs.expr
[1] "\\sum_{c}P(e|a,b,c,d)P(c|a,b)P(a)"
[[2]]$vars
[1] "b"
[[3]]
[[3]]$rhs.cfactor
[1] "Q[\\{e\\}](d,e)"
[[3]]$rhs.expr
[1] "\\sum_{u_{2},d}P(e|u_{2},d)P(d)P(u_{2})"
[[3]]$lhs.cfactor
[1] "\\sum_{a,c}Q[\\{a,c,e\\}](a,b,c,d,e)"
[[3]]$lhs.expr
[1] "\\sum_{a,c}P(e|a,b,c,d)P(c|a,b)P(a)"
[[3]]$vars
[1] "b"
Input: an ADMG.
Output: a probability expression, together with the set of nodes it is independent of.
Note: this algorithm has already been implemented in the R package causaleffect. In #31, the causaleffect implementation was wrapped and made available through the y0 interface. However, we would like to provide our own implementation, since doing so will give us insight that cannot be found by reading the R code.
There is already a data structure that represents a Verma constraint in https://github.com/y0-causal-inference/y0/blob/main/src/y0/struct.py. Further, the list of Verma constraints with several example graphs from Causal Fusion can be imported from https://github.com/y0-causal-inference/y0/blob/main/src/y0/examples.py
Both of these allow for easy test-driven development.
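As one possible first building block for a native implementation, written in that test-first spirit: the sets in Q[\{b,d\}] and Q[\{a,c,e\}] in the outputs above are exactly the groups of nodes linked by bidirected edges in the two graphs (their c-components). The sketch below computes those groups in plain Python and asserts them against both examples. The function name and the edge-list input format are assumptions for illustration and do not reflect the y0 graph API.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple


def c_components(
    nodes: Iterable[str], bidirected: Iterable[Tuple[str, str]]
) -> Set[FrozenSet[str]]:
    """Group nodes into connected components of the bidirected part of an ADMG."""
    nodes = list(nodes)
    neighbors: Dict[str, Set[str]] = {node: set() for node in nodes}
    for left, right in bidirected:
        neighbors[left].add(right)
        neighbors[right].add(left)

    components: Set[FrozenSet[str]] = set()
    seen: Set[str] = set()
    for start in nodes:
        if start in seen:
            continue
        # Depth-first search along bidirected edges only.
        component: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        components.add(frozenset(component))
    return components


# Figure 1(a): the only bidirected edge is b <-> d.
figure_1a = c_components("abcd", [("b", "d")])
assert figure_1a == {frozenset("a"), frozenset("bd"), frozenset("c")}

# Second graph in this issue: a <-> c and a <-> e.
second_graph = c_components("abcde", [("a", "c"), ("a", "e")])
assert second_graph == {frozenset("ace"), frozenset("b"), frozenset("d")}
```

Directed edges are deliberately ignored here; they would matter in later steps (for example, topological ordering), which this sketch does not attempt.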