r-causal / causal-inference-in-R

Causal Inference in R: A book!
https://www.r-causal.org/

16.01: The Parametric G-Formula #95

Open malcolmbarrett opened 2 years ago

LucyMcGowan commented 1 year ago
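library(tidyverse)

# Simulate a confounder c, a binary exposure x, and a continuous outcome y
n <- 1000
d <- tibble(
  c = rnorm(n),
  x = as.numeric(c + rnorm(n) > 0),
  y = x + c + rnorm(n)
)

# Outcome model with an exposure-by-confounder interaction
# (x * c expands to x + c + x:c)
mod <- lm(y ~ x * c, data = d)
coef(mod)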

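# Copies of the data with everyone set to exposed (x = 1) and unexposed (x = 0)
new_data1 <- tibble(
  c = d$c,
  x = 1
)
new_data0 <- tibble(
  c = d$c,
  x = 0
)

# G-computation: average the difference in predicted outcomes
mean(predict(mod, new_data1) - predict(mod, new_data0))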

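# Alternative: fit a separate outcome model within each exposure group
m1 <- lm(y ~ c, data = d[d$x == 1, ])
m0 <- lm(y ~ c, data = d[d$x == 0, ])

# Predict for the full sample from each model and average the difference
mean(predict(m1, d) - predict(m0, d))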
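
The two estimates above should agree up to floating-point error, because a linear model that interacts x with every other term gives the same fitted values as separate models fit within each level of x. A minimal base R sketch of that check follows. The seed and the names c_vals, est_single, and est_stratified are assumptions added for illustration, and data.frame stands in for tibble so the block has no package dependencies.

```r
set.seed(2024)  # assumed seed, only for reproducibility

# Same data-generating process as above, in base R
n <- 1000
c_vals <- rnorm(n)
d <- data.frame(
  c = c_vals,
  x = as.numeric(c_vals + rnorm(n) > 0)
)
d$y <- d$x + d$c + rnorm(n)

# Approach 1: one fully interacted outcome model
mod <- lm(y ~ x * c, data = d)
est_single <- mean(
  predict(mod, transform(d, x = 1)) - predict(mod, transform(d, x = 0))
)

# Approach 2: separate outcome models by exposure group
m1 <- lm(y ~ c, data = d[d$x == 1, ])
m0 <- lm(y ~ c, data = d[d$x == 0, ])
est_stratified <- mean(predict(m1, d) - predict(m0, d))

# Compare the two point estimates
all.equal(est_single, est_stratified)
```

The equivalence holds for point estimates only. The two approaches estimate residual variance differently, so model-based standard errors would not match.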

Big ideas:

malcolmbarrett commented 1 year ago

Another simulation based on the above:

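library(tidyverse)

n <- 10000
# Binary confounder c, which raises the probability of exposure x
c <- rbinom(n, 1, 0.4)
x <- rbinom(n, 1, ifelse(c == 1, 0.5, 0.2))
# The interaction coefficient is 0, so there is no effect modification by c
y <- x + c + 0 * x * c + rnorm(n)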

data <- tibble(
  x,
  y,
  c
)

# Outcome model without an interaction term
mod <- lm(y ~ x + c, data = data)
mod |>
  broom::tidy()

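# Clone the data twice: everyone exposed, then everyone unexposed
data_1 <- data |>
  mutate(x = 1)
data_0 <- data |>
  mutate(x = 0)

# Predicted outcome for each person under each exposure level
data_1$p <- predict(mod, newdata = data_1)
data_0$p <- predict(mod, newdata = data_0)

# Difference in mean predicted outcomes: the g-formula estimate of the ATE
mean(data_1$p) - mean(data_0$p)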
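
With the interaction coefficient at 0, the g-formula estimate here matches the coefficient on x, so the simulation does not yet show why the extra steps matter. The sketch below sets the interaction to a nonzero value and compares the two. The value 0.5, the seed, and the names mod_int, ate_gformula, and ate_truth are assumptions for illustration, not part of the simulation above, and base R replaces the tidyverse calls to keep the block self-contained.

```r
set.seed(1)  # assumed seed, only for reproducibility

n <- 10000
c <- rbinom(n, 1, 0.4)
x <- rbinom(n, 1, ifelse(c == 1, 0.5, 0.2))
# Assumed interaction of 0.5: the effect of x is 1 when c = 0 and 1.5 when c = 1
y <- x + c + 0.5 * x * c + rnorm(n)
data <- data.frame(x, y, c)

# Outcome model that includes the interaction
mod_int <- lm(y ~ x * c, data = data)

# G-computation: predict under x = 1 and x = 0 for everyone, then average
p1 <- predict(mod_int, newdata = transform(data, x = 1))
p0 <- predict(mod_int, newdata = transform(data, x = 0))
ate_gformula <- mean(p1) - mean(p0)

# True ATE under this data-generating process: 1 + 0.5 * P(c = 1)
ate_truth <- 1 + 0.5 * 0.4

data.frame(
  coef_x = unname(coef(mod_int)["x"]),
  g_formula = ate_gformula,
  truth = ate_truth
)
```

In this setup the coefficient on x estimates the effect among people with c = 0 only, while the g-formula averages the conditional effects over the observed distribution of c, which is what the marginal ATE requires.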