kenkoonwong / everyday_is_a_school_day

0 stars 0 forks source link

blog/collider_adjustment/ #2

Open utterances-bot opened 1 year ago

utterances-bot commented 1 year ago

What Happens If Our Model Adjustment Includes A Collider? | Everyday Is A School Day

Beware of what we adjust. As we have demonstrated, adjusting for a collider variable can lead to a false estimate in your analysis. If a collider is included in your model, relying solely on AIC/BIC for model selection may provide misleading results and give you a false sense of achievement.

https://www.kenkoonwong.com/blog/collider_adjustment/

joscani commented 1 year ago

You can adjust for a collider if you fit the complete dag.Richard McElreath call this "full luxury bayes" You can see an example in https://elevanth.org/blog/2021/06/29/regression-fire-and-dangerous-things-3-3/ Or in my blog. in spanish (sorry) https://muestrear-no-es-pecado.netlify.app/2022/02/09/collider-bias/

kenkoonwong commented 1 year ago

You can adjust for a collider if you fit the complete dag.Richard McElreath call this "full luxury bayes" You can see an example in https://elevanth.org/blog/2021/06/29/regression-fire-and-dangerous-things-3-3/ Or in my blog. in spanish (sorry) https://muestrear-no-es-pecado.netlify.app/2022/02/09/collider-bias/

@joscani, I agree with the statement that adjusting for the full DAG would determine if X and Y are d-separated. However, based on the current DAG provided, I'm uncertain if we can achieve that. Happy to discuss! Thanks for the other blog recommendations; I'll review them as well!

joscani commented 1 year ago

Great. I try to estimate full dag . Here is my code using your data


library(cmdstanr)
library(rethinking)

dat <- list(
    Y = df$y,
    X = df$x,
    collider = df$collider,
    W = df$w,
    Z = df$z
)

set.seed(155)

flbi <- ulam(
    alist(
        # Y model
        Y ~ normal( mu_y , sigma_y),
        mu_y <- a_y + bx*X + bw_y * W ,
        # X model
        X ~ normal(mu_x, sigma_x),
        mu_x <- a_x + bz * Z + bw_x * W,

        # collider model
        collider ~ normal( mu_collider , sigma_collider ),
        mu_collider <- a_collider + bx_collider * X + by * Y,

        # priors
        c(a_y,bx, bw_y, a_x,bz, bw_x, a_collider, bx_collider , by) ~ normal( 0 , 1 ),
        c(sigma_y, sigma_x,sigma_collider) ~ exponential( 1 )

    ), data=dat , chains=4 , cores=4 , warmup = 500, iter=2500 , cmdstan=TRUE )

precis(flbi)
> precis(flbi)
                mean   sd  5.5% 94.5% n_eff Rhat4
by             -0.39 0.03 -0.43 -0.34  7139     1
bx_collider    -0.47 0.03 -0.52 -0.42  7460     1
a_collider     -0.02 0.03 -0.07  0.03  9933     1
bw_x            0.22 0.03  0.17  0.27  7873     1
bz              0.54 0.04  0.48  0.60  8013     1
a_x             0.02 0.03 -0.04  0.07  9219     1
bw_y            0.42 0.03  0.37  0.46  9005     1
bx              0.51 0.03  0.47  0.56  9059     1
a_y             0.02 0.03 -0.03  0.07  8875     1
sigma_collider  0.99 0.02  0.95  1.02 10267     1
sigma_x         1.03 0.02  1.00  1.07 10208     1
sigma_y         1.04 0.02  1.00  1.08  9396     1

The model recover correct effect of X over y even condition by collider. Condition in bayesian way, estimating full dag.
If I omit X model (without model Z and W effect on X).

flbi_without_X_model <- ulam(
    alist(
        # Y model
        Y ~ normal( mu_y , sigma_y),
        mu_y <- a_y + bx*X + bw_y * W ,

        # collider model
        collider ~ normal( mu_collider , sigma_collider ),
        mu_collider <- a_collider + bx_collider * X + by * Y,

        # priors
        c(a_y,bx, bw_y, a_collider, bx_collider , by) ~ normal( 0 , 1 ),
        c(sigma_y,sigma_collider) ~ exponential( 1 )

    ), data=dat , chains=4 , cores=4 , warmup = 500, iter=2500 , cmdstan=TRUE )

precis(flbi_without_X_model)


> precis(flbi_without_X_model)
                mean   sd  5.5% 94.5% n_eff Rhat4
by             -0.39 0.03 -0.43 -0.34  6008     1
bx_collider    -0.47 0.03 -0.52 -0.42  6011     1
a_collider     -0.02 0.03 -0.07  0.03  9836     1
bw_y            0.42 0.03  0.37  0.46  7726     1
bx              0.51 0.03  0.47  0.56  7861     1
a_y             0.02 0.03 -0.03  0.07  9319     1
sigma_collider  0.99 0.02  0.95  1.02 10974     1
sigma_y         1.04 0.02  1.00  1.08  8354     1```

This looks great. Thanks for your post. It allows me check full luxury bayes conditioning by collider
kenkoonwong commented 1 year ago

Interesting. I'll have to take a deeper dive in this code. I may have more questions, mainly want to learn more about rethinking package. I have the book and going through now. If I have more questions what is the best way to ask you? Twitter? Thanks for sharing!

joscani commented 1 year ago

Ok. Twitter is good. @joscani is my account. Rethinking course in GitHub l and YouTube videos are cool .

kenkoonwong commented 1 year ago

perfect. thanks for sharing! I've followed you on Twitter