r-causal / causal-inference-in-R

Causal Inference in R: A book!
https://www.r-causal.org/

"Burying the lede" -- 03-counterfactuals:Causal Assumptions Simulation:Exchangeability #267

Open lindbrook opened 1 month ago

lindbrook commented 1 month ago

Original:

  1. Exchangeability: We assume that within levels of relevant variables (confounders), exposed and unexposed subjects have an equal likelihood of experiencing any outcome prior to exposure; i.e. the exposed and unexposed subjects are exchangeable. This assumption is sometimes referred to as no unmeasured confounding.

I think the "i.e.," bit might make a better opening. Something along the lines of:

  1. Exchangeability: Subjects are interchangeable. Prior to exposure and across all values of variables (including confounders), all subjects are equally likely to reach all possible outcomes. This assumption is sometimes referred to as no unmeasured confounding.

Actually, I'm not entirely sure how confounders fit into this paragraph - I'm guessing you mean all variables including confounders, not just confounders.

malcolmbarrett commented 1 month ago

Thanks for your suggestions here and elsewhere. I will take them into account when I revisit this chapter soon.

> Actually, I'm not entirely sure how confounders fit into this paragraph

Exchangeability applies to confounding pathways (and other backdoor paths). You can be unbalanced in, say, causes of the outcome that are not causes of the exposure and still have exchangeability for the problem at hand (even though, in this case, including the causes of the outcome will improve precision).

lindbrook commented 1 month ago

For me, the problem is not with the role of confounding variables (though the details you provide are probably worthy of a footnote or a cite/link to where you do discuss this) but with the sentence.

In the original text "relevant variables (confounders)" reads like an "i.e.," or an "AKA" and "relevant variables" are equivalent to "confounders". They are one and the same.

I'm guessing this is not what you want to say. If I'm right, it's something more like "relevant variables" include causal and confounding variables. If so, simply adding an "and" or an "including" would make this clearer:

"relevant variables (and confounders)" or "relevant variables (including confounders)"

Hope this is useful.

malcolmbarrett commented 1 month ago

The real problem here is that this sentence is making you think we mean something other than what we say. I agree it needs to be reworked, but to be clear, the relevant factors are the confounders, at least for a lot of problems. What you need for exchangeability is no open backdoor paths, which mostly means no confounding. We expand on this later and agree we need a cross-reference to it.
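
A minimal simulation sketch of this point, in Python with only the standard library. Everything here is an assumption made for illustration (a single binary confounder `z`, the probabilities, the effect sizes, the function names); none of it comes from the book. With `z` opening a backdoor path, the unadjusted comparison should be biased upward (about 2.2 in expectation under these settings, versus a true effect of 1.0), while comparing within levels of `z` should land close to the true effect.

```python
import random
from statistics import mean

def simulate(n=20000, true_effect=1.0, seed=1):
    """Simulate a binary confounder z that raises both exposure and outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        # Exposure is more likely when z == 1 (the z -> x arrow).
        x = 1 if rng.random() < (0.8 if z else 0.2) else 0
        # Outcome depends on exposure and on z (the z -> y arrow), plus noise.
        y = true_effect * x + 2.0 * z + rng.gauss(0, 1)
        rows.append((z, x, y))
    return rows

def naive_difference(rows):
    """Unadjusted difference in mean outcome, exposed minus unexposed."""
    treated = [y for _, x, y in rows if x == 1]
    control = [y for _, x, y in rows if x == 0]
    return mean(treated) - mean(control)

def stratified_difference(rows):
    """Difference within each level of z, weighted by that level's share of rows."""
    total = 0.0
    for level in (0, 1):
        stratum = [(x, y) for z, x, y in rows if z == level]
        treated = [y for x, y in stratum if x == 1]
        control = [y for x, y in stratum if x == 0]
        total += (mean(treated) - mean(control)) * len(stratum) / len(rows)
    return total

rows = simulate()
naive = naive_difference(rows)
adjusted = stratified_difference(rows)
print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}")
```

In this toy setup exchangeability holds only within levels of `z`, which is why the stratified comparison works and the naive one does not.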

lindbrook commented 1 month ago

OK. Hence why the assumption is sometimes called no unmeasured confounding. I guess I got tripped up by the adjective "relevant" (causal variables are not relevant here).

One last question. Does this then mean that the positivity assumption addresses the interchangeability of subjects with respect to causal variables?

malcolmbarrett commented 1 month ago

Yes, so your feedback is still helpful because the wording is confusing. Thanks.

For positivity, there should be a non-zero chance of treatment for both groups. But yes, the subtlety is that when you're accounting for confounders, it needs to be non-zero for all combinations of the confounders. That is often violated in practice by chance because of the curse of dimensionality, but we hope we can smooth that over with parametric modeling.
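
The "all combinations of the confounders" requirement can be checked empirically. The sketch below is a hypothetical helper (the function name, column names, and toy records are invented for illustration) that lists the confounder combinations in which only exposed or only unexposed subjects were observed.

```python
from collections import defaultdict

def positivity_violations(records, confounders, exposure="exposed"):
    """Return confounder combinations in which only one exposure level is observed."""
    seen = defaultdict(set)
    for row in records:
        key = tuple(row[name] for name in confounders)
        seen[key].add(row[exposure])
    # A stratum violates empirical positivity if it lacks exposed or unexposed rows.
    return sorted(key for key, levels in seen.items() if levels != {0, 1})

# Hypothetical toy data: two binary-ish confounders and a 0/1 exposure.
records = [
    {"age_group": "young", "smoker": 0, "exposed": 1},
    {"age_group": "young", "smoker": 0, "exposed": 0},
    {"age_group": "young", "smoker": 1, "exposed": 1},
    {"age_group": "old", "smoker": 0, "exposed": 0},
    {"age_group": "old", "smoker": 0, "exposed": 1},
    {"age_group": "old", "smoker": 1, "exposed": 0},
]
violations = positivity_violations(records, ["age_group", "smoker"])
print(violations)  # [('old', 1), ('young', 1)]
```

This can only flag combinations that appear in the data at least once. Combinations with no rows at all are also a problem, and their number grows quickly as confounders are added, which is the curse-of-dimensionality issue mentioned above.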

lindbrook commented 1 month ago

Just to be clear. The positivity assumption addresses the interchangeability of subjects with respect to both causal and confounding variables, right?


"Consistency": If this is the term/jargon used in the literature, maybe you're stuck with it. To me, it seems the mathematical sense you describe is stronger than the opening sentence indicates or the word "consistency" implies. The opening sentence reads like the welcome advice that the question you're trying to answer is the question your analysis actually addresses. But the mathematical bit reads like: given the exposure received by the subjects, the observed outcome should be the same as the potential outcome.

"Well defined exposure": Is this a property of the subjects or of the treatments? It seems like that latter, but you also include: "there is no difference between subjects in the delivery of that exposure". Unless it's both, I think this could be clearer.

"No interference": Regarding the numerical example. I guess it makes sense that because causality takes on an "omniscient narrator" perspective, the numerical example (+2 for my partner when they get a different flavor) expresses interference. But because my happiness/utility is not affected, I'm wondering if this example is a bit too subtle. Why not use more symmetry? Both my happiness and that of partner are affected (my partner becomes happier when we get different flavors but I become unhappier). Or is this the point?

"Exchangeability" sounds like it's a property of the subjects, not the treatments - let alone something that's limited to confounders and excludes real causes.

Finally, you could probably state at the outset that in your study, you only get one treatment/serving (one flavor) rather than getting both (e.g., a spoonful of each).
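
Returning to the "No interference" example above, the asymmetry can be sketched in a few lines of Python. The baseline happiness values and function names are hypothetical; only the "+2 for my partner when the flavors differ" rule comes from the discussion. The check asks whether a person's outcome changes when only the other person's flavor changes.

```python
FLAVORS = ("chocolate", "vanilla")

def my_happiness(mine, partners):
    """Hypothetical: my outcome depends only on my own flavor."""
    return 5 if mine == "chocolate" else 3

def partner_happiness(mine, partners):
    """Hypothetical: my partner gains +2 when our flavors differ."""
    base = 4
    return base + (2 if mine != partners else 0)

def has_interference(outcome, own_index):
    """True if the outcome changes when only the other person's flavor changes.

    own_index says which argument of `outcome` is the person's own flavor
    (0 for `mine`, 1 for `partners`).
    """
    for own in FLAVORS:
        values = set()
        for other in FLAVORS:
            args = (own, other) if own_index == 0 else (other, own)
            values.add(outcome(*args))
        if len(values) > 1:
            return True
    return False

print(has_interference(my_happiness, own_index=0))       # False
print(has_interference(partner_happiness, own_index=1))  # True
```

Under these assumed numbers, interference shows up only in the partner's outcome, which matches the one-sided example being questioned. A symmetric version would make `my_happiness` depend on `partners` as well.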