m-clark / book-of-models

Spells for everyday living. (also a book coming out in 2024)
https://m-clark.github.io/book-of-models/

causal chapter notes #100

Open malcolmbarrett opened 2 months ago

malcolmbarrett commented 2 months ago

Putting all this in a metaissue. Here is some feedback and notes on the causal chapter:

The two biggest bits of feedback I have:

  1. [x] I feel like you are skirting around the assumptions issue. You keep bringing them up, but I don't think you leave the reader knowing what they are. I think the chapter would be improved with a short section on exchangeability, positivity, and consistency/SUTVA.
  2. [x] I think the chapter flow would be improved if you put the prediction vs explanation revisited section first. It sets up the discussion better, and it doesn't require many of the concepts introduced later.

Other notes and suggestions

you say

For example, if you don’t include features that would have a say in how the treatment comes about (confounders),

you say

  • Generalization: When our goal is generalizing to unseen data, the focus is always on predictive performance. This does not mean we can’t use the model to understand the data though, and explanation could possibly be as important
m-clark commented 2 months ago

Many, many thanks for your thoughts on this @malcolmbarrett ! I'll incorporate as much of this as I can. You've hit on several things that would definitely improve the chapter. We tried to keep it light, but that definitely risks not going into appropriate detail, so we'll fill out some of those areas and otherwise rework things a bit. I'll keep you posted on the updates here!

m-clark commented 4 weeks ago

IN PROGRESS (finally)

I edited the original just to add checkboxes, which I'll tick off based on updates in the read-through-3 branch.

m-clark commented 1 week ago

Causal GH issue update. Feel free to check out the improved result. Here are some details.

I think the chapter flow would be improved if you put the prediction vs explanation revisited section first. It sets up the discussion better, and it doesn't require many of the concepts introduced later.

Agreed and done. This was initially added thanks to the suggestion of @Dpananos but we'd just tacked it on, and it makes much more sense at the beginning.

I think this (statement on generalization) needs to be modified/clarified because of the literature on generalization and transportability of causal effects which has a different meaning than this.

I added a modifier to generalization (as has been discussed previously) with a footnote describing transportability as follows, which I'm happy to modify further if needed:

In causal modeling, there is the notion of transportability, which is the idea that a model can be used in, or generalize to, a different setting than it was trained on. For example, you may see an effect for one demographic group and wish to know whether it holds for another. It is closely related to the notion of external validity, and is related to the concepts we've hit on in our discussion of interaction (@sec-lm-extend-interactions).

I feel like you are skirting around the assumptions issue. You keep bringing them up, but I don't think you leave the reader knowing what they are. I think the chapter would be improved with a short section on exchangeability, positivity, and consistency/SUTVA.

Added a brief discussion, and moved the confounding demo to this section to illustrate it.
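For anyone following along, the flavor of a confounding demo can be sketched in pure Python with simulated data (this is my own illustrative sketch, not code from the book; the variable names and the true effect of 1.0 are assumptions):

```python
import math
import random
import statistics

random.seed(0)

# z confounds both treatment assignment and outcome; true effect of t is 1.0
n = 20000
z = [random.gauss(0, 1) for _ in range(n)]
t = [1 if random.random() < 1/(1 + math.exp(-zi)) else 0 for zi in z]
y = [1.0*ti + 2.0*zi + random.gauss(0, 1) for ti, zi in zip(t, z)]

# naive difference in means ignores z and is badly biased upward
n1 = sum(t)
naive = (sum(yi for yi, ti in zip(y, t) if ti)/n1
         - sum(yi for yi, ti in zip(y, t) if not ti)/(n - n1))

# adjusting for the confounder: compare treated vs. control within
# narrow strata of z, then take a weighted average of the differences
strata = {}
for zi, ti, yi in zip(z, t, y):
    strata.setdefault(round(zi, 1), []).append((ti, yi))

effects, weights = [], []
for rows in strata.values():
    y1 = [yi for ti, yi in rows if ti]
    y0 = [yi for ti, yi in rows if not ti]
    if y1 and y0:
        effects.append(statistics.mean(y1) - statistics.mean(y0))
        weights.append(len(rows))
adjusted = sum(ef*w for ef, w in zip(effects, weights)) / sum(weights)
```

Here `naive` overshoots the true effect substantially, while `adjusted` comes back near 1.0.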

I don't think it's clear from the covid example why it would allow you to do those things. I think a little bit about how it's not related to other parts of the causal structure would help.

Added some text that hopefully clarifies this a bit more.

It seems like you are missing an explanation of the DAG at the beginning and that you still have placeholder text there.

That was initially the chapter plot, but the chapter plots are now something else. Moved it to the graphical model/SEM section, which has been notably updated.

Since you bring up SEM, I often point out to students that DAGs allow you to do the same thing with fewer modelling assumptions. The more paths you model, the higher the risk you've modelled it wrong. With an adjustment set, you may be able to avoid that risk. Related to https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3530375/

Noted, and with reference in a footnote.

This needs to include outcome. Only affecting the treatment is not a confounder.

typo fixed!

I think this needs to be modified/clarified because of the literature on generalization and transportability of causal effects which has a different meaning than this

Added a bit on transportability in a footnote.

It might be worth noting that your example where you make predictions on the counterfactuals is a simple form of g-computation, since you bring it up already.

Done.
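For readers who want the connection made concrete, a minimal pure-Python sketch of that simple form of g-computation might look like this (simulated data; the names and the true effect of 1.5 are my own illustrative choices): fit an outcome model that includes treatment, predict every unit under both treatment levels, and average the contrast.

```python
import math
import random

random.seed(1)

# simulate: confounder z, treatment t, outcome y with true effect 1.5
n = 5000
z = [random.gauss(0, 1) for _ in range(n)]
t = [1 if random.random() < 1/(1 + math.exp(-zi)) else 0 for zi in z]
y = [1.5*ti + 2.0*zi + random.gauss(0, 1) for ti, zi in zip(t, z)]

def ols(X, y):
    """Least squares via the normal equations and Gauss-Jordan elimination."""
    k = len(X[0])
    A = [[sum(r[i]*r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i]*yi for r, yi in zip(X, y)) for i in range(k)]
    for i in range(k):
        piv = A[i][i]
        A[i] = [v/piv for v in A[i]]
        b[i] /= piv
        for j in range(k):
            if j != i:
                f = A[j][i]
                A[j] = [vj - f*vi for vj, vi in zip(A[j], A[i])]
                b[j] -= f*b[i]
    return b

# outcome model: y ~ 1 + t + z
beta = ols([[1.0, ti, zi] for ti, zi in zip(t, z)], y)

# g-computation: predict each unit under t=1 and under t=0,
# then average the per-unit contrasts
pred = lambda ti, zi: beta[0] + beta[1]*ti + beta[2]*zi
ate = sum(pred(1, zi) - pred(0, zi) for zi in z) / n
```

With a linear, no-interaction outcome model this average contrast just reproduces the coefficient on `t`; the g-computation recipe earns its keep once the model has interactions or nonlinearities.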

Is it worth noting that ML methods don’t work out of the box? You need doubly robust methods like TMLE (or some bayesian methods) to get the right standard errors

I think I've been harping on issues of uncertainty estimation in other places in the text - there is even now an uncertainty chapter to be broken off from the estimation chapter (not yet pushed), and some explicit discussion of your point specifically in the 'danger zone'. But after reviewing this chapter, I thought the meta-learner section might be a good place to bring this up, so I put a note there. Unfortunately, many who would use ML are not necessarily interested in the uncertainty of the estimates (to their peril), so it's often less of a perceived issue when ML approaches are used. Using ML without uncertainty in estimates or predictions is like bad design: people just get used to not having it.
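To make the doubly robust point concrete, here is a small pure-Python AIPW sketch (AIPW being a simpler doubly robust cousin of TMLE). Everything here is my own illustrative setup, not from the book: simulated data, the true propensity plugged in for clarity, and a deliberately bad outcome model standing in for a misspecified ML fit.

```python
import math
import random

random.seed(3)

# z confounds treatment and outcome; true treatment effect is 1.0
n = 20000
z = [random.gauss(0, 1) for _ in range(n)]
def e(zi):                         # true propensity score P(T=1 | z)
    return 1/(1 + math.exp(-zi))
t = [1 if random.random() < e(zi) else 0 for zi in z]
y = [1.0*ti + 2.0*zi + random.gauss(0, 1) for ti, zi in zip(t, z)]

# deliberately misspecified outcome model: predict the overall mean for
# everyone, under both treatment levels (a stand-in for a bad ML fit)
ybar = sum(y)/n
m1 = m0 = lambda zi: ybar

# plug-in (g-computation) estimate from the bad outcome model: exactly 0
plug_in = sum(m1(zi) - m0(zi) for zi in z)/n

# AIPW augments the plug-in with inverse-propensity-weighted residuals;
# because the propensity model is correct, the estimate is consistent
# even though the outcome model is junk -- that's the double robustness
aipw = sum(
    m1(zi) - m0(zi)
    + ti*(yi - m1(zi))/e(zi)
    - (1 - ti)*(yi - m0(zi))/(1 - e(zi))
    for ti, zi, yi in zip(t, z, y)
)/n
```

A standard route to the valid standard errors mentioned above is the empirical variance of these per-unit AIPW terms; TMLE refines the same influence-function idea.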

It feels like you have not set up the exercise to succeed. Maybe you can interweave something about DAGs and adjustment sets or leave enough breadcrumbs

All the exercises were half-assed/placeholders for a while, as we weren't sure we were even going to include them. But we decided to improve the ones we had, so I did so here as well. I had some sort of context for the blurb there, but have forgotten it after 6+ months. Check the new one and let me know what you think.

Not suggesting you need to include this, but we also have a section on standard methods vs causal specific models

I added just a hint of this in the revised linear regression section: a note that if we had randomization and no confounding, we could interpret the effect as causal. It also works nicely because I moved the confounding example to the previous section.
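As a tiny illustration of that note (simulated data; the true effect of 2.0 is an arbitrary choice of mine): once treatment is randomized, a covariate can still drive the outcome, but it is no longer a confounder, so the plain difference in means (equivalently, the regression coefficient on treatment) is the causal effect.

```python
import random

random.seed(4)

# randomized treatment: t is independent of z by construction
n = 10000
t = [1 if random.random() < 0.5 else 0 for _ in range(n)]
z = [random.gauss(0, 1) for _ in range(n)]       # affects y, but not t
y = [2.0*ti + 1.0*zi + random.gauss(0, 1) for ti, zi in zip(t, z)]

# no adjustment needed: the difference in means recovers the true effect
n1 = sum(t)
diff = (sum(yi for yi, ti in zip(y, t) if ti)/n1
        - sum(yi for yi, ti in zip(y, t) if not ti)/(n - n1))
```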