[Feature Request]: Analysis of causal diagrams (DAGs)

JoKeyser commented 7 months ago

Description

Statistics benefits from causal reasoning. One important tool is the causal diagram, a.k.a. directed acyclic graph (DAG), a.k.a. causal Bayesian network.

Purpose

JASP should make it easy to create, edit, plot, and analyze causal diagrams. Causal analyses should include things like conditional (in)dependencies, adjustment sets, instrumental variables, etc.

Use-case

Improve inference by making causal assumptions explicit and getting their logical implications. For example, for regression analyses:

What confounds may jeopardize my regression analysis?
What adjustment set do I need to arrive at a certain causal estimate?
Or conversely, what variables must be excluded from my regression, to avoid collider bias or post-treatment bias?
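To illustrate the kind of answer these questions call for, here is a minimal Python sketch of a d-separation check. It is not dagitty's interface: the function names and the toy DAG (Z -> X, Z -> Y, X -> C, Y -> C) are assumptions made up for this example. The test uses the standard moralized-ancestral-graph construction.

```python
from itertools import combinations


def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set()
    stack = list(nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def d_separated(edges, x, y, given=()):
    """True if x and y are d-separated by `given` in the DAG `edges`.

    `edges` is a list of (cause, effect) pairs; x and y must not be in `given`.
    """
    given = set(given)
    parents = {}
    for cause, effect in edges:
        parents.setdefault(effect, set()).add(cause)

    # Step 1: restrict the DAG to x, y, the conditioning set, and their ancestors.
    keep = ancestors(parents, {x, y} | given)

    # Step 2: moralize, i.e. link each node to its parents, "marry" parents
    # that share a child, and drop edge directions.
    neighbours = {node: set() for node in keep}
    for child in keep:
        pa = parents.get(child, set())
        for p in pa:
            neighbours[p].add(child)
            neighbours[child].add(p)
        for a, b in combinations(pa, 2):
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Step 3: look for a path from x to y that avoids the conditioning set.
    seen = set()
    stack = [x]
    while stack:
        node = stack.pop()
        if node == y:
            return False
        if node in seen or node in given:
            continue
        seen.add(node)
        stack.extend(neighbours[node])
    return True


# Hypothetical toy DAG: Z confounds X and Y; C is a collider (common effect).
edges = [("Z", "X"), ("Z", "Y"), ("X", "C"), ("Y", "C")]

print(d_separated(edges, "X", "Y"))                    # False: open path via Z
print(d_separated(edges, "X", "Y", given={"Z"}))       # True: adjusting for Z closes it
print(d_separated(edges, "X", "Y", given={"Z", "C"}))  # False: conditioning on C opens a path
```

In this toy DAG, X has no effect on Y. The confounder Z therefore has to be adjusted for, and the collider C has to be left out of the regression. A JASP module could derive exactly this kind of implication automatically.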

Is your feature request related to a problem?

From my perspective, causal inference is not yet taught enough, nor is it accessible enough, especially for beginners and speakers of languages other than English. JASP is in a good position to improve all that.

Is your feature request related to a JASP module?

Other

Describe the solution you would like

A JASP module around the R package dagitty would go a long way. See https://dagitty.net/ for more information, including a web version and background materials.

Describe alternatives that you have considered

There may be better(-suited) R packages that I don't know about. Presumably, it would be even better to integrate causal implications into statistical modules, like regression, but I imagine this would be much more complicated.

Additional context

No response

TarandeepKang commented 7 months ago

Hello, I just wanted to support this request and supplement it by suggesting a Bayesian counterpart as implemented in bnlearn:

Scutari M (2017). “Bayesian Network Constraint-Based Structure Learning Algorithms: Parallel and Optimized Implementations in the bnlearn R Package.” Journal of Statistical Software, 77(2), 1–20. (https://doi.org/10.18637/jss.v077.i02).

Scutari M (2010). “Learning Bayesian Networks with the bnlearn R Package.” Journal of Statistical Software, 35(3), 1–22. (https://doi.org/10.18637/jss.v035.i03).

This can be fruitfully used, for example, as here:

Bathelt, J., Geurts, H. M., & Borsboom, D. (2022). More than the sum of its parts: Merging network psychometrics and network neuroscience with application in autism. Network Neuroscience, 6(2), 445–466. https://doi.org/10.1162/netn_a_00222

Chen, C., Cao, C., Fang, R., Wang, L., & Borsboom, D. (2024). Revealing the psychopathological pathway linking trauma to post-traumatic stress disorder: Longitudinal network approach. BJPsych Open, 10(1), e2. https://doi.org/10.1192/bjo.2023.615

although, of course, DAGs are very broadly applicable:

Kunicki, Z. J., Smith, M. L., & Murray, E. J. (2023). A Primer on Structural Equation Model Diagrams and Directed Acyclic Graphs: When and How to Use Each in Psychological and Epidemiological Research. Advances in Methods and Practices in Psychological Science, 6(2), 25152459231156085. https://doi.org/10.1177/25152459231156085

Tennant, P. W. G., Murray, E. J., Arnold, K. F., Berrie, L., Fox, M. P., Gadd, S. C., Harrison, W. J., Keeble, C., Ranker, L. R., Textor, J., Tomova, G. D., Gilthorpe, M. S., & Ellison, G. T. H. (2021). Use of directed acyclic graphs (DAGs) to identify confounders in applied health research: Review and recommendations. International Journal of Epidemiology, 50(2), 620–632. https://doi.org/10.1093/ije/dyaa213

tomtomme commented 7 months ago

@JoKeyser @TarandeepKang I assigned Julius as the maintainer of SEM. Or is this a better fit for the Network-Analysis module?

TarandeepKang commented 7 months ago

Hi Thomas/Dr Langkamp,

For what my opinion is worth, I would say I've seen these diagrams used in regression, SEM and in network models, as I mentioned above, so I would tag the people responsible for those. And maybe also EJ, because this seems like something that might need some general oversight et cetera. As Jo mentioned above, there is a variety of packages available; here is another one I've used before:

Breitling, L. P., Duan, C., Dragomir, A. D., & Luta, G. (2021). Using dagR to identify minimal sufficient adjustment sets and to simulate data based on directed acyclic graphs. International Journal of Epidemiology, 50(6), 1772–1777. https://doi.org/10.1093/ije/dyab167

And a pretty helpful review (albeit one that doesn't include Bayesian options): Pitts, A. J., & Fowler, C. R. (2024). Comparison of open-source software for producing directed acyclic graphs. Journal of Causal Inference, 12(1). https://doi.org/10.1515/jci-2023-0031

Best,

Tarandeep

JoKeyser commented 7 months ago

Thanks @TarandeepKang and @tomtomme for chiming in. I agree that it's a good idea to first formulate a strategy for what we want and how to get there, from the top-most level.

@TarandeepKang thank you for all the references; I like how Kunicki, Smith, & Murray (2023) characterize the difference between DAGs and SEMs. Quoting their abstract:

In brief, SEM diagrams are both a conceptual and statistical tool in which a model is drawn and then tested, whereas causal DAGs are exclusively conceptual tools used to help guide researchers in developing an analytic strategy and interpreting results. Causal DAGs are explicitly tools for causal inference, whereas the results of a SEM are only sometimes interpreted causally.

My request is about causal DAGs as conceptual tools, much less as yet another statistical tool. There seem to be ways to combine the two, and it seems that bnlearn is one such extension(?). @TarandeepKang, could you maybe summarize the differences between the R packages we have collected so far? You seem to have a good overview.

With dagitty, the focus is on that conceptual side, helping the analyst to reason about the "causal side" of things, independent of the statistical method. That's why I think it would be good to start with a new JASP module, without integration into statistical modules.

Edit: Maybe to clarify, here is a lecture by Prof. McElreath with some (minimal) examples that illustrate the kind of causal reasoning that I think should be supported by JASP: Statistical Rethinking, lecture 6, "Good and Bad Controls", video on youtube, slides.

TarandeepKang commented 7 months ago

Ah, OK Jo, I now better understand what you were trying to get at by opening this, so I will make a separate request for the network stuff. To clarify the differences between the two packages I mentioned above, I would summarise the points made in the paper I mentioned as follows: dagR has better simulation capabilities than DAGitty because you can model a mix of binary and continuous variables within the same DAG. Unlike DAGitty, dagR does not provide functions to identify conditional independencies, and it cannot directly identify descendants. So ideally, we would find some way to combine these functionalities in the same module.
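To make "a mix of binary and continuous variables within the same DAG" concrete, here is a small Python sketch of simulating data from such a DAG. It does not use or mirror dagR's interface. The DAG (z -> x, z -> y, x -> y), the coefficients, and the function names are all assumptions chosen for this example.

```python
import math
import random


def simulate(n, seed=0):
    """Draw n rows from a hypothetical DAG: z -> x, z -> y, x -> y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)                      # continuous confounder
        p_x = 1.0 / (1.0 + math.exp(-1.5 * z))       # logistic link for the exposure
        x = 1 if rng.random() < p_x else 0           # binary exposure
        y = 2.0 * x + 1.0 * z + rng.gauss(0.0, 1.0)  # continuous outcome, true effect of x is 2.0
        rows.append({"z": z, "x": x, "y": y})
    return rows


def naive_difference(rows):
    """Unadjusted difference in mean outcome between exposed and unexposed rows."""
    exposed = [r["y"] for r in rows if r["x"] == 1]
    unexposed = [r["y"] for r in rows if r["x"] == 0]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


rows = simulate(20000, seed=1)
print(naive_difference(rows))  # lands above the true effect of 2.0 because z is not adjusted for
```

Simulating from a known DAG like this lets a user check whether a proposed adjustment set actually recovers the effect that was built into the data.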

tomtomme commented 7 months ago

@TarandeepKang @JoKeyser Thx Jo and Tarandeep for the insights. Cheers, Thomas (no Dr. Langkamp needed :D)

tomtomme commented 7 months ago

@TarandeepKang @JoKeyser This seems to be a duplicate of https://github.com/jasp-stats/jasp-issues/issues/118 Can you confirm?

Related HowTo for DoodleBUGS https://www.math.kit.edu/stoch/lehre/abib2010w/media/doodle.pdf

JoKeyser commented 7 months ago

@tomtomme :

@TarandeepKang @JoKeyser This seems to be a duplicate of #118 Can you confirm?

No, #118 talks about a graphical UI to define statistical models; this issue here is about causal analysis. There may be ways to combine both, but I would consider that at least a few steps down the road, if at all.
