py-why / causal-learn

Causal Discovery in Python. It also includes (conditional) independence tests and score functions.
https://causal-learn.readthedocs.io/en/latest/
MIT License
1.04k stars 174 forks

How to plot GES Causal Paths #137

Closed MattFill closed 8 months ago

MattFill commented 9 months ago

I have been experimenting with the GES algorithm to estimate causal paths from cross-sectional data. The following code produces the graph attached. I am wondering how I can estimate the causal direction between nodes from GES and visualize this?

# Set up

from causallearn.search.ScoreBased.GES import ges

data = df_dx[['Bio','Psycho','Dx','NumberChronicPainTypes_T0']]

# default parameters

Record = ges(data)

# or customized parameters

scoring = 'local_score_BIC'
maxP = None  # maximum number of parents when searching the graph
parameters = None
Record = ges(data, scoring, maxP, parameters)

# Visualization using pydot

from causallearn.utils.GraphUtils import GraphUtils
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import io

pyd = GraphUtils.to_pydot(Record['G'], labels=['Bio','Psycho','Dx','NumberChronicPainTypes_T0'])
tmp_png = pyd.create_png(f="png")
fp = io.BytesIO(tmp_png)
img = mpimg.imread(fp, format='png')
plt.axis('off')
plt.imshow(img)
plt.show()

# or save the graph

pyd.write_png('simple_test.png')

[Attached image: simple_test]

kunwuz commented 9 months ago

Hi, thanks for the question. The output of GES is a Markov equivalence class, which means it may contain edges whose direction the algorithm cannot determine. It seems that the algorithm cannot determine any direction for your data.
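To check which edges GES did orient, one option is to read the endpoint matrix in `Record['G'].graph` directly. The sketch below assumes the encoding described in the causal-learn documentation (`graph[i][j] == -1` and `graph[j][i] == 1` for `i --> j`; both `-1` for `i --- j`). The helper name `describe_edges` and the example matrix are hypothetical and written by hand for illustration; they are not output from the data in this thread.

```python
def describe_edges(adj, labels):
    """Split the adjacent pairs of a CPDAG endpoint matrix into
    directed and undirected edges.

    Assumed encoding (per the causal-learn docs):
      adj[i][j] == -1 and adj[j][i] ==  1  ->  i --> j
      adj[i][j] == -1 and adj[j][i] == -1  ->  i --- j
    """
    directed, undirected = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            a, b = int(adj[i][j]), int(adj[j][i])
            if a == -1 and b == 1:
                directed.append((labels[i], labels[j]))    # i --> j
            elif a == 1 and b == -1:
                directed.append((labels[j], labels[i]))    # j --> i
            elif a == -1 and b == -1:
                undirected.append((labels[i], labels[j]))  # i --- j
    return directed, undirected


# Hand-written example: Bio --- Psycho, Psycho --> Dx
example_adj = [
    [0, -1, 0],
    [-1, 0, -1],
    [0, 1, 0],
]
directed, undirected = describe_edges(example_adj, ["Bio", "Psycho", "Dx"])
print("directed:", directed)
print("undirected:", undirected)
```

With a fitted model, the call would presumably be `describe_edges(Record['G'].graph, labels)`; an empty `directed` list corresponds to the situation described above, where no direction could be determined.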

If you would like a direction for every edge, you may consider methods based on constrained functional causal models, such as LiNGAM. A usage example of LiNGAM can be found in this notebook.
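As a rough sketch of the LiNGAM route, the snippet below wraps the `ICALiNGAM` class from `causallearn.search.FCMBased.lingam` and turns its `adjacency_matrix_` into a list of directed edges. It assumes the usual LiNGAM convention that entry `B[i][j]` is the coefficient of variable `j` in the equation for variable `i` (so a nonzero entry means `j --> i`). The helper names, the `threshold` parameter, and the example matrix are hypothetical; `fit_ica_lingam` is defined but not called here because it needs causal-learn and real data.

```python
def lingam_edges(B, labels, threshold=0.0):
    """List directed edges (cause, effect, weight) from a LiNGAM
    adjacency matrix, assuming B[i][j] is the effect of j on i."""
    edges = []
    for i, row in enumerate(B):
        for j, weight in enumerate(row):
            if i != j and abs(weight) > threshold:
                edges.append((labels[j], labels[i], float(weight)))
    return edges


def fit_ica_lingam(X):
    """Fit ICA-based LiNGAM on a NumPy array X of shape
    (n_samples, n_features). Requires causal-learn to be installed."""
    from causallearn.search.FCMBased import lingam

    model = lingam.ICALiNGAM()
    model.fit(X)
    return model.adjacency_matrix_


# Hand-written matrix for illustration only: Bio --> Psycho with weight 0.7
example_B = [
    [0.0, 0.0],
    [0.7, 0.0],
]
edges = lingam_edges(example_B, ["Bio", "Psycho"])
print(edges)
```

Note that LiNGAM's identifiability rests on linear relations with non-Gaussian noise, so the directions it returns are only as trustworthy as that assumption is for the data at hand.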

MattFill commented 9 months ago

Thank you for your comment. I've tried the LiNGAM model and I do indeed obtain a directed graph. I am wondering if these models permit bi-directional paths between nodes (i.e., a causal path from bio -> psycho as well as psycho -> bio)?

[Attached image: Screenshot 2023-10-06 at 12 50 50 PM]

kunwuz commented 9 months ago

No, LiNGAM returns only a directed acyclic graph. A bidirectional path like the one you describe would introduce a cycle. FCI permits bidirected edges, which may be useful.
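For reading FCI output, it may help to decode the endpoint marks of the returned graph. The sketch below assumes the encoding given in the causal-learn documentation, where `graph[i][j]` holds the mark at node `i`'s end of the edge between `i` and `j` (`-1` tail, `1` arrowhead, `2` circle). The helper `pag_edges` and the example matrix are hypothetical. Keep in mind that a bidirected edge `<->` in an FCI result is normally read as evidence of an unmeasured common cause, not as causation running in both directions.

```python
# Mark at the left node's end and at the right node's end of an edge.
LEFT_MARK = {-1: "-", 1: "<", 2: "o"}
RIGHT_MARK = {-1: "-", 1: ">", 2: "o"}


def pag_edges(adj, labels):
    """Render each edge of a PAG endpoint matrix as a string such as
    'A --> B', 'A <-> B', 'A o-> B' or 'A o-o B'."""
    rendered = []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            near_i, near_j = int(adj[i][j]), int(adj[j][i])
            if near_i == 0 and near_j == 0:
                continue  # no edge between i and j
            arrow = f"{LEFT_MARK[near_i]}-{RIGHT_MARK[near_j]}"
            rendered.append(f"{labels[i]} {arrow} {labels[j]}")
    return rendered


# Hand-written example: Bio o-o Psycho, Psycho <-> Dx
example_pag = [
    [0, 2, 0],
    [2, 0, 1],
    [0, 1, 0],
]
pag_strings = pag_edges(example_pag, ["Bio", "Psycho", "Dx"])
print(pag_strings)
```

With causal-learn installed, the matrix would presumably come from `g, edges = fci(data)` (imported from `causallearn.search.ConstraintBased.FCI`), followed by `pag_edges(g.graph, labels)`.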

MattFill commented 9 months ago

I see. It seems that the FCI algorithm can't determine any direction for my data. I suppose this is inherent in the structure of my data that I will need to investigate. Thank you for your help!

[Attached image]

jdramsey commented 9 months ago

Sorry to interject--do you know if your data is Gaussian, non-Gaussian, linear, nonlinear, etc.?

MattFill commented 9 months ago

Two of my variables are Gaussian (Bio and Psycho), numberofpaintypes is ordinal and Dx is binary. Why do you ask?

jdramsey commented 9 months ago

I assume you're treating the discrete variables as continuous since I don't think causal-learn has a mixed score or test yet. (We were supposed to do that but I don't think it's happened yet.)

With two Gaussian variables and two categorical variables, there is theoretically no way to determine the directions of edges in a CPDAG that aren't already oriented in the CPDAG.

That's why I was wondering.
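The point about Gaussian variables can be illustrated for the simplest case, two variables with a linear-Gaussian relation. The sketch below is an illustration under that assumption only, not the scoring code causal-learn uses: it fits `X -> Y` and `Y -> X` by maximum likelihood and shows that both directions reach the same log-likelihood. Since both models also have the same number of parameters, a BIC-style score cannot prefer one direction. All function names and the simulated data are hypothetical.

```python
import math
import random


def gaussian_max_loglik(variance, n):
    """Maximized log-likelihood of n samples under a Gaussian whose
    variance equals the maximum-likelihood estimate `variance`."""
    return -0.5 * n * (math.log(2 * math.pi * variance) + 1)


def direction_logliks(x, y):
    """Return the maximized log-likelihoods of the linear-Gaussian
    models X -> Y and Y -> X for paired samples x, y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

    # X -> Y: marginal of X, then Y regressed on X (residual variance)
    ll_x_to_y = gaussian_max_loglik(sxx, n) + gaussian_max_loglik(syy - sxy ** 2 / sxx, n)
    # Y -> X: marginal of Y, then X regressed on Y (residual variance)
    ll_y_to_x = gaussian_max_loglik(syy, n) + gaussian_max_loglik(sxx - sxy ** 2 / syy, n)
    return ll_x_to_y, ll_y_to_x


# Simulated data in which X truly causes Y
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(500)]
y = [0.8 * a + rng.gauss(0, 1) for a in x]

ll_xy, ll_yx = direction_logliks(x, y)
print(ll_xy, ll_yx)  # the two values agree up to floating-point error
```

Both expressions reduce to a function of the determinant of the sample covariance matrix, which is why the true direction used to simulate the data leaves no trace in the score.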

kunwuz commented 9 months ago

Yeah, it would be good to have a mixed score or test. I know that there is a team that is interested in contributing a test for mixed-type data. Perhaps the work will start soon :)

BTW, let me know if anyone would like to contribute more scores or tests for mixed cases (maybe degenerate Gaussian?). Right now the only solution in causal-learn might be an approximation using kernel-based methods with a small kernel width.

MattFill commented 9 months ago

I see. Thank you for the insight.
