How to use the initial DAG in the RESIT sample code?

cdt15 / lingam

Python package for causal discovery based on LiNGAM.

https://sites.google.com/view/sshimizu06/lingam

MIT License


How to use the initial DAG in the RESIT sample code? #135

Closed FTamas77 closed 1 month ago

FTamas77 commented 2 months ago

For example, we have this code here: https://github.com/cdt15/lingam/blob/master/docs/tutorial/resit.rst

In that tutorial, we create an array:

m = np.array([ [0, 0, 0, 0, 0], [1, 0, 0, 0, 0], [1, 1, 0, 0, 0], [0, 1, 1, 0, 0], [0, 0, 0, 1, 0]])

But we never use it. I am also confused because, after fitting, the only nonzero values in the adjacency matrix are 1. Could you help me clarify the input and output of that code?

ikeuchi-screen commented 2 months ago

Hi @FTamas77 ,

The array m is created only to draw the true DAG, so it is not used when running RESIT.

As with the other algorithms, the dataset is the input and the adjacency matrix is the output. However, because RESIT assumes nonlinear functions, each entry of the adjacency matrix is either 0 or 1.

FTamas77 commented 1 month ago

@ikeuchi-screen Thank you very much for the clarification. So, is it correct to say that a one means there is a correlation between the variables and a zero means there is none (in the adjacency matrix calculated during fitting)?

And I guess model.bootstrap(X, n_sampling=n_sampling) is the right tool if I want to quantify this causal effect, isn't it?

ikeuchi-screen commented 1 month ago

@FTamas77 A one in the adjacency matrix represents an edge in the causal graph, while a zero represents no edge. Edges denote causation rather than correlation.
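To make the edge reading concrete, here is a minimal sketch that lists the edges encoded by the m array from the question. It assumes the row-is-effect, column-is-cause convention (a nonzero entry at [i, j] means x_j -> x_i), which is how the lingam documentation describes its adjacency matrices as far as I know. The helper name list_edges is made up for this illustration and is not part of lingam.

```python
import numpy as np

# The true-DAG array from the question.
m = np.array([
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 0],
])


def list_edges(adjacency):
    """Return (cause, effect) index pairs for every nonzero entry.

    Assumption: adjacency[i, j] != 0 means an edge x_j -> x_i,
    i.e. rows are effects and columns are causes.
    """
    effects, causes = np.nonzero(adjacency)
    return [(int(c), int(e)) for e, c in zip(effects, causes)]


edges = list_edges(m)
for cause, effect in edges:
    print(f"x{cause} -> x{effect}")
```

The same helper can be applied to the 0/1 matrix produced by fitting, since only the positions of the nonzero entries matter.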

Because the relationships between variables are nonlinear, even model.bootstrap cannot estimate causal effects. What bootstrapping can estimate is the probability of each edge.
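As a rough illustration of what "probability of edges" means, the sketch below takes a stack of 0/1 adjacency matrices, one per bootstrap resample, and reports how often each edge appears. The three matrices are made-up toy values, not RESIT output, and edge_probabilities is a hypothetical helper rather than a lingam function. In lingam itself, the object returned by model.bootstrap should offer a comparable summary (I believe through a get_probabilities method, but please check the documentation).

```python
import numpy as np


def edge_probabilities(adjacency_stack):
    """Fraction of resamples in which each entry is nonzero.

    adjacency_stack: array-like of shape (n_sampling, n_vars, n_vars),
    one estimated 0/1 adjacency matrix per bootstrap resample.
    """
    stack = np.asarray(adjacency_stack)
    return (stack != 0).mean(axis=0)


# Toy input (hypothetical): three resamples over two variables.
# The edge x0 -> x1 (entry [1, 0]) shows up in two of the three.
samples = [
    [[0, 0], [1, 0]],
    [[0, 0], [1, 0]],
    [[0, 0], [0, 0]],
]

probs = edge_probabilities(samples)
print(probs)
```

A value near 1 means the edge was found in almost every resample. It says nothing about how strong the causal effect is, which matches the answer above.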

FTamas77 commented 1 month ago

Thank you very much for the clarification.