cdt15 / lingam

Python package for causal discovery based on LiNGAM.
https://sites.google.com/view/sshimizu06/lingam
MIT License

DirectLiNGAM Duplicate no_path Prior Knowledge #34

Closed: gwinch97 closed this issue 2 years ago

gwinch97 commented 2 years ago

Hi there, thank you for your amazing work on this package. Could you help me understand the intuition behind this snippet of code from DirectLiNGAM (lines 116-125)? I am trying to understand why duplicate pairs without a path should cancel out and not be ordered, and consequently be left out of the partial_orders array.

# Check for inconsistencies in pairs without path.
# If there are duplicate pairs without path, they cancel out and are not ordered.
check_pairs = np.concatenate([no_path_pairs, no_path_pairs[:, [1, 0]]])
if len(check_pairs) > 0:
    pairs, counts = np.unique(check_pairs, axis=0, return_counts=True)
    check_pairs = np.concatenate([no_path_pairs, pairs[counts > 1]])
    pairs, counts = np.unique(check_pairs, axis=0, return_counts=True)
    no_path_pairs = pairs[counts < 2]

check_pairs = np.concatenate([path_pairs, no_path_pairs[:, [1, 0]]])

ikeuchi-screen commented 2 years ago

The partial_orders array is created so that DirectLiNGAM can use prior knowledge. The partial order determines which variables are included as search candidates. For example, if there is a path from X1 to X2, X1 is included in the search candidates but X2 is not. Once causal discovery selects X1 from the candidates, X2 is added to the candidates. Two variables with no_path specified in both directions are not included in the partial order because there is no ordering relationship between them.
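
To make the cancellation concrete, here is a minimal sketch that wraps the snippet quoted above in a standalone function. The function name `drop_two_way_no_path_pairs` and the sample pairs are hypothetical and chosen only for illustration; this is not the package's own API.

```python
import numpy as np


def drop_two_way_no_path_pairs(no_path_pairs):
    """Remove no-path pairs that are specified in both directions.

    Hypothetical helper: the body mirrors the snippet quoted in this thread.
    """
    # Stack each pair together with its reversed counterpart.
    check_pairs = np.concatenate([no_path_pairs, no_path_pairs[:, [1, 0]]])
    if len(check_pairs) > 0:
        # A pair counted more than once here was given in both directions.
        pairs, counts = np.unique(check_pairs, axis=0, return_counts=True)
        # Append the two-way pairs to the originals so they are counted twice,
        # then keep only the pairs that still occur exactly once.
        check_pairs = np.concatenate([no_path_pairs, pairs[counts > 1]])
        pairs, counts = np.unique(check_pairs, axis=0, return_counts=True)
        no_path_pairs = pairs[counts < 2]
    return no_path_pairs


# Illustrative input: (0, 1) and (1, 0) are both given, (2, 0) only one way.
sample_pairs = np.array([[0, 1], [1, 0], [2, 0]])
remaining = drop_two_way_no_path_pairs(sample_pairs)
print(remaining)  # only the one-way pair [2, 0] is left
```

The two-way pair carries no ordering information, so only the one-way pair survives to contribute to the partial order.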

gwinch97 commented 2 years ago

Thank you for your response. This makes complete sense. I think what is actually confusing me is the next block of code, which discards the prior knowledge if the concatenation of path_pairs and the remaining no_path_pairs is empty:

if len(check_pairs) == 0:
    # If no pairs are extracted from the specified prior knowledge, 
    # discard the prior knowledge.
    self._Aknw = None
    return None

Consider a scenario where you apply prior knowledge strictly as a symmetric NxN matrix containing only 0s (no path) and -1s (no prior knowledge). Every no-path pair is then specified in both directions, so all of them cancel out and the prior knowledge is always discarded entirely. Is this the desired behaviour?
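
The scenario can be reproduced with a small sketch built from the two snippets quoted in this thread. It assumes that path pairs are the entries equal to 1 and no-path pairs are the entries equal to 0; that extraction step, the function name `extract_check_pairs`, and the example matrices are assumptions made for illustration, not code taken from the package.

```python
import numpy as np


def extract_check_pairs(pk):
    """Return the pairs that would feed the partial order (hypothetical helper)."""
    # Assumption: 1 marks a path, 0 marks no path, -1 marks no prior knowledge.
    path_pairs = np.array(np.where(pk == 1)).transpose()
    no_path_pairs = np.array(np.where(pk == 0)).transpose()

    # Cancel no-path pairs given in both directions (snippet from the thread).
    check_pairs = np.concatenate([no_path_pairs, no_path_pairs[:, [1, 0]]])
    if len(check_pairs) > 0:
        pairs, counts = np.unique(check_pairs, axis=0, return_counts=True)
        check_pairs = np.concatenate([no_path_pairs, pairs[counts > 1]])
        pairs, counts = np.unique(check_pairs, axis=0, return_counts=True)
        no_path_pairs = pairs[counts < 2]

    return np.concatenate([path_pairs, no_path_pairs[:, [1, 0]]])


# Symmetric matrix of 0s and -1s: every no-path pair appears in both directions.
symmetric_pk = np.array([
    [-1, 0, -1],
    [0, -1, 0],
    [-1, 0, -1],
])
symmetric_pairs = extract_check_pairs(symmetric_pk)

# A single one-way no-path entry, for contrast.
one_way_pk = np.full((3, 3), -1)
one_way_pk[1, 0] = 0
one_way_pairs = extract_check_pairs(one_way_pk)

print(len(symmetric_pairs))  # 0, so the quoted check discards the prior knowledge
print(one_way_pairs)         # one pair remains, so the prior knowledge is kept
```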

ikeuchi-screen commented 2 years ago

Thank you for your reply. You're right, the code that ignores prior knowledge when the length of check_pairs is zero is no good.

gwinch97 commented 2 years ago

I'm glad I could be of help! I'm pleased that this was not the desired behaviour, as it was causing all sorts of problems. Thank you once again for your prompt responses. I look forward to an updated version.

ikeuchi-screen commented 2 years ago

Thank you for opening the issue. I'll fix it in the next version.