SystemsGenetics / KINC

Knowledge Independent Network Construction
MIT License

Philosophical thought about the use of "Source" and "Target" labels #178

Open JohnHadish opened 3 years ago

JohnHadish commented 3 years ago

KINC outputs a "tidy" network in which each edge connects two genes. Gene 1 is labeled "Source" and gene 2 is labeled "Target".

The labels "Source" and "Target" imply causality, but KINC is not capable of predicting causality.

From: Mercatelli, D., Scalambra, L., Triboli, L., Ray, F., & Giorgi, F. M. (2020). Gene regulatory network inference resources: A practical overview. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms, 1863(6), 194430. https://doi.org/10.1016/j.bbagrm.2019.194430

"GRN reconstruction methods based on coexpression (see later) provide a network of relationships with no information on the directionality of the identified interactions. In other words, the correlation between two gene expression levels does not imply a specific causal relationship between the two, nor a direct one."

spficklin commented 3 years ago

Yes, you're right. If I remember correctly, these are named 'Source' and 'Target' for easy compatibility with Cytoscape, which I think expects the columns to have these names. The column that specifies the relationship as 'co' (i.e., co-expression) implies no causality.

spficklin commented 3 years ago

I should add, it would be a lot of work to change it now, as many of our downstream scripts expect these column names. Renaming them would break a lot.

JohnHadish commented 3 years ago

Yeah, I agree that it would be rather destructive to change things, and there honestly isn't a better naming convention besides "gene1" and "gene2", which would probably have the same issue of implied causality.

We might want to put a little note in the documentation about this under the "Plain-text Output Files: Network File" so that users are informed that co-expression does not imply causality.

Suggested addition:

The KINC plain-text output files indicate a "Source" and "Target" gene, but this should not be thought of as causality. These labels are used to make it easy to transfer networks into other programs such as Cytoscape. KINC uses co-expression, which does not imply causality.

spficklin commented 3 years ago

Yep, I think that's good. Can you make that change? It should be easy with the 'Edit on GitHub' link at the top of the RTD page.

Some edits:

The KINC tab-delimited output files (plain, full, and tidy) use "Source" and "Target" columns, which may seem to imply causality. These column labels make integration with other programs, such as Cytoscape, easier, but they should not be considered as implying causality, only correlation.

KINC can be used for more than just genes and co-expression (e.g., metabolites/proteins and co-abundance), so I took those phrases out.
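
For downstream scripts, one way to respect this is to treat each row as an unordered pair, so that the column an element appears in carries no meaning. The following is a minimal sketch, not KINC's own code: only the "Source" and "Target" column names and the 'co' value come from this thread, while the "Interaction" column name, the sample rows, and the `undirected_edges` helper are assumptions made for illustration.

```python
import csv
import io

# Hypothetical tab-delimited excerpt. Only the "Source" and "Target" column
# names and the 'co' value come from the discussion above; the "Interaction"
# column name and the gene IDs are assumptions for illustration.
SAMPLE = (
    "Source\tTarget\tInteraction\n"
    "geneB\tgeneA\tco\n"
    "geneA\tgeneB\tco\n"
    "geneC\tgeneA\tco\n"
)


def undirected_edges(handle):
    """Return the set of edges with endpoint order ignored."""
    reader = csv.DictReader(handle, delimiter="\t")
    # A frozenset discards ordering, so (A, B) and (B, A) are the same edge.
    return {frozenset((row["Source"], row["Target"])) for row in reader}


edges = undirected_edges(io.StringIO(SAMPLE))
print(len(edges))
```

Here the first two rows collapse into a single edge because they name the same pair in opposite column order, which is the behavior one would want if the labels imply correlation only.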