hetio / hetnetpy

Hetnets in Python (relocated from dhimmel/hetio)
https://het.io/software

Permuting Specific Genes Option #16

Open gwaybio opened 6 years ago

gwaybio commented 6 years ago

I am constructing hetnets in https://github.com/greenelab/interpret-compression and am looking to generate network permutations to use as a null distribution. I am running into the issue of potentially inflated z-scores (see example of similar procedure). Could the inflation be because there are many more genes in the network than what I am comparing against?

For example, if cell 6 doesn't include the total population of genes in the hetnet, won't there be an inflation of artificial zeros in the permuted swap? Would this then cause a deflated null distribution in the matrix multiplication in cell 9?

I am wondering if there could be functionality here to permute a hetnet for only certain genes, rather than only certain metaedges. Perhaps adding a variable nodes_to_include that defaults to all nodes would help. Maybe this addition could happen before deciding to loop over nodes here:

https://github.com/dhimmel/hetio/blob/9d9ef1320ee47609e3c61c9f4918531d3c1c8c96/hetio/permute.py#L15-L18

Would it be of interest to add this functionality here?

An alternative (and perhaps an easier one) would be to regenerate the hetnets in my original scripts so that they only include the genes of interest.
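For illustration only, here is a minimal sketch of what a `nodes_to_include` filter might look like on a plain edge list. It does not use or modify hetio; the function name `split_edges_by_nodes`, the tuple representation of edges, and the toy gene and pathway identifiers are all assumptions made for this example.

```python
def split_edges_by_nodes(edges, nodes_to_include=None):
    """Split (source, target) edges by whether the source node is selected.

    nodes_to_include=None mimics the proposed default of "all nodes".
    Hypothetical helper, not part of hetio.
    """
    if nodes_to_include is None:
        return list(edges), []
    keep = set(nodes_to_include)
    selected = [edge for edge in edges if edge[0] in keep]
    untouched = [edge for edge in edges if edge[0] not in keep]
    return selected, untouched


# Toy gene-to-pathway edges (made up for illustration).
edges = [("G1", "P1"), ("G2", "P1"), ("G3", "P2")]
selected, untouched = split_edges_by_nodes(edges, nodes_to_include={"G1", "G3"})
```

Only the `selected` edges would then be candidates for permutation, while the `untouched` edges would be carried over unchanged. Regenerating the hetnet with only the genes of interest corresponds to dropping `untouched` entirely.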

dhimmel commented 6 years ago

> Could the inflation be because there are many more genes in the network than what I am comparing against?

Let's take a look at the DWPC, P-DWPC (average permuted DWPC), and Z-DWPC distributions to see if you are experiencing some sort of artefact like you're concerned about.
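To make the three quantities concrete, here is a minimal sketch. It assumes that Z-DWPC is an ordinary z-score of the observed DWPC against the DWPCs from the permuted networks, and it uses the population standard deviation; both are assumptions for this example rather than a statement of how any particular pipeline computes it. The name `z_dwpc` and the numbers are made up.

```python
from statistics import mean, pstdev


def z_dwpc(dwpc, permuted_dwpcs):
    """Z-score of an observed DWPC against DWPCs from permuted networks.

    Hypothetical helper: assumes a plain z-score with the population
    standard deviation of the permuted values.
    """
    p_dwpc = mean(permuted_dwpcs)   # P-DWPC: average permuted DWPC
    p_sd = pstdev(permuted_dwpcs)   # spread of the null distribution
    if p_sd == 0:
        return float("nan")         # z-score is undefined without spread
    return (dwpc - p_dwpc) / p_sd


# Made-up values for one source-target pair and one metapath.
observed = 3.0
permuted_values = [1.0, 2.0, 3.0, 2.0]
z = z_dwpc(observed, permuted_values)
```

Plotting the distributions of `observed`, the mean of `permuted_values`, and `z` across all pairs is one way to look for the kind of artefact described above.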

> there are many more genes in the network than what I am comparing against?

I don't think that is a problem. It is correct to permute the entire network even though you'll only be accessing a small portion of the permuted networks. For XSwap to permute properly, it should be swapping across all relationships of a given type.
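As a rough illustration of swapping across all relationships of one type, here is a minimal, self-contained sketch of a degree-preserving edge swap on a bipartite edge list. It is not hetio's implementation; the function name `xswap_sketch`, the fixed seed, and the toy edges are assumptions made for this example.

```python
import random
from collections import Counter


def xswap_sketch(edges, n_swaps, seed=0):
    """Degree-preserving shuffle of (source, target) edges of one type.

    Illustrative only: picks two edges (a, b) and (c, d) and rewires them
    to (a, d) and (c, b) unless that would duplicate an existing edge.
    """
    rng = random.Random(seed)
    edges = list(edges)
    edge_set = set(edges)
    if len(edges) < 2:
        return edges
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        new_1, new_2 = (a, d), (c, b)
        # Skip swaps that change nothing or would create a duplicate edge.
        if a == c or b == d or new_1 in edge_set or new_2 in edge_set:
            continue
        edge_set -= {(a, b), (c, d)}
        edge_set |= {new_1, new_2}
        edges[i], edges[j] = new_1, new_2
    return edges


# Toy gene-to-pathway edges (made up for illustration).
original = [("G1", "P1"), ("G1", "P2"), ("G2", "P2"), ("G3", "P3"), ("G4", "P1")]
permuted = xswap_sketch(original, n_swaps=100)

# Each node keeps its degree even though individual edges may move.
source_degrees_match = Counter(s for s, _ in permuted) == Counter(s for s, _ in original)
target_degrees_match = Counter(t for _, t in permuted) == Counter(t for _, t in original)
```

Because every edge of the type is a candidate for swapping, each gene's degree is preserved against the full set of partners. Restricting the swap to a subset of genes would shrink the pool of partners that those genes can be rewired to.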