davidaknowles / leafcutter

Annotation-free quantification of RNA splicing. Yang I. Li, David A. Knowles, Jack Humphrey, Alvaro N. Barbeira, Scott P. Dickinson, Hae Kyung Im, Jonathan K. Pritchard
http://davidaknowles.github.io/leafcutter/
Apache License 2.0

Creating co-splicing networks with leafcutter output? #225

Open JeGrundman opened 1 year ago

JeGrundman commented 1 year ago

Hi,

I hope all is well, and thank you for providing this software. This isn't really a bug; it's a general question about how to build a co-splicing network from LeafCutter output, akin to the gene co-expression networks built with WGCNA, as was done in this paper, on which Dr. Li is a co-author: https://www.nature.com/articles/s41588-018-0238-1#Sec1.

In the papers I've seen, when a gene co-expression network is built with WGCNA, the raw gene counts are first normalized and adjusted for technical covariates using a linear model. In the differential splicing protocol outlined by LeafCutter, however, the raw counts are fit with a Dirichlet-multinomial model, which can also correct for covariates before the intron cluster counts are subjected to a likelihood ratio test. In the paper linked above, I don't see details on whether raw counts or intron usage ratios were used for network construction.
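
To make the linear-model adjustment concrete, here is a minimal sketch of regressing covariates out of a samples-by-features matrix and keeping the residuals. It assumes NumPy is available. The function name `residualize`, the toy `batch` covariate, and the simulated `expr` matrix are all hypothetical, made up for illustration. This is not code from WGCNA or LeafCutter, and it is not necessarily what the linked paper did.

```python
import numpy as np

def residualize(values, covariates):
    """Regress each column of `values` (samples x features) on `covariates`
    (samples x k) plus an intercept, and return the residuals."""
    values = np.asarray(values, dtype=float)
    covariates = np.asarray(covariates, dtype=float)
    # Design matrix: an intercept column followed by the covariates.
    design = np.column_stack([np.ones(len(values)), covariates])
    # Ordinary least squares fit for all features at once.
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    return values - design @ coef

# Toy data (hypothetical): 6 samples, 3 features, one binary batch covariate.
rng = np.random.default_rng(0)
batch = np.array([0, 0, 0, 1, 1, 1], dtype=float).reshape(-1, 1)
expr = rng.normal(size=(6, 3)) + 2.0 * batch  # batch shifts every feature
adjusted = residualize(expr, batch)
```

The residuals in `adjusted` are uncorrelated with `batch` by construction. Whether this kind of adjustment is appropriate for intron counts is exactly the open question here.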

My question is: if I wanted to use covariate-adjusted, normalized counts for WGCNA, what would be the correct input file and analysis pathway to obtain them? It seems from the LeafCutter workflow that these counts shouldn't be processed the way other gene expression data would be. If so, would the proper protocol be to take the raw counts (the perind_numers.counts.gz file produced in the differential splicing workflow) and adjust them with the Dirichlet-multinomial model, or with a linear model?
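
For reference, this is a minimal sketch of the "intron usage ratio" alternative to raw counts: each intron's count divided by the total count of its cluster in that sample. It assumes intron IDs of the form `chrom:start:end:cluster_id`, with the cluster as the last colon-separated field. The function name `usage_ratios` and the toy counts are hypothetical, and the sketch works on an in-memory dictionary rather than parsing perind_numers.counts.gz. It only illustrates the arithmetic and is not a recommendation for what WGCNA input should be.

```python
def usage_ratios(counts):
    """Convert per-intron counts into within-cluster usage ratios.

    `counts` maps an intron ID (assumed form: chrom:start:end:cluster_id)
    to a list of per-sample counts. A ratio is None where the cluster has
    no reads in that sample, since the ratio is undefined there.
    """
    # First pass: sum counts per cluster and sample.
    totals = {}
    for intron_id, row in counts.items():
        cluster = intron_id.rsplit(":", 1)[-1]
        if cluster not in totals:
            totals[cluster] = [0] * len(row)
        totals[cluster] = [t + c for t, c in zip(totals[cluster], row)]

    # Second pass: divide each intron's count by its cluster total.
    ratios = {}
    for intron_id, row in counts.items():
        cluster = intron_id.rsplit(":", 1)[-1]
        ratios[intron_id] = [
            c / t if t > 0 else None
            for c, t in zip(row, totals[cluster])
        ]
    return ratios

# Toy counts (hypothetical): two clusters, three samples.
toy = {
    "chr1:100:200:clu_1": [8, 0, 3],
    "chr1:100:300:clu_1": [2, 0, 9],
    "chr2:500:900:clu_2": [5, 5, 5],
}
ratios = usage_ratios(toy)
```

Samples in which a cluster has zero reads get `None` rather than a guessed value. How to handle those (filtering, imputation) is a separate choice that this sketch does not make.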

Thanks!