jpvert / tigress

Trustful Inference of Gene REgulation using Stability Selection

Accounting for non-normality #6

Open OceaneCsn opened 3 years ago

OceaneCsn commented 3 years ago

Hello, thank you for developing TIGRESS, it is such an interesting framework. Since linear regression assumes normally distributed noise, do you know whether it can safely be used on RNA-Seq counts, which are better fit by Poisson or negative binomial distributions? Would it be preferable to log-transform the data before running TIGRESS?

Out of curiosity, have you ever tried replacing the lars function in the code with a function that fits sparse generalized linear models, such as dglars, to model expression counts as Poisson-distributed, for example?

Best regards

jpvert commented 3 years ago

Thanks for your comments! As you correctly point out, the regression model implemented is better suited to data with Gaussian noise, so for RNA-seq data I would suggest log-transforming first (more precisely, something like log(count+1)). We did not try replacing the lars function with a more general sparse generalized linear model, but it makes complete sense for modelling count data. Cheers
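
For reference, the suggested log(count+1) transform can be sketched as follows. This is only an illustration in plain Python, not part of TIGRESS (which is an R package); the function name `log_transform_counts` and the toy matrix are assumptions made up for the example.

```python
import math

def log_transform_counts(counts):
    """Return log(count + 1) for every entry of a count matrix (list of rows)."""
    # math.log1p(c) computes log(1 + c) and stays accurate for small c;
    # a zero count maps to exactly 0.0 instead of -infinity.
    return [[math.log1p(c) for c in row] for row in counts]

# Hypothetical toy matrix: 2 genes (rows) x 3 samples (columns).
counts = [[0, 9, 99], [3, 0, 1]]
logged = log_transform_counts(counts)
```

The transformed matrix would then be passed to TIGRESS in place of the raw counts.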

OceaneCsn commented 3 years ago

Thank you very much for your quick answer! Maybe I'll try to see if I can adapt the part of the code with lars in that direction. Cheers
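
To make the idea of a sparse Poisson model more concrete, here is a minimal sketch of L1-penalized Poisson regression (log link) fitted by proximal gradient descent with backtracking. It is not the dglars algorithm and not TIGRESS code: it is a swapped-in, simpler technique that only illustrates what a count-based sparse regression step could look like. All names (`soft_threshold`, `poisson_nll`, `l1_poisson`) and the toy data are hypothetical.

```python
import math

def soft_threshold(z, t):
    """Proximal operator of t * |z|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def poisson_nll(X, y, beta, intercept):
    """Mean negative Poisson log-likelihood (log link), up to a constant."""
    total = 0.0
    for row, yi in zip(X, y):
        eta = intercept + sum(b * x for b, x in zip(beta, row))
        try:
            total += math.exp(eta) - yi * eta
        except OverflowError:
            return math.inf  # forces the line search to shrink the step
    return total / len(y)

def l1_poisson(X, y, lam, n_iter=500):
    """Fit an L1-penalized Poisson regression; the intercept is not penalized."""
    n, p = len(X), len(X[0])
    beta, intercept, step = [0.0] * p, 0.0, 1.0
    for _ in range(n_iter):
        # Gradient of the smooth part: mean of (mu_i - y_i) * x_i.
        resid = []
        for row, yi in zip(X, y):
            eta = intercept + sum(b * x for b, x in zip(beta, row))
            resid.append(math.exp(eta) - yi)
        g0 = sum(resid) / n
        g = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        f_old = poisson_nll(X, y, beta, intercept)
        while True:
            # Gradient step followed by soft-thresholding of the coefficients.
            new_b0 = intercept - step * g0
            new_beta = [soft_threshold(b - step * gj, step * lam)
                        for b, gj in zip(beta, g)]
            d0 = new_b0 - intercept
            d = [nb - b for nb, b in zip(new_beta, beta)]
            # Quadratic upper bound used as the backtracking acceptance test.
            bound = (f_old + g0 * d0 + sum(gj * dj for gj, dj in zip(g, d))
                     + (d0 * d0 + sum(dj * dj for dj in d)) / (2 * step))
            if poisson_nll(X, y, new_beta, new_b0) <= bound + 1e-12:
                break
            step *= 0.5  # step is only ever decreased in this sketch
        beta, intercept = new_beta, new_b0
    return intercept, beta

# Hypothetical toy data: the target count grows with the first predictor.
X = [[0, 1], [1, 0], [2, 1], [3, 0]]
y = [1, 2, 4, 9]
b0_sparse, beta_sparse = l1_poisson(X, y, lam=1e6)   # heavy penalty
b0_fit, beta_fit = l1_poisson(X, y, lam=0.01)        # light penalty
```

With a very large penalty every coefficient is shrunk to zero and only the intercept is fitted; with a small penalty the first predictor receives a positive coefficient. In a TIGRESS-like setting, the order in which predictors enter as the penalty decreases would play the role of the LARS path, but how well that interacts with stability selection is untested here.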