owkin / PyDESeq2

A Python implementation of the DESeq2 pipeline for bulk RNA-seq DEA.
https://pydeseq2.readthedocs.io/en/latest/
MIT License

dispersion algorithm #252

Open wangjiawen2013 opened 7 months ago

wangjiawen2013 commented 7 months ago

Hi, the PyDESeq2 Bioinformatics paper states that "Dispersion parameters are first estimated independently for each gene by fitting a negative binomial generalized linear model (GLM)". Since I am not a statistician, I cannot follow the statistical principles behind PyDESeq2 and DEA. However, I would like to know which samples are used to estimate the dispersion. For example, if the control group contains three samples and the treatment group contains another three, which samples are used to calculate the dispersion?
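
For context on the quoted sentence: in the DESeq2 model, each gene gets a single dispersion that is shared by all samples in the design, while the design matrix lets the mean differ between groups. So in the three-versus-three example, all six samples would contribute to the estimate. The sketch below illustrates that idea only and is not PyDESeq2's code. It assumes one gene, size factors of 1, a two-group design, and a plain maximum-likelihood grid search, whereas DESeq2 uses a Cox-Reid adjusted likelihood followed by shrinkage. The names `nb_loglik` and `genewise_dispersion` and the counts are hypothetical.

```python
import math


def nb_loglik(counts, means, alpha):
    """Negative binomial log-likelihood, variance = mu + alpha * mu**2."""
    r = 1.0 / alpha
    return sum(
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
        for k, mu in zip(counts, means)
    )


def genewise_dispersion(counts, groups, n_grid=2000):
    """Estimate ONE dispersion for a gene from ALL samples (simplified sketch)."""
    # Fitted mean of each sample = mean count of its own group
    # (assumes all size factors are 1; floored to avoid log(0)).
    sums, sizes = {}, {}
    for k, g in zip(counts, groups):
        sums[g] = sums.get(g, 0.0) + k
        sizes[g] = sizes.get(g, 0) + 1
    means = [max(sums[g] / sizes[g], 1e-8) for g in groups]

    # Grid search over log(alpha); the likelihood sums over every sample,
    # control and treatment alike.
    lo, hi = math.log(1e-8), math.log(10.0)
    best_alpha, best_ll = None, -math.inf
    for i in range(n_grid):
        alpha = math.exp(lo + (hi - lo) * i / (n_grid - 1))
        ll = nb_loglik(counts, means, alpha)
        if ll > best_ll:
            best_alpha, best_ll = alpha, ll
    return best_alpha


# Hypothetical counts for one gene: three control, three treatment samples.
groups = ["control"] * 3 + ["treatment"] * 3
counts = [100, 110, 90, 200, 260, 150]
alpha = genewise_dispersion(counts, groups)
print(f"single dispersion estimate from all six samples: {alpha:.4f}")
```

Changing any one of the six counts, in either group, changes the estimate, because each sample is compared with its own group's fitted mean and all of those deviations feed the same dispersion parameter.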