drisso / zinbwave

Clone of the Bioconductor repository for the zinbwave package, see https://bioconductor.org/packages/zinbwave

Why are coefficients for feature covariates sample-dependent? #74

Open clemenskreutz opened 1 year ago

clemenskreutz commented 1 year ago

Dear Davide,

We recently came across ZINB-WaVE and find the approach very convincing. The code is also very clearly written.

However, one thing I don't understand is why, by default, the coefficients for the feature covariates (e.g. gamma_mu) differ between samples. In other words: why does gamma_mu have dimension nColV x nSamples rather than nColV x 1?

Example:

library(zinbwave)

Ns <- 20   # number of samples/cells/columns of the count data
Nf <- 100  # number of features/RNAs/rows of the count data
Nv <- 3    # number of feature covariates

X <- matrix(rep(1, times = Ns), ncol = 1)        # sample-level design: intercept only
V <- matrix(rep(1, times = Nv * Nf), nrow = Nf)  # feature-level design: Nf x Nv
z <- zinbModel(X = X, V = V)
dim(z@V)         # has dimension Nf x Nv
[1] 100   3
dim(z@gamma_mu)  # has dimension Nv x Ns: why not only Nv x 1?
[1]  3 20

The examples of feature covariates given in the paper (e.g. gene length or GC content) do not depend on the sample. Only in rare cases (e.g. adjusting cell-cycle genes for different cell-cycle stages) would one need sample dependence.

What do you think? Making the coefficients gamma_mu and gamma_pi independent of the sample would reduce the total number of parameters and could reduce the need for penalization.
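
To make the parameter saving concrete, here is a minimal base-R sketch using the dimensions from the example. It does not call zinbwave; the names gamma_per_sample and gamma_shared are hypothetical and only illustrate the two shapes being compared, with all entries set to zero as placeholders.

```r
# Dimensions from the example
Ns <- 20    # samples
Nf <- 100   # features
Nv <- 3     # feature covariates
V <- matrix(1, nrow = Nf, ncol = Nv)

# Current default shape: one coefficient column per sample (Nv x Ns)
gamma_per_sample <- matrix(0, nrow = Nv, ncol = Ns)

# Hypothetical alternative: one coefficient vector shared by all samples (Nv x 1)
gamma_shared <- matrix(0, nrow = Nv, ncol = 1)

# Shape of the feature-covariate term in each case
dim(V %*% gamma_per_sample)  # Nf x Ns: the feature effect may differ per sample
dim(V %*% gamma_shared)      # Nf x 1: the same feature effect in every sample

# Coefficient counts for gamma_mu and gamma_pi together
n_per_sample <- 2 * length(gamma_per_sample)  # 2 * Nv * Ns
n_shared     <- 2 * length(gamma_shared)      # 2 * Nv
```

With these numbers, the sample-dependent version has 120 feature-covariate coefficients in total, while the shared version would have 6.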

Best regards, Clemens.