theislab / diffxpy

Differential expression analysis for single-cell RNA-seq data.
https://diffxpy.rtfd.io
BSD 3-Clause "New" or "Revised" License

Is there a way to "constrain" specific constraints to be similar? #197

Closed Hrovatin closed 3 years ago

Hrovatin commented 3 years ago

I have the following problem. Studies (each composed of multiple samples): [image]. Continuous process P: [image]. Distribution of P across samples: [image].

If I fit only ~1+P, I get genes that may be expressed in only a subset of the studies in the high or low region of P. The image shows the top downregulated genes, sorted by lfc and then by padj. All but the 2nd and 9th genes seem to be such examples. [image]

Because studies/samples are confounded with P, I cannot simply use them as covariates. So I was thinking of binning P into, let's say, 10 bins and constraining studies within each bin. However, I will have cells from the same study in different bins. If the same base level were used for the constraints in each bin, the constraint coefficients for the same study should be similar across bins. Is there a way to enforce this?
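The binning step described above could be sketched roughly as follows. This is only an illustration on made-up data: the study names, the range of P (0 to 1), the equal-width bins, and the variable names are all assumptions, and nothing here uses diffxpy itself. It only tabulates how many cells each study contributes to each bin of P, which shows the "same study in different bins" situation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 3 studies, each covering a different but
# overlapping range of the continuous process P (assumed to lie in [0, 1)).
study = np.repeat(["study_a", "study_b", "study_c"], 200)
p = np.concatenate([
    rng.uniform(0.0, 0.6, 200),
    rng.uniform(0.2, 0.8, 200),
    rng.uniform(0.4, 1.0, 200),
])

# Split P into 10 equal-width bins; digitizing against the interior
# edges yields bin indices 0..n_bins-1.
n_bins = 10
edges = np.linspace(0.0, 1.0, n_bins + 1)
p_bin = np.digitize(p, edges[1:-1])

# Count cells per (study, bin) combination.
studies = np.unique(study)
counts = np.zeros((len(studies), n_bins), dtype=int)
for i, s in enumerate(studies):
    counts[i] = np.bincount(p_bin[study == s], minlength=n_bins)

# Number of bins in which each study has at least one cell.
bins_per_study = (counts > 0).sum(axis=1)

print(counts)
print(dict(zip(studies, bins_per_study)))
```

With real data, quantile-based bins might be preferable when P is unevenly distributed across cells; that choice is not specified in the question.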

Hrovatin commented 3 years ago

Never mind, I have just realised that setting up constraints in this way produces a design matrix that is not full rank.
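For anyone wanting to check this kind of thing, here is a minimal sketch of a rank check on a toy design. It is not the design from this issue and does not reproduce how diffxpy handles constraints: it assumes 3 studies and 2 bins, every study present in every bin, and an unconstrained design with an intercept, a bin effect, and one indicator per (study, bin) combination. All column names are hypothetical.

```python
import itertools
import numpy as np

n_studies, n_bins = 3, 2

# One row per (study, bin) combination is enough to determine the column
# rank; repeating rows for more cells would not change it.
cells = list(itertools.product(range(n_studies), range(n_bins)))

columns = {}
columns["intercept"] = [1.0 for _ in cells]
# Bin effect, with bin 0 as the reference level.
for b in range(1, n_bins):
    columns[f"bin{b}"] = [float(bb == b) for _, bb in cells]
# One indicator per study-within-bin combination, no level dropped.
for s, b in itertools.product(range(n_studies), range(n_bins)):
    columns[f"study{s}:bin{b}"] = [float((ss, bb) == (s, b)) for ss, bb in cells]

X = np.column_stack(list(columns.values()))
rank = np.linalg.matrix_rank(X)

print("columns:", X.shape[1])
print("rank:", rank)
print("full column rank:", rank == X.shape[1])
```

The difference between the number of columns and the rank is the number of independent linear constraints on the coefficients that would be needed to make this toy model identifiable. Whether a particular set of constraints achieves that depends on how it is specified.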

Also, doing what I proposed would likely require optimising a Pareto front over coef_study_bin, both within each bin across studies and within each study across bins. (Not sure about this.)