satijalab / sctransform

R package for modeling single cell UMI expression data using regularized negative binomial regression
GNU General Public License v3.0

Inclusion of variable not to regress out #38

Open kvittingseerup opened 5 years ago

kvittingseerup commented 5 years ago

Thanks for a very interesting paper and a very useful tool. Is there a way to include factors that should not be regressed out (e.g. as in limma::removeBatchEffect())? For example, I have data from 20 patients in two treatment groups, and I would like to regress out the effect of the individual (batch) but not the effect of the treatment. If there is no way of specifying the treatment, would sctransform not also remove (some of) the treatment effect?

ChristophH commented 4 years ago

Hi, currently there is no such parameter in the model, but it is an interesting concept that would be useful. I have to think about how we could implement this in our regularization framework.

In your example, regressing out batch would indeed also remove the treatment effect. However, if batch effects are random and you have enough batches per treatment (10 in this example), you might not have to correct for batch. An alternative approach would be to run an integration of all batches as outlined in this Seurat vignette.
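To illustrate why regressing out batch also removes the treatment effect when every patient belongs to exactly one treatment group, here is a minimal, language-agnostic sketch in plain Python. It is not sctransform's model: it assumes ordinary least squares on a patient indicator (which is the same as subtracting each patient's mean) rather than regularized negative binomial regression, and the toy values and the helper `group_means` are made up for illustration.

```python
from statistics import mean

# Hypothetical toy data for a single gene: four patients, two cells each.
# Patients p1 and p2 received treatment "A"; p3 and p4 received treatment "B",
# so patient (batch) is fully nested within treatment.
cells = [
    # (patient, treatment, expression)
    ("p1", "A", 1.0), ("p1", "A", 1.4),
    ("p2", "A", 0.8), ("p2", "A", 1.2),
    ("p3", "B", 3.1), ("p3", "B", 2.7),
    ("p4", "B", 3.4), ("p4", "B", 3.0),
]


def group_means(rows, key_index):
    """Mean expression (third field) per level of the field at key_index."""
    groups = {}
    for row in rows:
        groups.setdefault(row[key_index], []).append(row[2])
    return {level: mean(values) for level, values in groups.items()}


# Least-squares regression on a patient indicator alone leaves residuals
# equal to each value minus its patient's mean.
patient_mean = group_means(cells, 0)
corrected = [(p, t, y - patient_mean[p]) for p, t, y in cells]

# Compare the treatment effect (B minus A) before and after the correction.
before = group_means(cells, 1)
after = group_means(corrected, 1)
effect_before = before["B"] - before["A"]
effect_after = after["B"] - after["A"]

print(f"treatment effect before correction: {effect_before:.2f}")
print(f"treatment effect after correction:  {effect_after:.2f}")
```

Because every patient's residuals average to zero, both treatment groups also average to zero after the correction, and the difference between them is gone. A model that should keep the treatment effect would need the treatment term to be estimated but not subtracted, which is the kind of parameter asked about above.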

kvittingseerup commented 4 years ago

It would also be useful for cases where the cell-type distribution between samples is very skewed.

Unfortunately, I have just experienced Seurat::SCTransform() removing a lot of my signal of interest, even though I had 10+ samples in each group.

Is the expression matrix generated after running Seurat::IntegrateData() (+ Seurat::NormalizeData()?) usable for subsequent gene-level analysis? I thought it was only usable for sample-level analysis.