satijalab / sctransform

R package for modeling single cell UMI expression data using regularized negative binomial regression
GNU General Public License v3.0

sctransform in the context of an Integration workflow #3

Closed: roosheelpatel closed this issue 5 years ago

roosheelpatel commented 5 years ago

Hi Christoph, I was wondering if you could provide some insight into how to apply the sctransform method when integrating multiple batches. I tried it in my current integration workflow, where I apply sctransform twice: once before IntegrateData, to find the variable genes to integrate on, and again afterwards to perform the scaling.

Upon visualizing the results of this analysis, I noticed that my clusters are clearly separated by batch (i.e., the same cell type is split by batch).

Was my intuition/workflow incorrect? If so, what are the correct decisions to make with sctransform in the context of an integration workflow?

Thanks!

roosheelpatel commented 5 years ago

Ahh, I figured it out by digging into the function.

For those who run into the same issue, the SCTransform command contains an argument called 'batch_var' that you can set.

Thanks for developing the tool. I'm excited to test it out!

satijalab commented 5 years ago

Please see issue #4 for a more detailed discussion on how to incorporate sctransform into a Seurat v3 integration analysis. Thanks!

ChristophH commented 5 years ago

Also, note that using the batch indicator variable in sctransform::vst does not replace an integration analysis as implemented in Seurat. batch_var can be used if you are working with technical or biological replicates of the same system, where global trends shift genes and batches contain roughly the same cell populations. In other cases, we suggest you use the integration methods of Seurat v3.

We are in the process of putting together a vignette for how to combine sctransform with Seurat v3 integration. See #4
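
For readers landing on this thread, here is a minimal sketch of the batch_var usage discussed above, calling sctransform::vst directly. The wrapper name `vst_with_batch` and its inputs `umi` (a genes x cells UMI count matrix) and `batch` (one label per cell) are hypothetical names chosen for illustration. The sketch also assumes that vst computes the default `log_umi` latent variable itself when it is missing from `cell_attr`. As noted above, this only suits replicates of the same system and is not a substitute for Seurat v3 integration.

```r
# Sketch only: `umi` and `batch` are hypothetical inputs supplied by the caller.
# umi:   genes x cells UMI count matrix, with cell names as column names
# batch: one batch/replicate label per cell, in the same order as the columns of umi
vst_with_batch <- function(umi, batch) {
  # Fail early if the labels do not line up with the cells
  stopifnot(length(batch) == ncol(umi))

  # vst looks up batch_var as a column of cell_attr (one row per cell)
  cell_attr <- data.frame(batch = factor(batch), row.names = colnames(umi))

  sctransform::vst(
    umi,
    cell_attr = cell_attr,
    latent_var = "log_umi",  # vst's default latent variable (assumed to be filled in by vst)
    batch_var = "batch"      # name of the batch column in cell_attr
  )
}
```

A call would look like `vst_out <- vst_with_batch(umi, batch)`. The batch labels are passed as a column of `cell_attr` because `batch_var` names a column there rather than taking the labels directly.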