carmonalab / STACAS

R package for semi-supervised single-cell data integration
GNU General Public License v3.0

Seurat Normalization #2

Closed by kvittingseerup 3 years ago

kvittingseerup commented 3 years ago

Thanks for the nice tutorials. In them you use NormalizeData(). Do you have experience with using SCTransform() instead, as suggested in the SCTransform workflow here, which provides "improved pre-processing and normalization"?

mass-a commented 3 years ago

Hello Kristoffer, thanks for the post.

In version 1.0 of STACAS we opted for a standard log-transformation to normalize the data, which is very simple and works well in our hands. It is true that in some cases (e.g. when there is a large heterogeneity in sequencing depth) the SCTransform may be beneficial. We may add the SCT normalization in the next update to the method, but haven't implemented this yet!
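For readers unfamiliar with the log-transformation mentioned above, here is a minimal base-R sketch of what it computes, assuming Seurat's default "LogNormalize" behaviour (each cell's counts divided by that cell's total, multiplied by a scale factor of 10,000, then log1p-transformed). The helper name log_normalize and the toy count matrix are made up for illustration and are not part of Seurat or STACAS.

```r
# Toy count matrix: genes in rows, cells in columns (made-up values).
# cell2 has the same expression profile as cell1 at 10x the sequencing depth.
counts <- matrix(
  c(10, 0, 5,     # cell1
    100, 0, 50,   # cell2
    1, 1, 2),     # cell3
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"),
                  c("cell1", "cell2", "cell3"))
)

# Hypothetical helper mimicking log-normalization:
# per-cell depth scaling followed by log1p.
# Assumes every cell has at least one count (no division by zero).
log_normalize <- function(counts, scale.factor = 10000) {
  depth <- colSums(counts)                            # total counts per cell
  scaled <- sweep(counts, 2, depth, FUN = "/") * scale.factor
  log1p(scaled)                                       # log(1 + x), keeps zeros at zero
}

norm <- log_normalize(counts)
print(round(norm, 3))
```

In this toy case cell1 and cell2 end up with identical normalized values, because the only difference between them is depth. Real data sets with strongly heterogeneous depth are where a model-based approach such as SCTransform may help more.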

kvittingseerup commented 3 years ago

Is STACAS doing normalisation within the functions? I was just referring to the tutorial, where I could simply replace NormalizeData() with SCTransform(). Or will that break something?

mass-a commented 3 years ago

Well, it's a bit more complicated than that.

The SCT requires a slightly different workflow than the regular normalization, e.g. (following the tutorial in the link you provided) you need to run PrepSCTIntegration and then pass on normalization.method = "SCT" to the function FindIntegrationAnchors. We have not implemented this part in our anchor-finding function. It's on our to-do list, though.

mass-a commented 3 years ago

We have added support for SCTransform in the latest update (v1.1.0).

Here's some sample code that can be used to apply SCT prior to integration:

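library(Seurat)
library(STACAS)
# Split the object by sample and normalize each sample independently with SCT
ref.list <- SplitObject(data, split.by = "sample")
ref.list <- lapply(ref.list, FUN = SCTransform, variable.features.n = 800)
# Select shared features and prepare the SCT assays for integration
features.sct <- SelectIntegrationFeatures(ref.list, nfeatures = 500)
ref.list <- PrepSCTIntegration(ref.list, anchor.features = features.sct)
# Run PCA on each sample using the selected features
ref.list <- lapply(ref.list, FUN = RunPCA, features = features.sct)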

Then use STACAS on the SCT assay:

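# Find integration anchors on the SCT-normalized data
anchors.sct <- FindAnchors.STACAS(ref.list, anchor.features = features.sct,
                                  normalization.method = "SCT")
# Filter the anchors, then derive the sample integration order
anchors.sct.filtered <- FilterAnchors.STACAS(anchors.sct)
mySampleTree <- SampleTree.STACAS(anchors.sct.filtered)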

And finally integrate with Seurat, using the filtered anchor set:

ref.integrated.sct <- IntegrateData(anchorset = anchors.sct.filtered, dims = 1:10, k.weight = 50,
                                    normalization.method = "SCT", sample.tree = mySampleTree,
                                    preserve.order = TRUE)