satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Batch effect for CITE-seq #8240

Closed mainyanghr closed 8 months ago

mainyanghr commented 8 months ago
[Image: Screenshot 2023-12-28 at 23 49 30]

Dear everyone,

I am working with CITE-seq data from 13 different experiments. After integrating all 13 samples with Harmony, I ran WNN with different combinations of dimensions (e.g. dims.list = list(1:45, 1:25)), but I am still not able to integrate 2 of the samples correctly according to the WNN UMAP.

If I look at the gene-expression-based UMAP, all the samples are well integrated, with no visible batch effects.

I believe the ADT signals from these two samples are very noisy and are distorting the WNN UMAP. Is there a way to remove the noisy signal from these two samples? Or is there a way to further reduce their weight in the WNN computation?

Thank you all.

Best,

igrabski commented 8 months ago

Hi, if you think the ADT data for two samples is noisy due to technical factors, then it may be difficult to overcome this issue in integration. You could experiment with adjusting the ADT PC dimensions used in the WNN computations, or you could also try adjusting the relative degree of influence between RNA and ADT, as in @3785. However, these will have broader implications on how RNA and ADT information are used for computing the joint WNN reduction. An alternative approach would be to just compute WNN using the samples whose ADT you trust (e.g. excluding those two samples) and then project the remaining samples in.
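A rough sketch of two of the options above, offered as an illustration rather than a tested recipe. The function names, the `sample` metadata column, and the `bad_samples` ids are hypothetical; the reduction names "pca" and "apca" are assumed to already exist on the object. Calls are namespace-qualified so the helpers can be defined before Seurat is attached.

```r
# Sketch only: helper names, `sample_col`, and `bad_samples` are hypothetical.

# Option 1: rebuild the WNN graph using fewer ADT PCs, then recompute the UMAP.
rerun_wnn <- function(obj, rna_dims = 1:30, adt_dims = 1:10) {
  obj <- Seurat::FindMultiModalNeighbors(
    obj,
    reduction.list = list("pca", "apca"),
    dims.list = list(rna_dims, adt_dims)
  )
  Seurat::RunUMAP(
    obj,
    nn.name = "weighted.nn",
    reduction.name = "wnn.umap",
    reduction.key = "wnnUMAP_"
  )
}

# Option 2: split the object into samples with trusted ADT data (reference)
# and samples with noisy ADT data (query) for later projection.
split_by_adt_quality <- function(obj, bad_samples, sample_col = "sample") {
  is_bad <- obj@meta.data[[sample_col]] %in% bad_samples
  cells <- colnames(obj)
  list(
    reference = subset(obj, cells = cells[!is_bad]),
    query = subset(obj, cells = cells[is_bad])
  )
}
```

After splitting, the reductions and the WNN graph would normally be recomputed on the reference subset before any query is projected onto it.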

mainyanghr commented 8 months ago

Thank you very much for the helpful suggestions. I think projecting the remaining samples in is a good suggestion, and I am currently trying this strategy. Could you please check whether my code is correct?

```r
# run is reference sample with good ADT staining.
run <- RunPCA(run, dims = 1:30, reduction = "harmony", reduction.name = "harmony_UMAP")
run <- RunUMAP(run, dims = 1:30, reduction = "harmony", reduction.name = "harmony_UMAP")
run <- FindNeighbors(run, dims = 1:30, reduction = "harmony")
run <- FindClusters(run, resolution = 1, verbose = FALSE)

run <- FindMultiModalNeighbors(run, reduction.list = list("pca", "apca"), dims.list = list(1:20, 1:30), modality.weight.name = "RNA.weight")
run <- RunUMAP(run, nn.name = "weighted.nn", reduction.name = "wnn.umap", reduction.key = "wnnUMAP_")
run <- FindClusters(run, graph.name = "wsnn", algorithm = 3, resolution = 1, verbose = FALSE)

# E12 is another sample with bad ADT staining and I want to integrate E12 into run.
E12 <- RunPCA(E12, dims = 1:30, reduction = "harmony", reduction.name = "harmony_UMAP")
E12 <- RunUMAP(E12, dims = 1:30, reduction = "harmony", reduction.name = "harmony_UMAP")
E12 <- FindNeighbors(E12, dims = 1:30, reduction = "harmony")
E12 <- FindClusters(E12, resolution = 1, verbose = FALSE)
```

Then I want to integrate using MapQuery:

```r
anchors <- FindTransferAnchors(
  reference = run,
  normalization.method = "SCT",
  reference.reduction = "pca",
  query = E12
)

E12 <- MapQuery(
  anchorset = anchors,
  reference = run,
  query = E12,
  refdata = list(run = "seurat_clusters"),
  reference.reduction = "pca",
  reduction.model = "wnn.umap"
)
```

It failed at the MapQuery step. Could you help me fix it?

I am also wondering how we should take the WNN UMAP reduction into account when combining the "bad" E12 sample with the "good" run samples.
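The thread does not show the actual error, so the following is only a sketch of one possible fix. It assumes the failure comes from the reference UMAP having been computed without a stored model. `MapQuery` with `reduction.model` needs a UMAP created with `return.model = TRUE`. The helper names are hypothetical, and the sketch assumes both objects are SCT-normalized, that `run` has a "pca" reduction and a "weighted.nn" neighbor object, and that "seurat_clusters" is the label to transfer. Seurat's multimodal reference mapping vignette uses a supervised PCA (`RunSPCA`) as the reference reduction; "pca" is kept here only to match the code in this thread.

```r
# Sketch only: helper names are hypothetical; argument values follow the thread.

# Recompute the WNN UMAP on the reference and keep the fitted model
# so that query cells can be projected into it later.
prepare_wnn_reference <- function(reference) {
  Seurat::RunUMAP(
    reference,
    nn.name = "weighted.nn",
    reduction.name = "wnn.umap",
    reduction.key = "wnnUMAP_",
    return.model = TRUE
  )
}

# Find RNA-based anchors and project the query onto the reference WNN UMAP.
map_query_to_wnn <- function(reference, query, dims = 1:30,
                             normalization_method = "SCT") {
  anchors <- Seurat::FindTransferAnchors(
    reference = reference,
    query = query,
    normalization.method = normalization_method,
    reference.reduction = "pca",
    dims = dims
  )
  Seurat::MapQuery(
    anchorset = anchors,
    query = query,
    reference = reference,
    refdata = list(run = "seurat_clusters"),
    reference.reduction = "pca",
    reduction.model = "wnn.umap"
  )
}
```

Usage would be along the lines of `run <- prepare_wnn_reference(run)` followed by `E12 <- map_query_to_wnn(run, E12)`. Because the anchors are computed from RNA only, the noisy ADT data of the query does not influence where its cells land on the reference WNN UMAP.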