satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

Project "bad" samples to good sample derived wnn_UMAP. #8310

Closed mainyanghr closed 8 months ago

mainyanghr commented 8 months ago

Dear everyone,

I am working with CITE-seq data from 13 different experiments. After integrating all 13 samples with Harmony, I ran WNN with different combinations of dimensions, e.g. dims.list = list(1:45, 1:25), but two samples still do not integrate correctly in the WNN UMAP.

In the gene-expression-based UMAP, all samples integrate well, with no visible batch effects.

I believe the ADT signals from these two samples are very noisy and distort the WNN_UMAP. Is there a way to remove the noisy signal from these two samples, or to further reduce their weight in the WNN calculation?

My idea is to project the bad samples onto the good samples. This is the strategy I am currently using; could you please check whether the code is correct?

```r
# run is reference sample with good ADT staining.
run <- RunPCA(run, dims = 1:30, reduction = "harmony", reduction.name = "harmony_UMAP")
run <- RunUMAP(run, dims = 1:30, reduction = "harmony", reduction.name = "harmony_UMAP")
run <- FindNeighbors(run, dims = 1:30, reduction = "harmony")
run <- FindClusters(run, resolution = 1, verbose = FALSE)

run <- FindMultiModalNeighbors(run, reduction.list = list("pca", "apca"), dims.list = list(1:20, 1:30), modality.weight.name = "RNA.weight")
run <- RunUMAP(run, nn.name = "weighted.nn", reduction.name = "wnn.umap", reduction.key = "wnnUMAP_")
run <- FindClusters(run, graph.name = "wsnn", algorithm = 3, resolution = 1, verbose = FALSE)

# E12 is another sample with bad ADT staining and I want to integrate E12 into run.
E12 <- RunPCA(E12, dims = 1:30, reduction = "harmony", reduction.name = "harmony_UMAP")
E12 <- RunUMAP(E12, dims = 1:30, reduction = "harmony", reduction.name = "harmony_UMAP")
E12 <- FindNeighbors(E12, dims = 1:30, reduction = "harmony")
E12 <- FindClusters(E12, resolution = 1, verbose = FALSE)
```

Then I want to integrate using MapQuery:

```r
anchors <- FindTransferAnchors(
  reference = run,
  normalization.method = "SCT",
  reference.reduction = "pca",
  query = E12
)

E12 <- MapQuery(
  anchorset = anchors,
  reference = run,
  query = E12,
  refdata = list(run = "seurat_clusters"),
  reference.reduction = "pca",
  reduction.model = "wnn.umap"
)
```

It did not work at the MapQuery step. Could you help me fix it?

I am also wondering how we should take the WNN_UMAP reduction into account when combining the "bad" E12 sample with the "good" run samples.

Thank you all.

Best,

igrabski commented 8 months ago

Hi, can you please clarify what you mean when you say it didn't work during the MapQuery step? Do you receive an error and if so, what is the error message? Or are the results different from what you would expect, and how do they look?

Regarding your second question, the interpretation will depend a bit on exactly how noisy/bad the ADT data is for those outlier samples. If they are just a little too noisy to use for determining the shared reduction, but still contain enough coherent information, then they may map appropriately to the WNN_UMAP. Conversely, if they are very noisy and contain almost no usable information, then it would be difficult to interpret their mapping.
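One rough way to gauge how much the ADT data contributes in each sample is to compare the per-cell modality weights that WNN stores in the cell metadata. The following is a minimal base-R sketch, not part of Seurat: the helper `summarise_weight_by_sample` is hypothetical, and the column names `RNA.weight` and `orig.ident` are assumptions based on the call in this thread (the actual weight column name can depend on the Seurat version and assay names, so check `colnames(obj@meta.data)` first). The toy numbers are made up for illustration.

```r
# Hypothetical helper: median of a per-cell modality weight within each sample.
# `meta` is a cell-level metadata data frame, e.g. obj@meta.data from a Seurat object.
summarise_weight_by_sample <- function(meta,
                                       weight_col = "RNA.weight",
                                       sample_col = "orig.ident") {
  stopifnot(weight_col %in% colnames(meta), sample_col %in% colnames(meta))
  # tapply groups the weights by sample and takes the median of each group
  med <- tapply(meta[[weight_col]], meta[[sample_col]], median)
  data.frame(sample = names(med), median_weight = as.numeric(med), row.names = NULL)
}

# Toy metadata with made-up weights, for illustration only
toy <- data.frame(
  orig.ident = c("run", "run", "run", "E12", "E12", "E12"),
  RNA.weight = c(0.45, 0.50, 0.55, 0.85, 0.90, 0.95)
)
summarise_weight_by_sample(toy)
```

A sample whose cells lean almost entirely on the RNA weight may be one where the ADT data adds little, although this is only a proxy and should be read alongside the ADT-only embedding for that sample.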

mainyanghr commented 8 months ago

Thank you for your reply.

Background: "run" contains the good samples; "E12" contains the bad samples with noisy ADT information.

Aim: project the bad sample "E12" onto the good samples "run".

Strategy: I first built the wnn_UMAP from the "run" samples, and all the sub-samples integrated well. I then tried to use MapQuery to project/integrate the bad E12 samples into the good "run" samples, but I do not know how to do that. Could you provide some code? Maybe my strategy below is entirely wrong.

I have been trying it this way.

Find the anchors:

```r
DefaultAssay(run) <- 'SCT'

anchors <- FindTransferAnchors(
  reference = run,
  normalization.method = "SCT",
  reference.reduction = "pca",
  query = E12
)
```

Integrate the bad sample E12 into the good one, run:

```r
E12 <- MapQuery(
  anchorset = anchors,
  reference = run,
  query = E12,
  refdata = list(run = "seurat_clusters"),
  reference.reduction = "pca",
  reduction.model = "wnn.umap"
)
```

Then I received this error:

```
Finding integration vectors
Finding integration vector weights
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
**|
Predicting cell labels
| | 0 % ~calculating
Integrating dataset 2 with reference dataset
Finding integration vectors
Integrating data
|++++++++++++++++++++++++++++++++++++++++++++++++++| 100% elapsed=01s
Error: The provided reduction.model does not have a model stored. Please try running umot-learn on the object first
```

igrabski commented 8 months ago

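The error means that the wnn.umap reduction was computed without storing its UMAP model, so MapQuery has nothing to project onto. In the RunUMAP step that creates wnn.umap, you should add return.model = TRUE, i.e.:

```r
run <- RunUMAP(run, nn.name = "weighted.nn", reduction.name = "wnn.umap", reduction.key = "wnnUMAP_", return.model = TRUE)
```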

I will close the issue for now, but if this doesn't fix your problem, please re-open the issue!
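
For reference, the calls from this thread can be put together with the fix applied. This is a minimal sketch, not an official recipe. The wrapper `project_onto_wnn_umap` is a hypothetical name introduced here. It assumes Seurat is installed, that `reference` already has SCT-normalized data and the "pca" and "apca" reductions used above, and that `query` has been prepared the same way. Functions are called with the `Seurat::` prefix so the definition itself does not need the package attached.

```r
# Hypothetical wrapper around the calls used in this thread.
# Assumes `reference` has "pca" and "apca" reductions and SCT-normalized data.
project_onto_wnn_umap <- function(reference, query) {
  # Build the WNN graph on the reference (settings copied from the thread)
  reference <- Seurat::FindMultiModalNeighbors(
    reference,
    reduction.list = list("pca", "apca"),
    dims.list = list(1:20, 1:30),
    modality.weight.name = "RNA.weight"
  )

  # return.model = TRUE stores the fitted UMAP model in the reduction,
  # which MapQuery needs when reduction.model = "wnn.umap"
  reference <- Seurat::RunUMAP(
    reference,
    nn.name = "weighted.nn",
    reduction.name = "wnn.umap",
    reduction.key = "wnnUMAP_",
    return.model = TRUE
  )

  # Find anchors between reference and query in the reference PCA space
  anchors <- Seurat::FindTransferAnchors(
    reference = reference,
    query = query,
    normalization.method = "SCT",
    reference.reduction = "pca"
  )

  # Transfer cluster labels and project the query onto the reference WNN UMAP
  query <- Seurat::MapQuery(
    anchorset = anchors,
    reference = reference,
    query = query,
    refdata = list(run = "seurat_clusters"),
    reference.reduction = "pca",
    reduction.model = "wnn.umap"
  )

  list(reference = reference, query = query)
}
```

It would be called as `res <- project_onto_wnn_umap(run, E12)`. In the Seurat reference-mapping vignette, the projected coordinates are stored in the query under the `ref.umap` reduction, so `DimPlot(res$query, reduction = "ref.umap")` is one way to inspect the result.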