satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

An issue in generating celltype.predictions using TransferData function #3934

Closed hwu34 closed 3 years ago

hwu34 commented 3 years ago

Hi,

First of all, thank you for developing this nice package. I am new to single-cell data analysis, and I have learned a lot by using Seurat.

I am doing an integration analysis of scATAC-seq and scRNA-seq data, following this tutorial: https://satijalab.org/seurat/v3.2/atacseq_integration_vignette.html

When I use the pbmc data from the tutorial, everything works fine. When I use my own datasets, I get an error when running the TransferData() function.

Here are my code and the error:

atac4C.seu <- CreateSeuratObject(counts = Pmat_sp4C, assay = "ATAC", project = "sanpATAC")
atac4C.seu[["ACTIVITY"]] <- CreateAssayObject(counts = Gmat_sp4C)

DefaultAssay(atac4C.seu) <- "ACTIVITY"
atac4C.seu <- FindVariableFeatures(atac4C.seu)
atac4C.seu <- NormalizeData(atac4C.seu)
atac4C.seu <- ScaleData(atac4C.seu)

DefaultAssay(atac4C.seu) <- "ATAC"
VariableFeatures(atac4C.seu) <- names(which(Matrix::rowSums(atac4C.seu) > 100))
atac4C.seu <- RunLSI(atac4C.seu, n = 50, scale.max = NULL)
atac4C.seu <- RunUMAP(atac4C.seu, reduction = "lsi", dims = 1:20)

transfer.anchors <- FindTransferAnchors(reference = Hep_202011.combined, query = atac4C.seu,
  features = VariableFeatures(object = Hep_202011.combined),
  reference.assay = "RNA", query.assay = "ACTIVITY", reduction = "cca")
Running CCA
Merging objects
Finding neighborhoods
Finding anchors
Found 7571 anchors
Filtering anchors
Retained 1055 anchors
Warning message:
In RunCCA.Seurat(object1 = reference, object2 = query, features = features, :
  Running CCA on different assays

celltype.predictions <- TransferData(anchorset = transfer.anchors,
  refdata = Hep_202011.combined$seurat_clusters,
  weight.reduction = atac4C.seu[["lsi"]])

Finding integration vectors
Finding integration vector weights
Error in nn2(data = c(-0.730157140737401, -2.70117139950626, -0.922065225244263, :
  NA/NaN/Inf in foreign function call (arg 2)

When I ran traceback(), I did find "NA" values in the input to do.call(what = "nn2", args = args), but I cannot figure out how these values are calculated. Could you please help me solve the problem? Thanks in advance!

timoast commented 3 years ago

Do you have NA, NaN, or Inf values in any of the input data matrices?
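One way to check this is sketched below. It is a minimal sketch rather than anything from Seurat: `count_nonfinite` is a hypothetical helper name, and the two toy matrices are made-up stand-ins for the real inputs. It uses only base R and the Matrix package.

```r
library(Matrix)

# Hypothetical helper (not a Seurat function): count NA, NaN and Inf entries
# in a dense matrix or a numeric sparse matrix.
count_nonfinite <- function(mat) {
  # In a numeric sparse matrix only the stored values (slot `x`) can be
  # non-finite, so there is no need to densify it.
  vals <- if (methods::is(mat, "dsparseMatrix")) mat@x else as.vector(as.matrix(mat))
  c(
    na  = sum(is.na(vals) & !is.nan(vals)),  # is.na() is also TRUE for NaN
    nan = sum(is.nan(vals)),
    inf = sum(is.infinite(vals))
  )
}

# Toy inputs for illustration only
sparse_example <- sparseMatrix(
  i = c(1, 2, 1), j = c(1, 2, 3), x = c(1, NA, Inf), dims = c(2, 3)
)
dense_example <- matrix(c(1, NaN, 0, 2), nrow = 2)

print(count_nonfinite(sparse_example))
print(count_nonfinite(dense_example))
```

On the objects in this thread, the same helper could be pointed at the matrices that feed the failing step, for example `Gmat_sp4C`, `GetAssayData(atac4C.seu, assay = "ACTIVITY", slot = "data")`, or the LSI embeddings from `Embeddings(atac4C.seu[["lsi"]])`. Which of these actually carries the bad values is not known from the thread.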

hwu34 commented 3 years ago

Thank you for the quick reply. That could be the reason, though I haven't checked the input data yet. I found that I had used the wrong scATAC gene-peak matrix to generate the "ACTIVITY" assay, so I reran the whole script with the correct matrix, and the error no longer occurs.