satijalab / azimuth

A Shiny web app for mapping datasets using Seurat v4
https://satijalab.org/azimuth
GNU General Public License v3.0

Error message in RunUMAP when I run it locally #96

Closed qicheng-ma closed 2 years ago

qicheng-ma commented 2 years ago

Dear Admin,

I get an error message from RunUMAP when I run the Azimuth analysis script locally. Please help.

"Error in check_graph(graph, n_vertices, n_neighbors) : ncol(idx) == expected_cols is not TRUE Calls: RunUMAP ... RunUMAP.default -> -> check_graph -> stopifnot Execution halted"


ARGUMENT 'Seurat-Azimuth/Seurat-pbmc3k/pbmc_10k_v3_filtered_feature_bc_matrix.h5' ignored

R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details.

Natural language support but running in an English locale

R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R.

#!/usr/bin/env Rscript

args <- commandArgs()

# Ensure Seurat v4.0 or higher is installed
if (packageVersion(pkg = "Seurat") < package_version(x = "4.0.0")) {
  stop("Mapping datasets requires Seurat v4 or higher.", call. = FALSE)
}

# Ensure glmGamPoi is installed (install BiocManager first if it is missing)
if (!requireNamespace("glmGamPoi", quietly = TRUE)) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install("glmGamPoi")
}

# Ensure Azimuth is installed
if (packageVersion(pkg = "Azimuth") < package_version(x = "0.3.1")) {
  stop("Please install azimuth - remotes::install_github('satijalab/azimuth')", call. = FALSE)
}

library(Seurat)
Attaching SeuratObject
library(SeuratDisk)
Registered S3 method overwritten by 'cli':
  method     from
  print.boxx spatstat.geom
Registered S3 method overwritten by 'SeuratDisk':
  method            from
  as.sparse.H5Group Seurat
library(Azimuth)
Attaching shinyBS

# Download the Azimuth reference and extract the archive

# Load the reference
# Change the file path based on where the reference is located on your system.
reference <- LoadReference(path = "https://seurat.nygenome.org/azimuth/references/v1.0.0/human_pbmc")

# Load the query object for mapping
# Change the file path based on where the query file is located on your system.
# query <- LoadFileInput(path = "character(0)")
query <- LoadFileInput(args[3])
Warning message:
In sparseMatrix(i = indices[] + 1, p = indptr[], x = as.numeric(x = counts[]),  :
  'giveCsparse' has been deprecated; setting 'repr = "T"' for you

# Calculate nCount_RNA and nFeature_RNA if the query does not
# contain them already
if (!all(c("nCount_RNA", "nFeature_RNA") %in% c(colnames(x = query[[]])))) {
  calcn <- as.data.frame(x = Seurat:::CalcN(object = query))
  colnames(x = calcn) <- paste(
    colnames(x = calcn),
    "RNA",
    sep = '_'
  )
  query <- AddMetaData(
    object = query,
    metadata = calcn
  )
  rm(calcn)
}

# Calculate percent mitochondrial genes if the query contains genes
# matching the regular expression "^MT-"
if (any(grepl(pattern = '^MT-', x = rownames(x = query)))) {
  query <- PercentageFeatureSet(
    object = query,
    pattern = '^MT-',
    col.name = 'percent.mt',
    assay = "RNA"
  )
}

# Filter cells based on the thresholds for nCount_RNA and nFeature_RNA
# you set in the app
cells.use <- query[["nCount_RNA", drop = TRUE]] <= 79534 &
  query[["nCount_RNA", drop = TRUE]] >= 501 &
  query[["nFeature_RNA", drop = TRUE]] <= 7211 &
  query[["nFeature_RNA", drop = TRUE]] >= 54

# If the query contains mitochondrial genes, filter cells based on the
# thresholds for percent.mt you set in the app
if ("percent.mt" %in% c(colnames(x = query[[]]))) {
  cells.use <- cells.use & (query[["percent.mt", drop = TRUE]] <= 97 &
    query[["percent.mt", drop = TRUE]] >= 0)
}

# Remove filtered cells from the query
query <- query[, cells.use]

# Preprocess with SCTransform
query <- SCTransform(
  object = query,
  assay = "RNA",
  new.assay.name = "refAssay",
  residual.features = rownames(x = reference$map),
  reference.SCT.model = reference$map[["refAssay"]]@SCTModel.list$refmodel,
  method = 'glmGamPoi',
  ncells = 2000,
  n_genes = 2000,
  do.correct.umi = FALSE,
  do.scale = FALSE,
  do.center = TRUE
)
Using reference SCTModel to calculate pearson residuals
Determine variable features
Calculating residuals of type pearson for 4999 genes
  |======================================================================| 100%
  |======================================================================| 100%
Set default assay to refAssay

# Find anchors between query and reference
anchors <- FindTransferAnchors(
  reference = reference$map,
  query = query,
  k.filter = NA,
  reference.neighbors = "refdr.annoy.neighbors",
  reference.assay = "refAssay",
  query.assay = "refAssay",
  reference.reduction = "refDR",
  normalization.method = "SCT",
  features = intersect(rownames(x = reference$map), VariableFeatures(object = query)),
  dims = 1:50,
  n.trees = 20,
  mapping.score.k = 100
)
Normalizing query using reference SCT model
Projecting cell embeddings
Finding query neighbors
Finding neighborhoods
Finding anchors
Found 11291 anchors

# Transfer cell type labels and impute protein expression
#
# Transferred labels are in metadata columns named "predicted.*"
# The maximum prediction score is in a metadata column named "predicted.*.score"
# The prediction scores for each class are in an assay named "prediction.score.*"
# The imputed assay is named "impADT" if computed
refdata <- lapply(X = "celltype.l2", function(x) {
  reference$map[[x, drop = TRUE]]
})
names(x = refdata) <- "celltype.l2"
if (TRUE) {
  refdata[["impADT"]] <- GetAssayData(
    object = reference$map[['ADT']],
    slot = 'data'
  )
}
query <- TransferData(
  reference = reference$map,
  query = query,
  dims = 1:50,
  anchorset = anchors,
  refdata = refdata,
  n.trees = 20,
  store.weights = TRUE
)
Finding integration vectors
Finding integration vector weights
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Predicting cell labels
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Warning: Keys should be one or more alphanumeric characters followed by an underscore, setting key from predictionscorecelltype.l2_ to predictionscorecelltypel2_
Transfering 228 features onto reference data

# Calculate the embeddings of the query data on the reference SPCA
query <- IntegrateEmbeddings(
  anchorset = anchors,
  reference = reference$map,
  query = query,
  reductions = "pcaproject",
  reuse.weights.matrix = TRUE
)
Integrating dataset 2 with reference dataset
Finding integration vectors
Integrating data

# Calculate the query neighbors in the reference
# with respect to the integrated embeddings
query[["query_ref.nn"]] <- FindNeighbors(
  object = Embeddings(reference$map[["refDR"]]),
  query = Embeddings(query[["integrated_dr"]]),
  return.neighbor = TRUE,
  l2.norm = TRUE
)
Computing nearest neighbors

# The reference used in the app is downsampled compared to the reference on which
# the UMAP model was computed. This step, using the helper function NNTransform,
# corrects the Neighbors to account for the downsampling.
query <- Azimuth:::NNTransform(
  object = query,
  meta.data = reference$map[[]]
)

# Project the query to the reference UMAP.
query[["proj.umap"]] <- RunUMAP(
  object = query[["query_ref.nn"]],
  reduction.model = reference$map[["refUMAP"]],
  reduction.key = 'UMAP_'
)
Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using the cosine metric
To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation'
This message will be shown once per session
Running UMAP projection
Error in check_graph(graph, n_vertices, n_neighbors) :
  ncol(idx) == expected_cols is not TRUE
Calls: RunUMAP ... RunUMAP.default -> <Anonymous> -> check_graph -> stopifnot
Execution halted

qicheng-ma commented 2 years ago

Solved.

kirstvh commented 2 years ago

Hi, I'm having the same errors, how did you solve this? Thank you!