satijalab / azimuth

A Shiny web app for mapping datasets using Seurat v4
https://satijalab.org/azimuth
GNU General Public License v3.0
114 stars 32 forks source link

azimuth_analysis.R code for human-kidney error message #104

Closed qicheng-ma closed 2 years ago

qicheng-ma commented 2 years ago

Dear Azimuth team,

I go the following error message from azimuth_analysis.R code which I downloaded from https://app.azimuth.hubmapconsortium.org/app/human-kidney

Even though with " features = intersect(rownames(x = reference$map), VariableFeatures(object = query)),"

"Error in index$getNNsByVectorList(query[x, ], k, search.k, include.distance) : fv.size() != vector_size Calls: AddMetaData ... resolve.list -> signalConditionsASAP -> signalConditions Execution halted"

But I can run it though https://app.azimuth.hubmapconsortium.org/app/human-kidney, that is how I got the azimuth_analysis.R code.

I appreciate your help!

if (packageVersion(pkg = "Seurat") < package_version(x = "4.0.0")) {

  • stop("Mapping datasets requires Seurat v4 or higher.", call. = FALSE)
  • }

Ensure glmGamPoi is installed

if (!requireNamespace("glmGamPoi", quietly = TRUE)) {

  • BiocManager::install("glmGamPoi")
  • }
  • }

Ensure Azimuth is installed

if (packageVersion(pkg = "Azimuth") < package_version(x = "0.4.0")) {

  • stop("Please install azimuth - remotes::install_github('satijalab/azimuth')", call. = FALSE)
  • }

library(Seurat) Attaching SeuratObject library(Azimuth) Registered S3 method overwritten by 'SeuratDisk': method from as.sparse.H5Group Seurat Attaching shinyBS

Download the Azimuth reference and extract the archive

Load the reference

Change the file path based on where the reference is located on your system.

reference <- LoadReference(path = "/reference-data/human_kidney/")

reference <- LoadReference(path = "https://seurat.nygenome.org/azimuth/references/v1.0.0/human_pbmc")

reference <- LoadReference(path = "https://seurat.nygenome.org/azimuth/references/v1.0.0/human_kidney")

Load the query object for mapping

Change the file path based on where the query file is located on your system.

query <- LoadFileInput(path = "character(0)")

query <- LoadFileInput(path = args[3]) query <- ConvertGeneNames(

Calculate nCount_RNA and nFeature_RNA if the query does not

contain them already

if (!all(c("nCount_RNA", "nFeature_RNA") %in% c(colnames(x = query[[]])))) {

  • calcn <- as.data.frame(x = Seurat:::CalcN(object = query))
  • colnames(x = calcn) <- paste(
  • colnames(x = calcn),
  • "RNA",
  • sep = '_'
  • )
  • query <- AddMetaData(
  • object = query,
  • metadata = calcn
  • )
  • rm(calcn)
  • }

Calculate percent mitochondrial genes if the query contains genes

matching the regular expression "^MT-"

if (any(grepl(pattern = '^MT-', x = rownames(x = query)))) {

  • query <- PercentageFeatureSet(
  • object = query,
  • pattern = '^MT-',
  • col.name = 'percent.mt',
  • assay = "RNA"
  • )
  • }

Filter cells based on the thresholds for nCount_RNA and nFeature_RNA

you set in the app

cells.use <- query[["nCount_RNA", drop = TRUE]] <= 261023 &

  • query[["nCount_RNA", drop = TRUE]] >= 254 &
  • query[["nFeature_RNA", drop = TRUE]] <= 11796 &
  • query[["nFeature_RNA", drop = TRUE]] >= 203

If the query contains mitochondrial genes, filter cells based on the

thresholds for percent.mt you set in the app

if ("percent.mt" %in% c(colnames(x = query[[]]))) {

  • cells.use <- cells.use & (query[["percent.mt", drop = TRUE]] <= 31 &
  • query[["percent.mt", drop = TRUE]] >= 0)
  • }

Remove filtered cells from the query

query <- query[, cells.use]

Preprocess with SCTransform

query <- SCTransform(

  • object = query,
  • assay = "RNA",
  • new.assay.name = "refAssay",
  • residual.features = rownames(x = reference$map),
  • reference.SCT.model = reference$map[["refAssay"]]@SCTModel.list$refmodel,
  • method = 'glmGamPoi',
  • ncells = 2000,
  • n_genes = 2000,
  • do.correct.umi = FALSE,
  • do.scale = FALSE,
  • do.center = TRUE
  • ) Using reference SCTModel to calculate pearson residuals Determine variable features Setting min_variance to: -Inf Calculating residuals of type pearson for 2755 genes |============= | |================================ |=================================================== |======================================================== ==============| 100% |=================================== | |======================================================================| 100% Set default assay to refAssay

    Find anchors between query and reference

    anchors <- FindTransferAnchors(

  • reference = reference$map,
  • query = query,
  • k.filter = NA,
  • reference.neighbors = "refdr.annoy.neighbors",
  • reference.assay = "refAssay",
  • query.assay = "refAssay",
  • reference.reduction = "refDR",
  • normalization.method = "SCT",
  • features = intersect(rownames(x = reference$map), VariableFeatures(object = query)),
  • dims = 1:100,
  • n.trees = 20,
  • mapping.score.k = 100
  • ) Normalizing query using reference SCT model Projecting cell embeddings Finding query neighbors Finding neighborhoods Finding anchors Found 905 anchors

Transfer cell type labels and impute protein expression

#

Transferred labels are in metadata columns named "predicted.*"

The maximum prediction score is in a metadata column named "predicted.*.score"

The prediction scores for each class are in an assay named "prediction.score.*"

The imputed assay is named "impADT" if computed

refdata <- lapply(X = "annotation.l2", function(x) {

  • reference$map[[x, drop = TRUE]]
  • }) names(x = refdata) <- "annotation.l2" if (FALSE) {
  • refdata[["impADT"]] <- GetAssayData(
  • object = reference$map[['ADT']],
  • slot = 'data'
  • )
  • } query <- TransferData(
  • reference = reference$map,
  • query = query,
  • dims = 1:100,
  • anchorset = anchors,
  • refdata = refdata,
  • n.trees = 20,
  • store.weights = TRUE
  • ) Finding integration vectors Finding integration vector weights 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **| Predicting cell labels Warning: Keys should be one or more alphanumeric characters followed by an underscore, setting key from predictionscorea nnotation.l2 to predictionscoreannotationl2

Calculate the embeddings of the query data on the reference SPCA

query <- IntegrateEmbeddings(

  • anchorset = anchors,
  • reference = reference$map,
  • query = query,
  • reductions = "pcaproject",
  • reuse.weights.matrix = TRUE
  • )

Integrating dataset 2 with reference dataset Finding integration vectors Integrating data Warning: Keys should be one or more alphanumeric characters followed by an underscore, setting key from integrateddr t o integrateddr_ Warning: Keys should be one or more alphanumeric characters followed by an underscore, setting key from integrateddr t o integrateddr Warning: All keys should be one or more alphanumeric characters followed by an underscore '', setting key to integrated dr_

Calculate the query neighbors in the reference

with respect to the integrated embeddings

query[["query_ref.nn"]] <- FindNeighbors(

  • object = Embeddings(reference$map[["refDR"]]),
  • query = Embeddings(query[["integrated_dr"]]),
  • return.neighbor = TRUE,
  • l2.norm = TRUE
  • ) Computing nearest neighbors

The reference used in the app is downsampled compared to the reference on which

the UMAP model was computed. This step, using the helper function NNTransform,

corrects the Neighbors to account for the downsampling.

query <- Azimuth:::NNTransform(

  • object = query,
  • meta.data = reference$map[[]]
  • )

Project the query to the reference UMAP.

query[["proj.umap"]] <- RunUMAP(

  • object = query[["query_ref.nn"]],
  • reduction.model = reference$map[["refUMAP"]],
  • reduction.key = 'UMAP_'
  • ) Warning: The default method for RunUMAP has changed from calling Python UMAP via reticulate to the R-native UWOT using t he cosine metric To use Python UMAP via reticulate, set umap.method to 'umap-learn' and metric to 'correlation' This message will be shown once per session Running UMAP projection 02:12:52 Read 1020 rows 02:12:52 Processing block 1 of 1 02:12:52 Commencing smooth kNN distance calibration using 1 thread 02:12:52 Initializing by weighted average of neighbor coordinates using 1 thread 02:12:52 Commencing optimization for 67 epochs, with 20400 positive edges 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **| 02:12:52 Finished Warning: No assay specified, setting assay as RNA by default. Warning message: In RunUMAP.default(object = neighborlist, reduction.model = reduction.model, : Number of neighbors between query and reference is not equal to the number of neighbros within reference

Calculate mapping score and add to metadata

query <- AddMetaData(

  • object = query,
  • metadata = MappingScore(anchors = anchors),
  • col.name = "mapping.score"
  • ) Projecting reference PCA onto query Finding integration vector weights 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **| Projecting back the query cells into original PCA space Finding integration vector weights 0% 10 20 30 40 50 60 70 80 90 100% [----|----|----|----|----|----|----|----|----|----| **| Computing scores: Finding neighbors of original query cells Finding neighbors of transformed query cells _Error in index$getNNsByVectorList(query[x, ], k, search.k, include.distance) : fv.size() != vector_size Calls: AddMetaData ... resolve.list -> signalConditionsASAP -> signalConditions Execution halted_
qicheng-ma commented 2 years ago

fixed