satijalab / azimuth

A Shiny web app for mapping datasets using Seurat v4
https://satijalab.org/azimuth
GNU General Public License v3.0

AzimuthReference fails when metadata field contains empty factor levels #208

Open erzakiev opened 8 months ago

erzakiev commented 8 months ago

This is probably not a major issue, but I think it has an easy fix: calling droplevels() on the metadata inside AzimuthReference(). That would have spared a big head-scratcher for people like me who subset their initial Seurat object for one reason or another. Reproducible example:

library(Seurat)
library(Azimuth)
library(SeuratData)
obj <- LoadData("pbmcsca")
obj <- SCTransform(obj, verbose = FALSE)
obj <- RunPCA(obj, npcs = 50, verbose = FALSE)
obj <- RunUMAP(obj, dims = 1:30, return.model = TRUE, verbose = FALSE)
obj <- FindNeighbors(obj, dims = 1:30, reduction = "pca", k.param = 31, verbose = FALSE)

obj$CellType <- as.factor(obj$CellType)

# remove the 'Unassigned' cells; the factor keeps 'Unassigned' as an empty level
obj <- subset(obj, cells = names(obj$CellType[!obj$CellType %in% 'Unassigned']))

# fails
obj_Azimuth <- AzimuthReference(
  object = obj,
  refUMAP = "umap",
  refDR = "pca",
  refAssay = "SCT",
  metadata = 'CellType',
  dims = 1:50,
  k.param = 31,
  reference.version = "1.0.0"
)
#> Only one graph name supplied, storing nearest-neighbor graph only
#> Error in ValidateAzimuthReference(object = object) : 
#>  The colormap stored in the AzimuthData object must contain a color-id mapping for every unique id present in the plotting ids.
#> In addition: Warning messages:
#> 1: Different cells and/or features from existing assay SCT 
#> 2: In sort(x = as.character(unique(x = plotids[[id]]))) == sort(x = names(x = colormap[[id]])) :
#> longer object length is not a multiple of shorter object length

# drop the empty 'Unassigned' level
obj$CellType <- droplevels(obj$CellType)

# now works
obj_Azimuth <- AzimuthReference(
  object = obj,
  refUMAP = "umap",
  refDR = "pca",
  refAssay = "SCT",
  metadata = 'CellType',
  dims = 1:50,
  k.param = 31,
  reference.version = "1.0.0"
)
#> Only one graph name supplied, storing nearest-neighbor graph only
#> Warning message:
#> Different cells and/or features from existing assay SCT
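
For anyone hitting the same error, here is a minimal base-R sketch of what I think is going on. It does not use Azimuth internals: the toy vectors and the variable names (cell_type, kept, ids_present, ids_by_level, meta) are made up for illustration. The idea that one side of the validation comes from levels() and the other from unique() is my assumption, based only on the length-mismatch warning above.

```r
# Base R only: a toy stand-in for a metadata column, not Azimuth internals.
cell_type <- factor(c("B cell", "T cell", "Unassigned", "T cell"))

# Subsetting a factor keeps every original level, even one with no cells left.
kept <- cell_type[cell_type != "Unassigned"]
levels(kept)                 # still includes "Unassigned"
as.character(unique(kept))   # only the labels that are actually present

# Assumption: one side of the check is built from levels(), the other from unique().
ids_present  <- sort(as.character(unique(kept)))
ids_by_level <- sort(levels(kept))
length(ids_present) == length(ids_by_level)   # FALSE, the two views disagree

# droplevels() removes the unused level, so both views agree again.
fixed <- droplevels(kept)
identical(sort(levels(fixed)), ids_present)   # TRUE

# droplevels() also works on a whole data frame (hypothetical toy metadata table).
meta <- data.frame(CellType = kept, Batch = factor(c("a", "b", "b")))
meta <- droplevels(meta)
levels(meta$CellType)
```

On a real Seurat object, the cell-level metadata is a data frame, so something like obj@meta.data <- droplevels(obj@meta.data) before calling AzimuthReference() should cover every factor column at once. I have only tested the single-column version shown in the example above.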