satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

FindVariableFeatures Error: 'VST.default' is not implemented yet #7545

Open Kim-Gihyeon opened 1 year ago

Kim-Gihyeon commented 1 year ago

Dear developers,

First of all, thank you very much for these great tools.

I am struggling with an error I have never seen before when running FindVariableFeatures. I downloaded the matrix and metadata from GSE160269.

My code and the error are shown below.

CODE:

setwd("/home/gihyeon/Pipelines/scRNA/GENA104/Public_data/ESCC_2021_Nat_comm")
dir.create('results',recursive = T)
options(stringsAsFactors = F,check.bounds = F)
options(future.globals.maxSize = 100000 * 1024^2) # about 1e11 bytes (~100 GB); 10000 * 1024^2 is ~10 GB, 5000 * 1024^2 is ~5 GB
future::plan("multicore", workers = 20)

library(Seurat) ; options(Seurat.object.assay.version = "v5")
library(SeuratObject)
library(dplyr)
library(ggplot2)
library(magrittr)
library(gtools)
library(stringr)
library(Matrix)
library(patchwork)
library(data.table)
library(RColorBrewer)
library(ggpubr)
library(ggsci)
library(randomcoloR)
source("/home/gihyeon/Pipelines/scRNA/GENA104/Public_data/scType_function.R")

input_cd45_pos <- read.table("raw_data/GSE160269_CD45pos_UMIs.txt", header = T, stringsAsFactors = F, sep = " ", row.names = 1)

metadata_pos <- read.table("raw_data/GSE160269_CD45pos_cells.txt", header = T, stringsAsFactors = F, sep = " ", row.names = 1)

data.pos <- CreateSeuratObject(input_cd45_pos, min.cells = 3, min.features = 250, meta.data = metadata_pos)

data.pos[["percent.mt"]] <- PercentageFeatureSet(data.pos, pattern = "^MT-")
data.pos[["percent.Ribo"]] <- PercentageFeatureSet(data.pos, pattern = "^RP[SL]")

filtered_data <- subset(data.pos, nFeature_RNA < 10000 & nFeature_RNA > 200 &
                          percent.mt < 15)

split_data_2 <- NormalizeData(filtered_data)
split_data_2[["RNA"]] <- split(split_data_2[["RNA"]], f = split_data_2$sample)
split_data_2 <- FindVariableFeatures(split_data_2, verbose = F) ## error occurred ##

The Error Message

>   split_data_2 <- FindVariableFeatures(split_data_2, verbose = F)
Error: 'VST.default' is not implemented yet

However, split_data_2 <- FindVariableFeatures(split_data_2, verbose = F, selection.method = "disp") works. I therefore think the error is related to the VST function, but I could not find a solution for it.

My Seurat version is 4.9.9.9050 and R version is 4.3.0.

Thank you. Best regards, Gihyeon

Rohit-Satyam commented 1 year ago

Hi @Kim-Gihyeon @mxposed @zeehio

Same here. I updated to the new Seurat version, and now running NormalizeData > FindVariableFeatures throws this error:

Finding variable features for layer counts
Error: 'VST.default' is not implemented yet

However, the error disappears when I run NormalizeData > RunALRA > FindVariableFeatures. Here is the traceback:

10: stop(gettextf("'%s' is not implemented yet", as.character(sys.call(sys.parent())[[1L]])), 
        call. = FALSE)
9: .NotYetImplemented()
8: VST.default(data = object, nselect = nselect, verbose = verbose, 
       ...)
7: method(data = object, nselect = nselect, verbose = verbose, ...)
6: FindVariableFeatures.default(object = data, method = method, 
       nselect = nselect, span = span, clip = clip, verbose = verbose, 
       ...)
5: hvf.function(object = data, method = method, nselect = nselect, 
       span = span, clip = clip, verbose = verbose, ...)
4: FindVariableFeatures.StdAssay(object = object[[assay]], selection.method = selection.method, 
       loess.span = loess.span, clip.max = clip.max, mean.function = mean.function, 
       dispersion.function = dispersion.function, num.bin = num.bin, 
       binning.method = binning.method, nfeatures = nfeatures, nselect = nfeatures, 
       mean.cutoff = mean.cutoff, dispersion.cutoff = dispersion.cutoff, 
       verbose = verbose, ...)
3: FindVariableFeatures(object = object[[assay]], selection.method = selection.method, 
       loess.span = loess.span, clip.max = clip.max, mean.function = mean.function, 
       dispersion.function = dispersion.function, num.bin = num.bin, 
       binning.method = binning.method, nfeatures = nfeatures, nselect = nfeatures, 
       mean.cutoff = mean.cutoff, dispersion.cutoff = dispersion.cutoff, 
       verbose = verbose, ...)
2: FindVariableFeatures.Seurat(mca.seurat2, nfeatures = 2000, selection.method = "vst", 
       slot = "counts", assay = "RNA")
1: FindVariableFeatures(mca.seurat2, nfeatures = 2000, selection.method = "vst", 
       slot = "counts", assay = "RNA")
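
The traceback ends in a fallback method that simply calls .NotYetImplemented(). A minimal sketch of that S3 dispatch pattern is given here for illustration. It assumes a toy generic, toy_vst, with one matrix method and a default fallback; these names are hypothetical and this is not Seurat's actual code.

```r
# Toy generic standing in for an internal helper; all names are hypothetical.
toy_vst <- function(data, ...) UseMethod("toy_vst")

# Method for base matrices: return the per-feature (row) variance.
toy_vst.matrix <- function(data, ...) apply(data, 1, var)

# Fallback used for every class that has no dedicated method.
toy_vst.default <- function(data, ...) .NotYetImplemented()

counts_df <- data.frame(
  cell1 = c(0, 2, 5),
  cell2 = c(1, 0, 3),
  row.names = c("geneA", "geneB", "geneC")
)

# A data.frame has no dedicated method, so dispatch lands on the fallback
# and an error is raised; here the message is captured instead of stopping.
df_result <- tryCatch(toy_vst(counts_df), error = function(e) conditionMessage(e))
print(df_result)

# After coercion to a matrix, the matrix method is selected instead.
mat_result <- toy_vst(as.matrix(counts_df))
print(mat_result)
```

If the real function follows the same pattern, the class of the object stored in the counts layer decides which method runs, so checking that class is a reasonable first diagnostic.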

I think this happens only when the Seurat object is created from CSV files. I don't see the issue with my other samples, but when I read the Malaria Cell Atlas phenotype and count files from CSV and then create the Seurat object, I hit this error.

dat.m = read.csv('pf10xIDC_counts.csv', header = TRUE ,row.names = 1)
mca.seurat <- CreateSeuratObject(counts = dat.m, project = "Malaria-Cell-Atlas")
mca.seurat[["percent.mt"]] <- PercentageFeatureSet(object = mca.seurat, pattern = "mal")
mca.seurat$Sample <- "MCA"
mca.seurat$batch <- "MCA"
mca.seurat$batch2 <- "MCA"
mca.pheno = read.csv("pf10xIDC_pheno.csv", row.names = 1)
mca.seurat@meta.data <- cbind(mca.seurat@meta.data,mca.pheno)
mca.seurat2 <- NormalizeData(mca.seurat) 
mca.seurat2 <- FindVariableFeatures(mca.seurat2,nfeatures = 2000,selection.method = "vst")

Rohit-Satyam commented 1 year ago

I found a workaround, but I hope the developers will fix this. And yes, the new Seurat version does not automatically convert a data.frame object to a dgCMatrix. Referencing the issue that gave me the idea.

> class(mca.seurat2@assays$RNA$counts)
[1] "data.frame"
> class(pfd@assays$RNA$counts)
[1] "dgCMatrix"
attr(,"package")
[1] "Matrix"

mca.seurat <- CreateSeuratObject(counts = Matrix::Matrix(as.matrix(dat.m),sparse = T), project = "Malaria-Cell-Atlas")

Kim-Gihyeon commented 1 year ago

Hi @Rohit-Satyam,

As you pointed out, the class of the counts slot of my Seurat object was data.frame instead of dgCMatrix. So I tried your suggestion:

input_cd45_pos <- Matrix::Matrix(as.matrix(input_cd45_pos), sparse = T)

This converted the counts slot from data.frame to dgCMatrix, and the subsequent steps then ran without errors.
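
The conversion can be tried in isolation on a toy table before applying it to real data. This is a minimal sketch that assumes only the Matrix package; toy_counts and sparse_counts are hypothetical names, and the values are made up for illustration.

```r
library(Matrix)

# Toy count table (3 genes x 4 cells), shaped like what read.table() returns.
toy_counts <- data.frame(
  cell1 = c(0, 2, 0),
  cell2 = c(1, 0, 0),
  cell3 = c(0, 0, 4),
  cell4 = c(0, 3, 0),
  row.names = c("geneA", "geneB", "geneC")
)
print(class(toy_counts))

# Coerce to a dense numeric matrix first, then to a sparse column-compressed matrix.
sparse_counts <- Matrix::Matrix(as.matrix(toy_counts), sparse = TRUE)
print(class(sparse_counts))

# Gene and cell names should survive the conversion.
stopifnot(identical(rownames(sparse_counts), rownames(toy_counts)))
stopifnot(identical(colnames(sparse_counts), colnames(toy_counts)))
```

Note that as.matrix() creates a dense copy of the whole table, so for very large count files the intermediate step can need a lot of memory.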

Additionally, I suspect that the error (Error: 'VST.default' is not implemented yet) is related to reading tables with sep = "," or sep = " " (space) or some other separator instead of sep = "\t".
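
Whether the separator matters can be checked with base R alone. The sketch below uses inline toy text (not the GSE160269 files; all object names are hypothetical) and reads the same small table once tab-separated and once space-separated. In both cases read.table() returns a data.frame, which suggests that the class of the result, rather than the separator, is what matters here.

```r
# The same toy table written with two different separators.
tab_text   <- "gene\tcell1\tcell2\ngeneA\t1\t0\ngeneB\t0\t3"
space_text <- "gene cell1 cell2\ngeneA 1 0\ngeneB 0 3"

tab_df   <- read.table(text = tab_text, header = TRUE, sep = "\t", row.names = 1)
space_df <- read.table(text = space_text, header = TRUE, sep = " ", row.names = 1)

# read.table() returns a data.frame regardless of the separator used.
print(class(tab_df))
print(class(space_df))
```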

It would be really helpful if the developers could clarify this problem.

Many thanks to @Rohit-Satyam, and thanks to the developers in advance!

Best regards, Gihyeon

MasonDou commented 1 year ago

Thank you @Rohit-Satyam and @Kim-Gihyeon, I ran into the same problem when loading a matrix from a txt file. Your discussion helped me figure it out. I hope the developers can fix it in the future.

Best regards, Mason

BastiDucreux commented 1 year ago

I was facing the same issue. The solution by @Rohit-Satyam works well while we wait for the developers to fix the problem.

Shikari666 commented 1 year ago