The function glmGamPoi:::handle_design_parameter is having a problem, but I'm not sure why. Do your cells (columns in seur@assays$RNA@counts) all add up to more than zero? Perhaps make sure that there are at least N genes detected in every cell (with N = 300 or so).
If you can share the raw counts, I can also have a look.
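The per-cell checks suggested above could be sketched roughly as follows. This is only an illustration on a toy sparse matrix: `counts` stands in for seur@assays$RNA@counts, and `min_genes` is a made-up name for the threshold N (set to 2 here because the toy matrix is tiny; the suggestion above is around 300 for real data).

```r
library(Matrix)

# Toy genes x cells count matrix; a stand-in for seur@assays$RNA@counts
counts <- Matrix(
  c(3, 0, 1,
    0, 0, 2,
    5, 0, 0,
    1, 0, 4),
  nrow = 4, byrow = TRUE, sparse = TRUE,
  dimnames = list(paste0("gene", 1:4), paste0("cell", 1:3))
)

min_genes <- 2  # illustrative threshold; use something like 300 on real data

umi_per_cell   <- Matrix::colSums(counts)      # total counts per cell
genes_per_cell <- Matrix::colSums(counts > 0)  # genes detected per cell

# Cells whose counts sum to zero, and cells below the gene threshold
empty_cells <- names(umi_per_cell)[umi_per_cell == 0]
low_cells   <- names(genes_per_cell)[genes_per_cell < min_genes]

# Keep only cells that pass the threshold
filtered <- counts[, genes_per_cell >= min_genes, drop = FALSE]

print(empty_cells)
print(dim(filtered))
```

If `empty_cells` or `low_cells` is non-empty on the real data, filtering those cells before calling SCTransform would be one thing to try.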
Thanks for your quick response! After long debugging, it turns out that the error happened because colnames(seur) returned a vector of integers instead of the cell names.
> head(colnames(object))
[1] "1" "2" "3" "4" "5" "6"
I then correctly set the column names of the Seurat object, and glmGamPoi ran just fine!
> head(colnames(object))
[1] "CN1_AAACCTGAGCTATGCT-1" "CN1_AAACCTGAGCTGCAAG-1" "CN1_AAACCTGAGGCCGAAT-1" "CN1_AAACCTGCAAGAGGCT-1" "CN1_AAACCTGCAATACGCT-1"
[6] "CN1_AAACCTGCACACTGCG-1"
In the SCTransform function, you could consider giving a warning when the colnames of the Seurat object consist only of integers.
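A check along these lines might look like the sketch below. `warn_if_integer_colnames` is a hypothetical helper written for illustration; it is not part of Seurat, sctransform, or glmGamPoi, and whether integer-like names are really what triggers the error is not established in this thread.

```r
# Hypothetical helper (not an existing Seurat/sctransform function):
# warns when every column name looks like a plain integer.
warn_if_integer_colnames <- function(x) {
  nms <- colnames(x)
  integer_like <- !is.null(nms) && length(nms) > 0 &&
    all(grepl("^[0-9]+$", nms))
  if (integer_like) {
    warning("Column (cell) names consist only of integers; ",
            "were the cell barcodes lost?", call. = FALSE)
  }
  invisible(integer_like)
}

m <- matrix(0, nrow = 2, ncol = 3)

colnames(m) <- 1:3  # stored as "1", "2", "3"
flag_integer <- warn_if_integer_colnames(m)   # emits the warning

colnames(m) <- c("CN1_AAACCTGAGCTATGCT-1",
                 "CN1_AAACCTGAGCTGCAAG-1",
                 "CN1_AAACCTGAGGCCGAAT-1")
flag_barcodes <- warn_if_integer_colnames(m)  # silent
```

The helper returns its result invisibly, so it can be dropped into a pipeline purely for the warning or used as a logical flag.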
The glmGamPoi package already has a more descriptive error message for this problem in the works, but I don't think it's part of the stable release yet. See issue
I cannot reproduce the problem on my end:
library(Seurat)
library(magrittr)  # provides %>%
set.seed(42)
tmp_s <- CreateSeuratObject(counts = counts) %>% SCTransform(method = 'glmGamPoi')
and
colnames(counts) <- 1:ncol(counts)
set.seed(42)
tmp_s2 <- CreateSeuratObject(counts = counts) %>% SCTransform(method = 'glmGamPoi')
both finish and give identical results (scaled data matrix). Here counts is a count matrix that I happened to have loaded in my workspace.
So whatever was going on, I don't think it was a problem of having column (cell) names in the form of c('1', '2', '3', ...) in your input. Without a reproducible example we may never know...
Using sctransform 0.3.2 and Seurat 4.0.4, I ran the following code:
seur = SCTransform(seur, vars.to.regress = c("percent.mt"), conserve.memory = T, method = "glmGamPoi", verbose = T)
and got the following error:
Calculating cell attributes from input UMI matrix: log_umi
Variance stabilizing transformation of count matrix of size 24670 by 226635
Model formula is y ~ log_umi
Get Negative Binomial regression parameters per gene
Using 2000 genes, 5000 cells
| |0%
Error in handle_design_parameter(design, data, col_data, reference_level) :
  Number of rows in col_data does not match number of columns of data. Were there maybe 'NA's in the colData?
Here is some information about my Seurat object 'seur':
Both seur@assays$RNA@counts and seur@assays$RNA@data contain the same raw counts, and both are Matrix::dgCMatrix
Any help or guidance would be really appreciated!
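Since the error message complains about a mismatch between the rows of col_data and the columns of data, and hints at NAs, a few consistency checks between the count matrix and the cell metadata may help narrow things down. The sketch below uses toy stand-ins: `counts` for the UMI matrix and `meta` for the cell metadata (seur@meta.data in a Seurat object). It is only a guess at what is worth checking, not a diagnosis of this particular error.

```r
# Toy stand-ins: `counts` for the UMI matrix, `meta` for the cell metadata
counts <- matrix(1:12, nrow = 3,
                 dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4)))
meta <- data.frame(percent.mt = c(1.2, NA, 3.4, 0.5),
                   row.names = paste0("cell", 1:4))

checks <- c(
  # One metadata row per cell?
  same_length     = ncol(counts) == nrow(meta),
  # Same cell names, in the same order?
  same_names      = identical(colnames(counts), rownames(meta)),
  # No duplicated cell names?
  unique_names    = anyDuplicated(colnames(counts)) == 0,
  # No missing values in the regressed covariate?
  no_na_covariate = !anyNA(meta$percent.mt)
)
print(checks)
```

Any FALSE entry points at something to fix before rerunning SCTransform; in this toy example the NA in percent.mt makes the last check fail.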