satijalab / sctransform

R package for modeling single cell UMI expression data using regularized negative binomial regression
GNU General Public License v3.0

Error in make_cell_attr(umi, cell_attr, latent_var, batch_var, latent_var_nonreg, : cell attribute row names must match column names of count matrix #131

Closed AlonsoOga closed 2 years ago

AlonsoOga commented 2 years ago

Hi,

I am demultiplexing one 10X library and everything runs smoothly until I get to the SCTransform function for the singlets object. I keep getting this error:

Error in make_cell_attr(umi, cell_attr, latent_var, batch_var, latent_var_nonreg, : cell attribute row names must match column names of count matrix

Any ideas as to what could be happening?

saketkc commented 2 years ago

Can you post your full code here?

pbpayal commented 2 years ago

I have the same problem. I am following https://github.com/satijalab/seurat-wrappers/blob/master/docs/velocity.md to get RNA velocity.

ldat <- ReadVelocity(file = "~/../sample.loom")
bm <- as.Seurat(x = ldat)
bm <- SCTransform(object = bm, assay = "spliced")

At the SCTransform step, I get the same error:

Error in make_cell_attr(umi, cell_attr, latent_var, batch_var, latent_var_nonreg,  : 
  cell attribute "log_umi" contains NA, NaN, or infinite value

Can I use "NormalizeData" instead of SCTransform?

bm <- NormalizeData(bm, verbose = FALSE)
bm <- FindVariableFeatures(bm, selection.method = "vst", nfeatures = 2000)

saketkc commented 2 years ago

This is likely due to the count matrix having cells with zero counts. sctransform assumes that every cell has at least one UMI. You can check whether any cells are empty with colSums(counts) and then filter them out before proceeding with sctransform.
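
A minimal sketch of that check on a toy matrix, using base R only. The matrix, gene names, and cell names are made up for illustration. The assumption that log_umi is a log10 of the per-cell UMI total is inferred from the error message about non-finite values and is not confirmed in this thread.

```r
# Toy count matrix (genes x cells); values and names are hypothetical.
counts <- matrix(
  c(3, 0, 1,   # cell1
    0, 0, 0,   # cell2: no UMIs at all
    2, 5, 0),  # cell3
  nrow = 3,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
)

# Total UMIs per cell; a zero marks an empty cell.
umi_per_cell <- colSums(counts)
empty_cells <- names(umi_per_cell)[umi_per_cell == 0]

# Assumed form of the attribute: the log of a zero total is -Inf,
# which would match the "NA, NaN, or infinite value" complaint.
log_umi_before <- log10(umi_per_cell)

# Keep only cells with at least one UMI before running sctransform.
counts_filtered <- counts[, umi_per_cell > 0, drop = FALSE]
log_umi_after <- log10(colSums(counts_filtered))
```

After filtering, every remaining cell has a finite log total, so a check for non-finite values should no longer trip on this attribute.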

LooLipin commented 1 year ago

Hi @saketkc ,

I am running into the same error while analyzing spatial transcriptomics data. I'm not sure if I can actually filter data from my matrix, as I assume each of these spots needs to remain for alignment.

The lack of UMIs in my case is genuine: I have multiple small sections per capture area, and the areas without tissue have no UMIs. Can you please advise on what I can do to overcome this issue?

Thanks, Lipin

saketkc commented 1 year ago

Hi @LooLipin, can you create a new issue and provide a reproducible example (with a subset of your dataset) if possible?

ronfinn commented 1 year ago

Hi there, I am running into the same issue while following the joint RNA and ATAC analysis of PBMCs from the Signac site: https://stuartlab.org/signac/articles/pbmc_multiomic.html

I get the following:

# create a new assay using the MACS2 peak set and add it to the Seurat object
pbmc[["peaks"]] <- CreateChromatinAssay(
  counts = macs2_counts,
  fragments = fragpath,
  annotation = annotation
)
Computing hash
Checking for 11909 cell barcodes

DefaultAssay(pbmc) <- "RNA"
pbmc <- SCTransform(pbmc)
Error in make_cell_attr(umi, cell_attr, latent_var, batch_var, latent_var_nonreg,  : 
  cell attribute row names must match column names of count matrix

paulitikka commented 1 year ago

Hi, as suggested by saketkc, filter by the column sums, i.e. vari=vari[,unname(which(colSums(GetAssayData(vari))!=0))] where vari is a Seurat object.

jawinks commented 8 months ago

Hi, As suggested by 'saketkc', filter as per the colsums, i.e. vari=vari[,unname(which(colSums(GetAssayData(vari))!=0))] where vari is a seurat object.

I tried this and got the following:

Warning message:
In GetAssayData.StdAssay(object = object[[assay]], layer = layer) :
  data layer is not found and counts layer is used
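
The warning suggests the object has no data layer yet, so the counts layer was used instead and the filter likely still did what was intended. Below is a sketch that asks for the counts layer explicitly, assuming SeuratObject 5 or later, where GetAssayData takes a layer argument. The toy object and all of its names are hypothetical.

```r
library(SeuratObject)

# Toy sparse count matrix (genes x cells); cell2 has no UMIs.
counts <- Matrix::Matrix(
  matrix(
    c(3, 0, 1,  0, 0, 0,  2, 5, 0),
    nrow = 3,
    dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
  ),
  sparse = TRUE
)
vari <- CreateSeuratObject(counts = counts, assay = "RNA")

# Request the raw counts directly instead of the default data layer.
umi_per_cell <- Matrix::colSums(
  GetAssayData(vari, assay = "RNA", layer = "counts")
)

# Keep only cells with at least one UMI.
keep <- names(umi_per_cell)[umi_per_cell > 0]
vari_filtered <- subset(vari, cells = keep)
```

On older Seurat versions the equivalent argument is slot rather than layer, so the call would need adjusting there.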

sisterdot commented 8 months ago

hey hey,

also getting the same error with Signac processing and SCTransform under specific circumstances ...

here is a reproducible example

library(Signac)
library(Seurat)
library(EnsDb.Hsapiens.v86)
library(BSgenome.Hsapiens.UCSC.hg38)

system("wget https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_filtered_feature_bc_matrix.h5")
system("wget https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz")
system("wget https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz.tbi")

counts <- Read10X_h5("pbmc_granulocyte_sorted_10k_filtered_feature_bc_matrix.h5")
fragpath <- "pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz"

annotation <- GetGRangesFromEnsDb(ensdb = EnsDb.Hsapiens.v86)
seqlevels(annotation) <- paste0('chr', seqlevels(annotation))

pbmc <- CreateSeuratObject(
  counts = counts$`Gene Expression`,
  assay = "RNA"
)

pbmc <- SCTransform(pbmc)
#...Calculating cell attributes from input UMI matrix: log_umi

frags <- CreateFragmentObject(
    path = fragpath,
    cells = colnames(pbmc)
)
fcounts <- FeatureMatrix(
    fragments = frags,
    features = StringToGRanges(rownames(counts$Peaks),sep = c(":", "-")),
    cells = colnames(pbmc)
)

pbmc <- SCTransform(pbmc)
Error in make_cell_attr(umi, cell_attr, latent_var, batch_var, latent_var_nonreg,  : 
  cell attribute row names must match column names of count matrix

without the FeatureMatrix step there is no error...

one difference between the pbmc object before and after running FeatureMatrix (where i would have naively thought we did not even touch the pbmc object) is the addition of some hash tables...

str(colnames(pbmcPreFeatureMatrix))
 chr [1:11909] "AAACAGCCAAGGAATC-1" "AAACAGCCAATCCCTT-1" ...

str(colnames(pbmcPostFeatureMatrix))
 chr [1:11909] "AAACAGCCAAGGAATC-1" "AAACAGCCAATCCCTT-1" ...
 - attr(*, ".match.hash")=Class 'match.hash'  
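
A possible explanation, sketched in base R: if the failing check compares the cell names with identical() (an assumption, not confirmed in this thread), an extra attribute on one vector makes the comparison fail even though every barcode is the same. The ".match.hash" attribute is simulated here with a placeholder string rather than a real fastmatch hash table.

```r
# Barcodes as they look before FeatureMatrix.
barcodes_pre <- c("AAACAGCCAAGGAATC-1", "AAACAGCCAATCCCTT-1")

# Same barcodes with a stand-in for the attached hash attribute.
barcodes_post <- barcodes_pre
attr(barcodes_post, ".match.hash") <- "placeholder"

# Element-wise, the values still agree.
same_values <- all(barcodes_pre == barcodes_post)            # TRUE

# identical() also compares attributes, so it reports a mismatch.
strict_match <- identical(barcodes_pre, barcodes_post)       # FALSE

# Dropping the attributes restores the strict match.
strict_match_stripped <- identical(
  barcodes_pre, as.vector(barcodes_post)
)                                                            # TRUE
```

If this is what happens, it would also be consistent with the error disappearing when the object is built after FeatureMatrix has run.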

skipping the FeatureMatrix step all is fine

pbmc <- CreateSeuratObject(
  counts = counts$`Gene Expression`,
  assay = "RNA"
)
pbmc[["ATAC"]] <- CreateChromatinAssay(
  counts = counts$Peaks,
  sep = c(":", "-"),
  fragments = fragpath,
  annotation = annotation
)
pbmc <- SCTransform(pbmc)

str(colnames(pbmc))
 chr [1:11909] "AAACAGCCAAGGAATC-1" "AAACAGCCAATCCCTT-1" ...

R version 4.2.2 (2022-10-31)
Signac_1.9.0
Seurat_4.3.0
sctransform_0.3.5
fastmatch_1.1-3

the problem is easily resolved for me if i just call CreateSeuratObject after calling FeatureMatrix...

thought it might be useful to report this, in case someone else gets the error in a similar context... :-)