Some additional info about my data objects:
```
> raw.mat
33696 x 1328118 IterableMatrix object with class MatrixDir

Row names: Xkr4, Gm1992 ... ENSMUSG00000095041
Col names: AAACCCAAGCCTGAGA-97, AAACCCAGTCGTACAT-97 ... TTTGTTGTCTGCATGA-96

Data type: uint32_t
Storage order: column major

Queued Operations:

> sobj
An object of class Seurat
33696 features across 1191094 samples within 1 assay
Active assay: RNA (33696 features, 2000 variable features)
 4 layers present: counts.SC, counts.SN, data.SC, data.SN

> sobj.sketch
An object of class Seurat
67392 features across 1191094 samples within 2 assays
Active assay: sketch (33696 features, 2000 variable features)
 5 layers present: counts.SC, counts.SN, data.SC, data.SN, scale.data
 1 other assay present: RNA
 1 dimensional reduction calculated: pca

> sobj.sketch@assays$sketch
Assay (v5) data with 33696 features for 1e+05 cells
Top 10 variable features:
 Mmp12, Igfbp5, Igkc, Nxph1, Kcnip4, Ighm, Grm8, Nrg1, Jchain, Siglech
Layers:
 counts.SC, counts.SN, data.SC, data.SN, scale.data
```
Closing this issue since it is a duplicate of https://github.com/satijalab/seurat/issues/9301.
Hi,
I am working with a very large sc/sn RNA-seq dataset. Starting from an h5ad file, I have used the BPCells package to load the data as a disk-backed matrix, as follows:
```r
library(BPCells)
library(Seurat)

# Import the h5ad file; the counts come in as data type float
raw <- open_matrix_anndata_hdf5(path = "/novo/projects/departments/compbio/sysbio/Projects/mouse_liver_models/single_cell_and_nuclei/concatenated.dir/concatenated.h5ad")
# Convert the count matrix from float (non-integer) to integer values
raw <- convert_matrix_type(raw, type = "uint32_t")
# Write the matrix to a directory and re-open it as a disk-backed matrix
write_matrix_dir(mat = raw, dir = "/novo/projects/shared_projects/liver_biology_colab/people/aqnf/mouse_sc_sn_AQNF_June24/BPcells/mouse_counts")
raw.mat <- open_matrix_dir(dir = "/novo/projects/shared_projects/liver_biology_colab/people/aqnf/mouse_sc_sn_AQNF_June24/BPcells/mouse_counts")
# Build the Seurat object on top of the BPCells matrix and attach the sample metadata
sobj <- CreateSeuratObject(counts = raw.mat)
meta <- merge(x = metadata_BSCK, y = metadata_CPDM, by.x = "LibraryID", by.y = "library_id", all.y = TRUE)
sobj <- AddMetaData(sobj, metadata = meta)
```
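Just to confirm the disk-backed matrix looks sane before building the Seurat object, a quick check along these lines can be run (a minimal sketch, not part of the pipeline itself; these accessors work lazily on a BPCells IterableMatrix without reading the counts into memory):

```r
# Minimal sanity check on the disk-backed matrix (assumes raw.mat from above)
class(raw.mat)            # IterableMatrix backed by the on-disk MatrixDir
dim(raw.mat)              # expected 33696 x 1328118
head(rownames(raw.mat))   # gene names
head(colnames(raw.mat))   # cell barcodes
```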
I am working with Seurat v5, so I am splitting the layers based on the preparation method (single-cell and single-nucleus seq). After that I am creating a sketch assay for my Seurat object in memory in order to run the downstream analysis more efficiently (the dataset is too large for the available memory):
```r
# QC filtering, then split the RNA assay into layers by preparation method
sobj <- subset(sobj, subset = nCount_RNA < 50000 & nFeature_RNA > 250 & nFeature_RNA < 8000 & pct_ribo < 20)
sobj[["RNA"]] <- split(sobj[["RNA"]], f = sobj$group)

# Normalize and find variable features per layer, then sketch the dataset
sobj <- NormalizeData(sobj)
sobj <- FindVariableFeatures(sobj)
sobj.sketch <- SketchData(object = sobj, ncells = 50000, method = "LeverageScore", sketched.assay = "sketch")
DefaultAssay(sobj.sketch) <- "sketch"
```
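At this point it can be worth confirming what the split and the sketch actually produced. A short sketch of such a check, assuming the objects above (`Layers()` and `ncol()` are standard SeuratObject v5 accessors):

```r
# Inspect the layers created by split() and the size of the sketched assay
Layers(sobj[["RNA"]])            # e.g. counts.SC, counts.SN, data.SC, data.SN
Layers(sobj.sketch[["sketch"]])  # layers carried over into the sketch assay
ncol(sobj.sketch[["sketch"]])    # number of sketched cells (1e+05 here, apparently ncells per split layer)
```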
Up to that point everything runs fine, but when I try to get started with the dimensionality reduction I run into issues that I don't understand. It seems like something goes wrong in RunPCA, as the elbow plot looks very strange and other steps of the pipeline that rely on the PCA fail to run. I have tried to trace the issue but failed, so help is very welcome:
```r
sobj.sketch <- FindVariableFeatures(sobj.sketch)
sobj.sketch <- ScaleData(sobj.sketch)
sobj.sketch <- RunPCA(sobj.sketch)
sobj.sketch <- FindNeighbors(sobj.sketch, dims = 1:30)
#> Computing nearest neighbor graph
#> Computing SNN
#> Error: std::bad_alloc
```
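For completeness, this is the kind of check that might help narrow down where the PCA goes wrong (a rough sketch, assuming the objects created above; the exact calls may need adjusting):

```r
# Inspect the PCA itself and the top loadings of the first components
ElbowPlot(sobj.sketch, ndims = 50, reduction = "pca")
print(sobj.sketch[["pca"]], dims = 1:5, nfeatures = 10)

# Check the scaled matrix that feeds the PCA for non-finite values
sc <- LayerData(sobj.sketch[["sketch"]], layer = "scale.data")
dim(sc)
sum(!is.finite(sc))
```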