bnprks / BPCells

Scaling Single Cell Analysis to Millions of Cells
https://bnprks.github.io/BPCells

BPCells cannot load multiple assay layers from h5ad file? #153

Closed fingeram closed 2 weeks ago

fingeram commented 3 weeks ago

Hi,

I have an anndata.h5ad file that contains two layers: 1) anndata.X holds the log1p-normalized expression, and 2) anndata.layers["counts"] holds the raw counts. However, when I follow the BPCells tutorial to write the count matrix to disk and load it as a Seurat object, the Seurat object only contains a single RNA assay slot. This slot appears to be "counts" by default, but its values are actually the log1p-normalized values from the h5ad file. Is there a way to write/load both the raw counts and the normalized counts?

Thank you!!

bnprks commented 3 weeks ago

Hi @fingeram, BPCells matrix objects only operate on one layer at a time, but you are free to do multiple read operations to access multiple layers. The open_matrix_anndata_hdf5() function takes a group argument where you can specify which matrix layer you want to load out of an anndata file. By default we load from the main X layer.

So, you could run something like this for example:

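library(BPCells)
# Raw counts are stored under layers/counts; X holds the log1p-normalized values
counts_mat <- open_matrix_anndata_hdf5("myfile.h5ad", group = "layers/counts")
norm_mat <- open_matrix_anndata_hdf5("myfile.h5ad", group = "X")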

If you want both of those available in a Seurat object under the standard meanings of its counts and data slots, I'm pretty sure you could do:

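library(Seurat)
# Raw counts become the "counts" layer; the normalized matrix is assigned to "data"
proj <- CreateSeuratObject(counts_mat)
proj[["RNA"]]$data <- norm_mat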

BPCells and Seurat are both capable of re-calculating the normalized matrix from the counts if you prefer, but this is an example of how to do it if you want to take both layers from the file on disk. Hope that helps!
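
For the recalculation route, a minimal sketch in BPCells might look like the following. It assumes the usual log-normalization recipe (scale each cell to 10,000 total counts, then log1p), which may not match how X in a given file was actually produced. The helper name log_normalize and its scale_factor parameter are hypothetical and not part of BPCells; multiply_cols() and log1p() are applied lazily to BPCells matrices, and the path "myfile.h5ad" is the placeholder used earlier in this thread.

```r
library(BPCells)

# Hypothetical helper (not a BPCells function): scale every cell (column)
# to `scale_factor` total counts, then apply log1p.
# Assumes cells are columns and every cell has a non-zero total count.
log_normalize <- function(counts_mat, scale_factor = 1e4) {
  per_cell <- multiply_cols(counts_mat, 1 / Matrix::colSums(counts_mat))
  log1p(per_cell * scale_factor)
}

# Only runs if the placeholder file from this thread actually exists
if (file.exists("myfile.h5ad")) {
  counts_mat <- open_matrix_anndata_hdf5("myfile.h5ad", group = "layers/counts")
  norm_recalc <- log_normalize(counts_mat)
}
```

On the Seurat side, calling NormalizeData(proj) on an object created from counts_mat would be the corresponding step. Either way, the recalculated values only match the file's X layer if the same scale factor and transform were used when the file was written.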

-Ben

fingeram commented 2 weeks ago

Dear @bnprks,

perfect, thank you so much for pointing this out! Works perfectly :)

Best, Anna