mojaveazure / seurat-disk

Interfaces for HDF5-based Single Cell File Formats
https://mojaveazure.github.io/seurat-disk
GNU General Public License v3.0

Problem with LoadH5Seurat #141

Open ydai3 opened 1 year ago

ydai3 commented 1 year ago

I want to convert an h5ad file to a Seurat object, using the following commands in R:

Convert("program_5.ovarian_doublet96.cell_filtering.h5ad", dest = "h5seurat", overwrite = TRUE, verbose = TRUE)
ov <- LoadH5Seurat("program_5.ovarian_doublet96.cell_filtering.h5seurat")

The first step works well. But in the second step, an error was reported:

Validating h5Seurat file
Initializing RNA with data
Error in sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][], :
  'dims' must contain all (i,j) pairs

Can anyone help me figure out what is going wrong? Thank you so much!

pvalle6 commented 1 year ago

One of the members of my team had the same issue. I am looking at fixing this.

Leticia314 commented 1 year ago

I am facing the same issue. Has this been solved?

dhanusha2504 commented 1 year ago

I am also facing the same issue. Has this been solved?

sergio-rutella commented 1 year ago

I am facing the same issue. Can you please advise?

ydai3 commented 1 year ago

Hi,

Thank you for reaching out. I have not found a solution yet, so I worked on the h5ad object directly. I would appreciate any advice if you figure it out later. Thank you!

coschoi commented 1 year ago

Has anyone solved this issue? I still see this problem every now and then.

Sakura1a2a3a commented 1 year ago

I think I know what is going wrong. The sparse count matrix in the h5ad file must be a compressed sparse row (CSR) matrix, not a compressed sparse column (CSC) matrix. Be aware of the difference between csc_matrix and csr_matrix.
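To illustrate why the storage format could matter, here is a minimal pure-Python sketch. It assumes the loader reads the stored `indices` and `indptr` arrays as a column-compressed genes x cells matrix, which is what the `sparseMatrix(i = ..., p = ...)` call in the error message suggests. The helper names (`to_csr`, `to_csc`, `fits_as_genes_by_cells`) are made up for this example, and the check is a simplified stand-in, not the actual SeuratDisk or Matrix code.

```python
def to_csr(dense):
    """Compress a dense list-of-rows matrix along its rows."""
    data, indices, indptr = [], [], [0]
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                data.append(value)
                indices.append(col)
        indptr.append(len(data))
    return data, indices, indptr


def to_csc(dense):
    """Compress along columns, which equals CSR of the transpose."""
    transposed = [list(column) for column in zip(*dense)]
    return to_csr(transposed)


def fits_as_genes_by_cells(indices, indptr, n_genes, n_cells):
    """Simplified check: read the arrays as column-compressed genes x cells.

    That reading needs one indptr entry per cell (plus one) and every
    index to be a valid gene row.
    """
    return len(indptr) == n_cells + 1 and all(i < n_genes for i in indices)


# Toy AnnData-style X: 2 cells (rows) x 4 genes (columns)
X = [[1, 0, 0, 2],
     [0, 0, 3, 0]]
n_cells, n_genes = 2, 4

_, csr_idx, csr_ptr = to_csr(X)
_, csc_idx, csc_ptr = to_csc(X)

csr_ok = fits_as_genes_by_cells(csr_idx, csr_ptr, n_genes, n_cells)
csc_ok = fits_as_genes_by_cells(csc_idx, csc_ptr, n_genes, n_cells)
print("CSR arrays fit genes x cells:", csr_ok)
print("CSC arrays fit genes x cells:", csc_ok)
```

Under this assumption, the CSR arrays of a cells x genes matrix are exactly the CSC arrays of its genes x cells transpose, so they fit the expected dimensions, while the CSC arrays do not.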

karlie002 commented 1 year ago

Waiting for a solution; I have run into the same issue.

madeofrats commented 1 year ago

Same here!

Sycholi commented 11 months ago

same

aaronbwong commented 11 months ago

I have the same issue when trying to load an h5seurat file converted from h5ad. In the h5ad file, the sparse matrix is in CSR format, as suggested by a previous comment (https://github.com/mojaveazure/seurat-disk/issues/141#issuecomment-1738215646).

ThomasBeder commented 10 months ago

For anyone still trying, this is the solution that did the trick for me, though it does not use LoadH5Seurat:


#install.packages("reticulate")
library(reticulate)

# set path to python (needs scanpy, so install it in a terminal via pip: "python -m pip install scanpy")
path_to_python <- "/home/tbeder/miniconda3/bin/python3"
use_python(path_to_python)

# import scanpy
sc <- import("scanpy")
# read h5ad
adata <- sc$read_h5ad("Galaxy20-[BoneMarrow(10X_and_Rhapsody)].h5ad")
# install zellkonverter
remotes::install_github("theislab/zellkonverter")

library(zellkonverter)
SCE <- AnnData2SCE(
  adata,
  X_name = "counts"
)
dim(SCE@assays@data$counts)

# get counts from single cell experiment
counts <- SCE@assays@data$counts

# create the seurat object
library(Seurat)
object <- CreateSeuratObject(counts = counts, assay = "RNA", min.cells = 10)

CatCatLiang commented 6 months ago

For anyone still looking for the cause and a solution, especially anyone who wants to keep all the metadata: I hit exactly the same error and solved it by going through loom. In my case, the cause was that "row_attrs/Gene" and/or "col_attrs/CellID" were stored under other names, such as "var_names", which the SeuratDisk and loomR packages cannot read directly. I recommend first opening your h5ad file as an AnnData object (adata), because this also confirms that the file was downloaded completely. Then write adata to a loom file and check ra.keys() and ca.keys() of that loom file, so that you can rename the attributes to "Gene" and "CellID". Finally, load the loom file directly in R. Below is part of my Python code for checking the names and renaming them.

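import loompy

loom_path = "your_file.loom"  # placeholder: path to the loom file written from adata

# list the attribute names stored in the loom file
with loompy.connect(loom_path) as ds:
    print("Row attributes available:", ds.ra.keys())
    print("Column attributes available:", ds.ca.keys())

def inspect_and_fix_loom(loom_file):
    # copy the existing gene and cell identifiers to the names "Gene" and "CellID"
    with loompy.connect(loom_file, 'r+') as ds:
        ds.ra['Gene'] = ds.ra['var_names'] # here replace 'var_names' with your own gene name variable
        ds.ca['CellID'] = ds.ca['obs_names'] # here replace 'obs_names' with your own cell ID variable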

inspect_and_fix_loom(loom_path)

h4rvey-g commented 5 months ago

A slightly dirty but simple way to solve this is to switch to the zellkonverter package. It automatically creates and uses a Python environment to read the h5ad file, and you end up with a SingleCellExperiment object in R.

sce <- zellkonverter::readH5AD(h5ad_file, verbose = TRUE)
sce <- scuttle::logNormCounts(sce)
# convert it to Seurat object if you want
sce_seurat <- Seurat::as.Seurat(sce)

sergio-rutella commented 5 months ago

I have also used the following code, which works beautifully.

seurat = sceasy::convertFormat("XXX.h5ad", from = "anndata", to = "seurat")

Zackaly commented 5 months ago

Quoting Sakura1a2a3a: "I think I know what is going wrong. The h5ad sparse count matrix must be compressed sparse Row matrix instead of Col matrix. Be aware of the difference between csc_matrix and csr_matrix."

This is the correct answer.

lutrarutra commented 4 months ago

It seems the problem for me was indeed the csc_matrix. Fixed with adata.X = scipy.sparse.csr_matrix(adata.X)
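The fix above can be wrapped into a small helper that only converts when needed. This is a minimal sketch, not a tested recipe: `ensure_csr` and `rewrite_h5ad_as_csr` are hypothetical names, and the helper relies on SciPy sparse matrices exposing a `format` attribute and a `tocsr()` method. If your counts live in `adata.raw` or in a layer rather than in `adata.X`, the same check would presumably be needed there, which this sketch does not handle.

```python
def ensure_csr(matrix):
    """Return matrix in CSR format if it is sparse and stored differently.

    SciPy sparse matrices carry a `format` string such as "csr" or "csc".
    Anything without that attribute (for example a dense array) is
    returned unchanged.
    """
    fmt = getattr(matrix, "format", None)
    if fmt is None or fmt == "csr":
        return matrix
    return matrix.tocsr()


def rewrite_h5ad_as_csr(src_path, dst_path):
    """Read an h5ad file, store X as CSR, and write the result to dst_path."""
    # Imported lazily so that ensure_csr can be used without anndata installed.
    import anndata as ad

    adata = ad.read_h5ad(src_path)
    adata.X = ensure_csr(adata.X)
    adata.write_h5ad(dst_path)
```

After calling `rewrite_h5ad_as_csr` on your file, you would run Convert and LoadH5Seurat on the rewritten h5ad file instead of the original.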