satijalab / seurat

R toolkit for single cell genomics
http://www.satijalab.org/seurat

How can I read .h5ad file in r? #3414

Closed misterygirl2 closed 4 years ago

misterygirl2 commented 4 years ago

Hello, I am trying to analyze sequencing data from an .h5ad file that I downloaded from the www.kidneycellatlas.org website. Running

kidney <- ReadH5AD("/home/Kidney/Mature_Immune_v2.1.h5ad", assay = "RNA", layers = "data", verbose = TRUE)

gave me an error:

Error: 'h5file' is not an exported object from 'namespace:hdf5r'

I don't know why this happens. Can you help me read this file?

mojaveazure commented 4 years ago

Hi,

We are transitioning our support for AnnData/H5AD files to SeuratDisk, our new package for interfacing Seurat objects with single-cell HDF5-based file formats. We would very much like it if you could give this a shot for reading in your data.

You can install SeuratDisk with the following:

if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
remotes::install_github("mojaveazure/seurat-disk")

A tutorial on how to read in AnnData/H5AD files via the h5Seurat intermediate can be found here. Greater detail about the new Convert mechanism can be found here

If you come across any bugs in reading in your HDF5 files, please post them in mojaveazure/seurat-disk#1. Please note, there are some stipulations about the format of your AnnData/H5AD posted in https://github.com/mojaveazure/seurat-disk/issues/1#issue-619215532

kanefos commented 3 years ago

If anyone else is reading this and doesn't quite understand the example on SeuratDisk:


library(SeuratDisk)

Convert("example_dir/example_ad.h5ad", ".h5seurat")
# This creates a copy of this .h5ad object reformatted into .h5seurat inside the example_dir directory

# This .h5seurat object can then be read in manually
seuratObject <- LoadH5Seurat("example_dir/example_ad.h5Seurat")

doublem69 commented 3 years ago

I used SeuratDisk to convert the h5ad file from the Kidney Cell Atlas to an h5seurat file with no problem. However, when I tried to use LoadH5Seurat, I received the error below. Has anyone else faced a similar problem?

Error in sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][], : all(dims >= dims.min) is not TRUE
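The failing check in the error above, all(dims >= dims.min), is the one sparseMatrix runs when the dimensions it is given are smaller than the largest row or column index in the data. One possible cause is a matrix stored in one orientation (CSR) being read as the other (CSC), or a stored shape that does not match the index arrays. The sketch below, in plain Python with made-up arrays, shows the kind of consistency check that could be run on the indptr, indices and shape values read from a file. The function name check_compressed is hypothetical; this is not SeuratDisk code and it is not a confirmed diagnosis of this particular file.

```python
def check_compressed(indptr, indices, shape, major="row"):
    """Return a list of inconsistencies in a compressed sparse matrix.

    indptr and indices are 0-based (the R call above adds 1 to indices).
    shape is (n_rows, n_cols); major="row" means CSR, major="col" means CSC.
    """
    n_rows, n_cols = shape
    # The "major" axis is the one indptr walks over; indices point into the other.
    n_major, n_minor = (n_rows, n_cols) if major == "row" else (n_cols, n_rows)
    problems = []
    if len(indptr) - 1 != n_major:
        problems.append(
            f"indptr describes {len(indptr) - 1} {major}s but shape says {n_major}"
        )
    if indptr and indptr[-1] != len(indices):
        problems.append(
            f"indptr ends at {indptr[-1]} but there are {len(indices)} indices"
        )
    if indices and max(indices) >= n_minor:
        problems.append(
            f"largest index {max(indices)} does not fit in a dimension of {n_minor}"
        )
    return problems


# Hypothetical 2 x 3 matrix stored row-compressed (CSR).
indptr = [0, 2, 3]
indices = [0, 2, 1]
shape = (2, 3)

as_csr = check_compressed(indptr, indices, shape, major="row")
as_csc = check_compressed(indptr, indices, shape, major="col")
print("read as CSR:", as_csr)
print("read as CSC:", as_csc)
```

If the arrays are only consistent under one orientation, that is a hint about how the matrix was written, which may be worth including in a bug report.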

kanefos commented 3 years ago

@doublem69 sorry, I can't answer your problem. I really bashed my head against this and just could not get it to work. I know it seems a bit inelegant, but I personally recommend using numpy to export .npz files, which you can then read into R as a matrix by calling numpy from R through reticulate. The obs/var metadata I just transfer through a CSV file.

Python

import numpy as np
# Out:
np.savez_compressed('matrix_file.npz', matrix)

# In:
npz = np.load("matrix_file.npz")
npz.files # shows files, i.e. 'arr_0'
matrix = npz['arr_0']

R

library(reticulate)
np <- import('numpy')

# Out:
np$savez("matrix_file.npz", matrix)

# In:
npz <- np$load('matrix_file.npz')
npz$files # shows files, i.e. 'arr_0'
matrix <- npz[['arr_0']]

Sorry for the unrequested suggestion, I just saw your comment about .h5ad and remembered how long I spent on this.
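For the metadata step mentioned above (passing obs/var through a CSV), a minimal sketch in Python could look like the following. It uses only the standard csv module and an in-memory buffer so that it runs on its own; the barcodes, column names and the helpers write_metadata and read_metadata are all made up for illustration. With a real AnnData object, obs is a pandas DataFrame, so adata.obs.to_csv("obs.csv") would normally do the same job in one line.

```python
import csv
import io


def write_metadata(rows, index_name, handle):
    """Write per-cell (or per-gene) metadata as CSV with the index column first."""
    fieldnames = [index_name] + [key for key in rows[0] if key != index_name]
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


def read_metadata(handle, index_name):
    """Read the CSV back as {index: {column: value}}; all values come back as strings."""
    table = {}
    for row in csv.DictReader(handle):
        key = row.pop(index_name)
        table[key] = row
    return table


# Hypothetical obs table: one row per cell, keyed by barcode.
obs = [
    {"barcode": "AAACCTG-1", "cell_type": "B cell", "n_genes": 1200},
    {"barcode": "AAACGGG-1", "cell_type": "T cell", "n_genes": 950},
]

buffer = io.StringIO()  # stands in for open("obs.csv", "w", newline="")
write_metadata(obs, "barcode", buffer)

buffer.seek(0)
loaded = read_metadata(buffer, "barcode")
print(loaded["AAACCTG-1"])
```

Keeping the barcode as the first column means the file can be read in R with read.csv("obs.csv", row.names = 1), so the rows can be matched against the matrix by cell name rather than by position.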

117999 commented 3 years ago

@mojaveazure Hello, I am trying to install SeuratDisk with the code you provided above, but it reports:

ERROR: dependency 'hdf5r' is not available for package 'SeuratDisk'

Why does this error occur, and how can I solve it? Thanks a lot.

lena-abc commented 3 years ago

Hello, I am having the following issue. When I try to read an h5Seurat file, I get the following error message:

RB_integrated <- LoadH5Seurat("RB_integrated.h5seurat")
Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding miscellaneous information for RNA
Initializing SCT with data
Adding counts for SCT
Adding scale.data for SCT
Adding miscellaneous information for SCT
Loading required package: SCTAssay
Error in .requirePackage(package) : unable to find required package ‘SCTAssay’
In addition: Warning message:
In library(package, lib.loc = lib.loc, character.only = TRUE, logical.return = TRUE, : there is no package called ‘SCTAssay’

Do you have any ideas what to do? SCTAssay is not a package so I do not know how to solve that...

zhanghao-njmu commented 2 years ago

You can also try the adata_to_srt function from the SCP package (https://github.com/zhanghao-njmu/SCP):

library(SCP)
library(reticulate)
sc <- import("scanpy")
adata <- sc$read_h5ad("pancreas.h5ad")
srt <- adata_to_srt(adata)
srt

htejedam commented 2 years ago

I still get the same error :(

Error in py_convert_pandas_df(x) : INTEGER() can only be applied to a 'integer', not a 'double'

zhanghao-njmu commented 2 years ago

@htejedam Some bugs have been fixed. You can try again.

zy-fang commented 1 year ago

Hi, I ran into another error:

IndexError: index 899108 is out of bounds for axis 0 with size 899108

I get the same error when I access the metadata with adata$obs.

levinhein commented 1 year ago

In my case, I encountered a different error. Has anyone experienced this and perhaps found a solution?

library(SeuratDisk)
Convert("DATA_aggregated_annotated.h5ad", ".h5seurat")

Warning: Unknown file type: h5ad
Warning: 'assay' not set, setting to 'RNA'
Creating h5Seurat file for version 3.1.5.9900
Adding X as scale.data
Adding raw/X as data
Adding raw/X as counts
Adding meta.features from raw/var
Adding features from scaled feature-level metadata
Adding X_pca as cell embeddings for pca
Adding X_umap as cell embeddings for umap
Adding PCs as feature loadings fpr pca

SeuratObject <- LoadH5Seurat(".h5seurat")
Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Error in match.arg(arg = layer, choices = Layers(object = object, search = FALSE)) : 'arg' should be one of “counts”, “data”, “scale.data”

liufymed commented 2 months ago

Hi @mojaveazure, when running LoadH5Seurat, I ran into another issue.

YS23Science <- LoadH5Seurat("/home/liufy/YS/23Science/ys_portal_object20221208.h5seurat")
Validating h5Seurat file
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing RNA with data
Adding counts for RNA
Adding feature-level metadata for RNA
Adding reduction umap
Adding cell embeddings for umap
Adding miscellaneous information for umap
Adding command information
Adding cell-level metadata
Error: Missing required datasets 'levels' and 'values'

The dataset is a published one, from https://developmental.cellatlas.io/yolk-sac. How can I solve this? Many thanks!

kinnaryshah commented 2 months ago

I'm also troubleshooting the error from the last message in this thread.

"Error: Missing required datasets 'levels' and 'values'"