mojaveazure / seurat-disk

Interfaces for HDF5-based Single Cell File Formats
https://mojaveazure.github.io/seurat-disk
GNU General Public License v3.0
139 stars 44 forks

Support for AnnData/H5AD files #1

Open mojaveazure opened 4 years ago

mojaveazure commented 4 years ago

Tracker for bugs in the h5Seurat/H5AD converter. Please note:

sherifgerges commented 4 years ago

Hello, thank you so much for putting this up. I am trying to install and get the following error. Any ideas as to what's wrong? Thanks so much.

✓ checking for file ‘/private/var/folders/rf/yddlf5ss53968h_zpbkfvkkx1vsnpd/T/RtmpxoZWGw/remotescc9d622d28c/mojaveazure-seurat-disk-007a931/DESCRIPTION’ ...
─ preparing ‘SeuratDisk’:
✓ checking DESCRIPTION meta-information ...
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ building ‘SeuratDisk_0.0.0.9009.tar.gz’
Warning: invalid uid value replaced by that for user 'nobody'
Warning: invalid gid value replaced by that for user 'nobody'

ohne416 commented 3 years ago

I tried to convert an H5AD file to h5Seurat using the Convert function in the seurat-disk package, but it failed with this error message: "Error: Cannot find feature names in this H5AD file". The H5AD files were downloaded from https://www.covid19cellatlas.org/. Can you help?

jlu360a commented 3 years ago

Hi,

First, thanks for developing this nice tool. It has been very helpful. I have a question here, and I am not sure if this is expected. I am trying to read an h5ad file. The source is here:

Source: https://cellxgene.cziscience.com/ DataSet: "Krasnow Lab Human Lung Cell Atlas, 10X"

The h5ad file is around 735MB. I was successful in converting it to 'h5seurat' format (with a file size around 1.04GB). When I try to load it with "LoadH5Seurat", I am kind of surprised that it takes around 30GB of memory on my Linux computer (CentOS 7). Is this expected? Here is the version info:

  • R 4.0.3
  • SeuratDisk_0.0.0.9013
  • Seurat_3.2.2

Thanks.

Isabelle-C commented 3 years ago

Hi,

Thank you for developing the tool! I was able to convert my h5ad file to h5seurat. However, when I read the h5seurat file, I get the following error:

test <- LoadH5Seurat(file = 'myfilename.h5seurat')
Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Initializing scaled with data
Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent

I tried on multiple files and they all result in the same error. Would you please let me know what went wrong? Thank you so much!

seigfried commented 3 years ago

pbmc3k <- LoadH5Seurat("NPC_All_4Labelled.h5seurat")
Validating h5Seurat file
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Adding reduction pca
Adding cell embeddings for pca
Adding feature loadings for pca
Adding miscellaneous information for pca
Adding reduction umap
Adding cell embeddings for umap
Adding miscellaneous information for umap
Adding command information
Adding cell-level metadata
Adding miscellaneous information
Error in if (!x[[i]]$dims) { : argument is of length zero

Not sure where the first warning comes from. The h5Seurat object step works fine and then fails in the second step.

hbandukw commented 3 years ago

Hello,

I was able to successfully convert my integrated assay (with SCT used for normalization) into h5ad but I am unable to read it into scanpy.

Scanpy:

adata = sc.read_h5ad(Seurat_h5ad_path)

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-43-9928f0c89d25> in <module>
----> 1 adata = sc.read_h5ad(Seurat_h5ad_path)

~/opt/anaconda3/envs/scenic_protocol/lib/python3.6/site-packages/anndata/_io/h5ad.py in read_h5ad(filename, backed, as_sparse, as_sparse_fmt, chunk_size)
    440     _clean_uns(d)  # backwards compat
    441 
--> 442     return AnnData(**d)
    443 
    444 

TypeError: __init__() got an unexpected keyword argument 'active.ident'

Output:

Creating h5Seurat file for version 3.1.5.9900
Adding counts for RNA
Adding data for RNA
No variable features found for RNA
No feature-level metadata found for RNA
Adding counts for SCT
Adding data for SCT
Adding scale.data for SCT
No variable features found for SCT
No feature-level metadata found for SCT
Writing out SCTModel.list for SCT
Adding data for integrated
Adding scale.data for integrated
Adding variable features for integrated
No feature-level metadata found for integrated
Writing out SCTModel.list for integrated
Adding cell embeddings for pca
Adding loadings for pca
No projected loadings for pca
Adding standard deviations for pca
No JackStraw data for pca
Adding cell embeddings for umap
No loadings for umap
No projected loadings for umap
No standard deviations for umap
No JackStraw data for umap
Validating h5Seurat file
Adding scale.data from integrated as X
Adding data from integrated as raw
Transfering meta.data to obs
Adding dimensional reduction information for pca
Adding feature loadings for pca
Adding dimensional reduction information for umap
Adding integrated_snn as neighbors
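
The traceback above shows `read_h5ad` handing everything it collected to `AnnData(**d)`, so a single unexpected top-level entry named `active.ident` is enough to raise this TypeError. One possible workaround, not verified against this file, is to drop top-level entries that AnnData does not expect before reading. The sketch below is only an illustration: the helper name `drop_unexpected_entries` and the allow-list are assumptions, and a plain dict stands in for the file.

```python
# Names of AnnData's data-carrying constructor parameters. Treating this set
# as the complete allow-list is an assumption of this sketch.
ANNDATA_TOP_LEVEL = frozenset(
    {"X", "obs", "var", "uns", "obsm", "varm", "layers", "raw", "obsp", "varp"}
)


def drop_unexpected_entries(store, allowed=ANNDATA_TOP_LEVEL):
    """Delete top-level entries whose names are not in `allowed`.

    `store` is any mutable mapping (a dict here). Returns the removed names.
    """
    removed = [name for name in list(store.keys()) if name not in allowed]
    for name in removed:
        del store[name]
    return removed


# Toy stand-in for the top level of the converted file.
demo = {"X": None, "obs": None, "var": None, "active.ident": None}
removed = drop_unexpected_entries(demo)
print("removed:", removed)
```

Because the helper only uses `keys()` and `del`, it should also accept an `h5py.File` opened with mode `"r+"`. The deletion happens in place, so it is safer to try it on a copy of the .h5ad file.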

lavon79 commented 3 years ago

In reply to Isabelle-C's report above (Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent):

Have you solved this issue? I am hitting the same error.

dfernandezperez commented 3 years ago

I also get the same error with any Seurat object I try to convert.

fly4all commented 3 years ago

I was able to convert an .h5ad file from this dataset into .h5seurat, but I can't seem to load the file.

Upon running

seuratObject <- LoadH5Seurat("~/Downloads/GSE161228_24h_PN_all.h5seurat")

I get the following error:

Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Adding reduction pca
Adding cell embeddings for pca
Adding feature loadings for pca
Adding miscellaneous information for pca
Adding reduction tsne
Adding cell embeddings for tsne
Adding miscellaneous information for tsne
Adding command information
Adding cell-level metadata
Error: Too many values for levels provided

Do you have any advice on resolving this?

davidroad commented 2 years ago

In reply to seigfried's report above (Error in if (!x[[i]]$dims) { : argument is of length zero):

pbmc3k <- LoadH5Seurat("pbmc3k_final.h5seurat",array = "RNA") will work

jmitchell81 commented 2 years ago

In reply to davidroad's suggestion above (pbmc3k <- LoadH5Seurat("pbmc3k_final.h5seurat",array = "RNA")):

This also worked for me, but use assays = "RNA" instead of array = "RNA".

GouQiao commented 2 years ago

Hi, I want to use the Convert function to convert h5ad to Seurat, but I get the following error:

Error in self$write_low_level(value, file_space = self_space_id, mem_space = mem_space_id, : Number of objects in robj is not the same and not a multiple of number of elements selected in file: expected are 0 but provided are 3000

Does anyone know how to solve this?

divyanshusrivastava commented 2 years ago

Hi. I am also facing issues while reading the (successfully converted) H5Seurat file. Here is the traceback:

Validating h5Seurat file
Initializing RNA with data
Error in sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][], : all(dims >= dims.min) is not TRUE

Traceback:

  1. LoadH5Seurat("temp_pdx_adata.h5seurat")
  2. LoadH5Seurat.character("temp_pdx_adata.h5seurat")
  3. LoadH5Seurat(file = hfile, assays = assays, reductions = reductions, . graphs = graphs, neighbors = neighbors, images = images, . meta.data = meta.data, commands = commands, misc = misc, . tools = tools, verbose = verbose, ...)
  4. LoadH5Seurat.h5Seurat(file = hfile, assays = assays, reductions = reductions, . graphs = graphs, neighbors = neighbors, images = images, . meta.data = meta.data, commands = commands, misc = misc, . tools = tools, verbose = verbose, ...)
  5. as.Seurat(x = file, assays = assays, reductions = reductions, . graphs = graphs, neighbors = neighbors, images = images, . meta.data = meta.data, commands = commands, misc = misc, . tools = tools, verbose = verbose, ...)
  6. as.Seurat.h5Seurat(x = file, assays = assays, reductions = reductions, . graphs = graphs, neighbors = neighbors, images = images, . meta.data = meta.data, commands = commands, misc = misc, . tools = tools, verbose = verbose, ...)
  7. AssembleAssay(assay = assay, file = x, slots = assays[[assay]], . verbose = verbose)
  8. as.matrix(x = assay.group[["data"]])
  9. as.matrix.H5Group(x = assay.group[["data"]])
  10. as.sparse(x = x, ...)
  11. as.sparse.H5Group(x = x, ...)
  12. sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][], . dims = h5attr(x = x, which = "dims"))
  13. stopifnot(all(dims >= dims.min))
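
Reading the last two frames, the failing check appears to say that the stored `dims` attribute is too small for the row indices and column pointers passed to `sparseMatrix`. The following is a minimal sketch of an equivalent consistency check, under the assumptions that the data is column-compressed, that `indices` are 0-based row indices (the traceback adds 1 before calling `sparseMatrix`), and that `dims` is (rows, columns). The function name `check_csc_dims` is hypothetical and the arrays are toy values, not data from this file.

```python
def check_csc_dims(indices, indptr, dims):
    """Return a list of problems; an empty list means dims can hold the data.

    indices: 0-based row index of every stored value
    indptr:  column pointers (length = number of columns + 1)
    dims:    (n_rows, n_cols) as stored alongside the matrix
    """
    n_rows, n_cols = dims
    needed_rows = max(indices) + 1 if len(indices) > 0 else 0
    needed_cols = len(indptr) - 1
    problems = []
    if n_rows < needed_rows:
        problems.append(f"dims has {n_rows} rows but indices need {needed_rows}")
    if n_cols < needed_cols:
        problems.append(f"dims has {n_cols} columns but indptr needs {needed_cols}")
    return problems


# Toy 3 x 2 matrix with three stored values.
indices = [0, 2, 1]
indptr = [0, 2, 3]
ok = check_csc_dims(indices, indptr, dims=(3, 2))
swapped = check_csc_dims(indices, indptr, dims=(2, 3))
print(ok)
print(swapped)
```

Running the same comparison on the `indices`, `indptr` and `dims` stored in the assay's `data` group of the .h5seurat file might show which dimension is too small, for example when the two entries of `dims` are in the opposite order from what the loader expects.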

giorgiatosoni commented 2 years ago

In reply to Isabelle-C's report and lavon79's follow-up above (Error in dimnames(x) <- dn : length of 'dimnames' [1] not equal to array extent):

Hi, did someone solve this issue?

ishwarvh commented 2 years ago

In reply to hbandukw's report above (TypeError: __init__() got an unexpected keyword argument 'active.ident'). The quoted report lists these versions:

  • R version 4.0.2 (2020-06-22)
  • Seurat_4.0.0
  • SeuratDisk_0.0.0.9018

Hello, were you able to figure out the issue here?

peralesvilchezl commented 1 year ago

In reply to jlu360a's report above (LoadH5Seurat taking around 30GB of memory for a 1.04GB h5seurat file, with R 4.0.3, SeuratDisk_0.0.0.9013 and Seurat_3.2.2):

Hey, I have the same problem!

xiao-kong-long commented 1 year ago

Hi, I would like to know how to handle the spatial information of 10x Visium data. Here is my code:

In R :

seurat.object = Load10X_Spatial(data.dir = h5.dir, filename = filename)
SaveH5Seurat(seurat.object, filename = data.h5seurat.url)
Convert(data.h5seurat.url, dest = 'h5ad')

In Python :

adata = sc.read(input_dir + '/test.h5ad')

The converted adata loses a lot of information, especially the spatial positions and the image. I don't know of any solution to this.

DanielMedic commented 1 year ago

In reply to seigfried's report above (Error in if (!x[[i]]$dims) { : argument is of length zero):

I'm having the exact same error.

maxjcarlino commented 1 year ago

Hello, Thank you for developing this tool! It is extremely valuable to my research in working with multiple collaborators. I was able to successfully convert an H5 Seurat object to an H5ad object, however for some reason I only obtain the top 2000 variable features in my converted object. The H5 Seurat object still contains all features, so it seems I am missing something in the Convert function. Could you help me figure out what I am missing to export all features instead of only the variable features?

LoadH5Seurat(paste(dataDir, "ssEpcam.h5Seurat",sep = ""))
Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Adding variable feature information for RNA
Adding miscellaneous information for RNA
Initializing prediction.score.State with data
Adding counts for prediction.score.State
Adding miscellaneous information for prediction.score.State
Warning: Keys should be one or more alphanumeric characters followed by an underscore, setting key from prediction.score.State to predictionscoreState
Adding reduction pca
Adding cell embeddings for pca
Adding feature loadings for pca
Adding miscellaneous information for pca
Adding reduction umap
Adding cell embeddings for umap
Adding miscellaneous information for umap
Adding graph RNA_nn
Adding graph RNA_snn
Adding command information
Adding cell-level metadata
Adding miscellaneous information
Adding tool-specific results
An object of class Seurat
24448 features across 12089 samples within 2 assays
Active assay: RNA (24393 features, 2000 variable features)
 1 other assay present: prediction.score.State
 2 dimensional reductions calculated: pca, umap

Convert(paste(dataDir, "ssEpcam.h5Seurat",sep = ""), dest = "h5ad", assay = "RNA", overwrite=TRUE)
Validating h5Seurat file
Adding scale.data from RNA as X
Transfering meta.features to var
Adding data from RNA as raw
Transfering meta.features to raw/var
Transfering meta.data to obs
Adding dimensional reduction information for pca
Adding feature loadings for pca
Adding dimensional reduction information for umap
Adding RNA_snn as neighbors

And when I load the h5ad into scanpy I get:

anndata = scanpy.read_h5ad(os.path.join(chdir, 'data/ssEpcam.h5ad'))
anndata
AnnData object with n_obs × n_vars = 12089 × 2000
    obs: 'orig.ident', 'nCount_RNA', 'nFeature_RNA', 'type', 'stage', 'embryo', 'percent.mt', 'RNA_snn_res.0.8', 'seurat_clusters', 'RNA_snn_res.0.5', 'RNA_snn_res.0.25', 'BC', 'sex', 'S.Score', 'G2M.Score', 'Phase', 'Kernel', 'predicted.State.score', 'State', 'CellCluster', 'nCount_prediction.score.State', 'nFeature_prediction.score.State', 'RNA_snn_res.0.2'
    var: 'vst.mean', 'vst.variance', 'vst.variance.expected', 'vst.variance.standardized', 'vst.variable'
    uns: 'neighbors'
    obsm: 'X_pca', 'X_umap'
    varm: 'PCs'
    obsp: 'distances'

Thank you in advance for your help!!

1098255342 commented 1 year ago

Have you solved this issue? I get the same error.

maxjcarlino commented 1 year ago

In reply to 1098255342's question above (have you solved this issue?):

Yes. The Convert function was pulling the scaled data (scale.data), and we could not find any argument to Convert that changes which slot it pulls from, so it always exported only the 2000 scaled variable genes:

Adding scale.data from RNA as X
Transfering meta.features to var
Adding data from RNA as raw

The solution that worked was to rescale the object in R, which rescales all genes:

data <- ScaleData(data, features = rownames(data))

Once I did that then saved as H5 and converted, it exported the full gene list
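
The Convert log quoted above writes scale.data as X and data as raw, which suggests the genes missing from X may still be present in raw on the Python side. A small sketch for checking that is shown here. The helper name `features_missing_from_x` is hypothetical and the gene names are toy values rather than names from this dataset.

```python
def features_missing_from_x(x_names, raw_names):
    """Return names present in raw but absent from X, in raw order."""
    present = set(x_names)
    return [name for name in raw_names if name not in present]


# Toy stand-ins for the feature names of X and raw.
x_names = ["GeneA", "GeneC"]
raw_names = ["GeneA", "GeneB", "GeneC", "GeneD"]
missing = features_missing_from_x(x_names, raw_names)
print(len(missing), "features only in raw:", missing)
```

With anndata, the two inputs could be `adata.var_names` and `adata.raw.var_names`. If raw does hold the full set, `adata.raw.to_adata()` is another way to get an AnnData over all raw features without rescaling in R, assuming the normalized (unscaled) values stored in raw are what the downstream analysis needs.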

DM0815 commented 1 year ago

In reply to maxjcarlino's answer above:

How do I change your code 'data <- ScaleData(data, features = rownames(data))' if my Seurat object is named s?

1098255342 commented 1 year ago

Sorry, I didn't solve this issue.

maxjcarlino commented 1 year ago

In reply to DM0815's question above:

Just replace "data" with your Seurat object's name as labeled when you load it into the R workspace. For an object named s, that is: s <- ScaleData(s, features = rownames(s))

Tingtingyang1234 commented 9 months ago

Hello, when I convert my h5ad file to h5seurat, I get this error:

Warning: Unknown file type: h5ad
Creating h5Seurat file for version 3.1.5.9900
Adding X as scale.data
Adding raw/X as data
Adding raw/X as counts
Adding meta.features from raw/var
Merging dispersions from scaled feature-level metadata
Merging dispersions_norm from scaled feature-level metadata
Merging feature_types from scaled feature-level metadata
Error in source[["var"]][[mf]]$read() : attempt to apply non-function

Can you help me?

karlie002 commented 9 months ago

Hi, when I tried to use the LoadSeurat function for a .h5ad object, I got error messages like:

Validating h5Seurat file
Initializing RNA with data
Error in sparseMatrix(i = x[["indices"]][] + 1, p = x[["indptr"]][], x = x[["data"]][], : 'dims' must contain all (i,j) pairs

Has anyone run into this problem? Any suggestions will be appreciated!

madeofrats commented 7 months ago

Same here.