mojaveazure / seurat-disk

Interfaces for HDF5-based Single Cell File Formats
https://mojaveazure.github.io/seurat-disk
GNU General Public License v3.0
140 stars 44 forks

Error: Missing required datasets 'levels' and 'values' #109

Open Diennguyen8290 opened 2 years ago

Diennguyen8290 commented 2 years ago

Hi,

Thanks for developing this great tool.

I'm running into an error at the LoadH5Seurat() step, which states: Error: Missing required datasets 'levels' and 'values'.

My data was downloaded from here: https://drive.google.com/file/d/1IwWcn4W-YKgNbz4DpNweM2cKxlx1hbM0/view

My scripts:

Convert("COVID19_ALL.h5ad", dest = "h5seurat", overwrite = TRUE, verbose = TRUE)
data <- LoadH5Seurat("COVID19_ALL.h5seurat")

Could you please have a look?

Many thanks.

Regards, Dien

JBreunig commented 2 years ago

Same issue here even with a pretty threadbare adata object.

pbmc <- LoadH5Seurat("/NatMmergeHarmonyRT3.h5seurat")
Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding feature-level metadata for RNA
Adding command information
Adding cell-level metadata
Error: Missing required datasets 'levels' and 'values'
JBreunig commented 2 years ago

Thinking this may be due to recent updates to anndata as my postdoc is able to convert his adata objects back and forth without issue but couldn't convert my h5ad file.

clc37 commented 2 years ago

Same issue, please help! What can we do but wait?

edtim8 commented 2 years ago

same issue.............

kmh005 commented 2 years ago

I'm having the same issue converting an anndata h5ad that came from version 0.7.8. @JBreunig what version of anndata is your postdoc using?

rockdeme commented 2 years ago

Same issue here (anndata 0.8.0, scanpy 1.9.1). If I download the h5ad file used in vignettes/convert-anndata.Rmd it works. If I read it with scanpy and write it back to h5ad I get the same error.

To reproduce:

This works fine (R):

library(Seurat)
library(SeuratDisk)

url <- "https://seurat.nygenome.org/pbmc3k_final.h5ad"
curl::curl_download(url, basename(url))

Convert("pbmc3k_final.h5ad", dest = "h5seurat", overwrite = TRUE)
pbmc3k <- LoadH5Seurat("pbmc3k_final.h5seurat")

If I read the same file with scanpy (Python):

in[1]: import scanpy as sc

in[2]: cells = sc.read_h5ad('pbmc3k_final.h5ad')

C:\Users\me\anaconda3\envs\BearOmics\lib\site-packages\anndata\compat\__init__.py:232: FutureWarning: Moving element from .uns['neighbors']['distances'] to .obsp['distances'].
This is where adjacency matrices should go now.
  warn(
C:\Users\me\anaconda3\envs\BearOmics\lib\site-packages\anndata\compat\__init__.py:232: FutureWarning: Moving element from .uns['neighbors']['connectivities'] to .obsp['connectivities'].
This is where adjacency matrices should go now.
  warn(

in[3]: cells.write('from_p.h5ad')

Then, in R:

> Convert("from_p.h5ad", dest = "h5seurat", overwrite = TRUE)

Warning: Unknown file type: h5ad
Warning: 'assay' not set, setting to 'RNA'
Creating h5Seurat file for version 3.1.5.9900
Adding X as scale.data
Adding raw/X as data
Adding raw/X as counts
Adding meta.features from raw/var
Adding dispersions from scaled feature-level metadata
Adding dispersions_norm from scaled feature-level metadata
Merging gene_ids from scaled feature-level metadata
Adding highly_variable from scaled feature-level metadata
Adding means from scaled feature-level metadata
Merging n_cells from scaled feature-level metadata
Adding X_pca as cell embeddings for pca
Adding X_umap as cell embeddings for umap
Adding PCs as feature loadings fpr pca
Adding miscellaneous information for pca
Adding standard deviations for pca
Adding miscellaneous information for umap
Adding leiden to miscellaneous data

> pbmc3k <- LoadH5Seurat("from_p.h5seurat")

Validating h5Seurat file
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing RNA with data
Adding counts for RNA
Adding scale.data for RNA
Adding feature-level metadata for RNA
Adding reduction pca
Adding cell embeddings for pca
Adding feature loadings for pca
Adding miscellaneous information for pca
Adding reduction umap
Adding cell embeddings for umap
Adding miscellaneous information for umap
Adding command information
Adding cell-level metadata

Error: Missing required datasets 'levels' and 'values'
JBreunig commented 2 years ago

I'm having the same issue converting an anndata h5ad that came from version 0.7.8. @JBreunig what version of anndata is your postdoc using?

0.7.6 and 1.8.1

tomthun commented 2 years ago

I also get the Error: Missing required datasets 'levels' and 'values' when adding cell-level metadata whilst using LoadH5Seurat(). I am using the latest version of SeuratDisk_0.0.0.9020 and anndata 0.8.0.

Did anyone manage to solve this issue?

michaeleekk commented 2 years ago

I got the same issue. I saw everyone mentioning the anndata version, so I tried 0.7.5 and there seems to be no issue for now.

gleb-gavrish commented 2 years ago

Ok, I had the same issue, but managed to load the file. My solution was just adding two "FALSE" to some flags:

my_obj <- LoadH5Seurat("my_obj.h5seurat", meta.data = FALSE, misc = FALSE)

The initial error was this:

Validating h5Seurat file
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing RNA with data
Adding counts for RNA
Adding feature-level metadata for RNA
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing combat_corrected with data
Adding counts for combat_corrected
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing exon_reads with data
Adding counts for exon_reads
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing exon_umis with data
Adding counts for exon_umis
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing intron_reads with data
Adding counts for intron_reads
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing intron_umis with data
Adding counts for intron_umis
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing starcounts with data
Adding counts for starcounts
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing umi_merge with data
Adding counts for umi_merge
Adding reduction pca
Adding cell embeddings for pca
Adding feature loadings for pca
Adding miscellaneous information for pca
Adding reduction umap
Adding cell embeddings for umap
Adding miscellaneous information for umap
Adding command information
Adding cell-level metadata
Error: Missing required datasets 'levels' and 'values'

When I set meta.data to FALSE, the error "Missing required datasets 'levels' and 'values'" was "solved", but another appeared: "Error in if (!x[[i]]$dims) { : argument is of length zero". That one was solved by setting misc to FALSE. Honestly, I don't know what it was about, but at least I got my object. Hope this helps someone :)

--Update--

OK, obviously when you set meta.data to FALSE, some metadata will not load. But I figured out that it may be only the "pure" metadata stored in "obs". So here is an addition for loading it with a different package:

library(rhdf5)
my_obj[["mised_meta_value"]] <- h5read("my_obj.h5ad", "/obs/mised_meta_value")

And if you want to see the structure of the h5ad file (with R), you can use ls-like command, also from this package:

h5ls("my_obj.h5ad")

So it seems that from this point only the "misc" information is missing, but in my case there is nothing stored there.

tomthun commented 2 years ago

Alternatively, you can use zellkonverter to read in your anndata as a SingleCellExperiment with

ad <- readH5AD("path_to/example_h5ad")

Then you can use Seurat's as.Seurat() function to convert your object to Seurat. I also had to set the counts and data parameters to fit my data instead of using their defaults. E.g. I had to specify

adata_Seurat <- as.Seurat(ad, counts = "X", data = NULL)

You can find the name of your counts by printing ad and looking under assays. If you have logcounts you need to do the same for data and reference the correct name. I hope that helps! :)

naity2 commented 2 years ago

Same issue, but it looks like the package is not maintained anymore.

bclopesrs commented 2 years ago

I have been struggling with this for months. That pisses me off. Could anyone troubleshoot it? It's unbelievable that I'm still stuck on it.

rockdeme commented 2 years ago

I have been struggling with this for months. That pisses me off. Could anyone troubleshoot it? It's unbelievable that I'm still stuck on it.

As an alternative, you can use the anndata package for R and build the Seurat object from that.

AmirAlavi commented 2 years ago

Worked around this with @rockdeme's suggestion. Just use anndata, you'll have to get reticulate and other dependencies but it's worth it.

Example:

library("Seurat")
library("anndata")
print("Convert from Scanpy to Seurat...")
data <- read_h5ad("example.hd5ad")
data <- CreateSeuratObject(counts = t(data$X), meta.data = data$obs)
print(str(data))
bclopesrs commented 2 years ago

Thank you guys for your support but...

Still getting the same error: "Missing required datasets 'levels' and 'values'"

Here is my code:

Convert("full_integrated_test.h5ad", dest = "h5seurat", overwrite = TRUE, verbose = T)
test01 <- LoadH5Seurat("full_integrated_test.h5seurat", array = 'RNA')
test01

bettega = LoadH5Seurat("full_integrated_test.h5seurat")
sce = as.SingleCellExperiment(bettega)
scConf = createConfig(sce)
makeShinyApp(sce, scConf, gene.mapping = TRUE, shiny.title = "ShinyCell Quick Start")

YY-SONG0718 commented 2 years ago

I got the same issue. I saw everyone mentioning the anndata version, so I tried 0.7.5 and there seems to be no issue for now.

Same issue here, using anndata 0.7.5 in python 3.9.13 resolved the issue.

gt7901b commented 1 year ago

I tried SeuratDisk and sceasy; both gave me error messages.

@tomthun's method worked for me.

Using zellkonverter to read in the anndata as a SingleCellExperiment and then converting the SCE to Seurat worked.

library(zellkonverter)
sce1 = readH5AD("my.h5ad", verbose = TRUE)
adata_Seurat <- as.Seurat(sce1, counts = "X", data = NULL)

niehu2018 commented 1 year ago

Try this: in scanpy, run

del adata.var
del adata.obs

then save the h5ad.

In R, use Convert to convert the h5ad to Seurat.

This method worked for me.

You can then add the metadata in R.

ghar1821 commented 1 year ago

I'm having the same issue converting an anndata h5ad that came from version 0.7.8. @JBreunig what version of anndata is your postdoc using?

0.7.6 and 1.8.1

Had the same problem, and only this solution worked (downgrade anndata to 0.7.6 and rewrite the h5ad file). Actually, I had to transpose the object twice before re-writing to get an h5ad file that Seurat likes, following the solution from this thread: https://github.com/satijalab/seurat/issues/1689

adata.T.T.write_h5ad("test.h5ad")

pinunQ commented 1 year ago

https://github.com/mojaveazure/seurat-disk/issues/109#issuecomment-1137812860

Hey,

So I had the same problem as you had and I was able to fix in a different way than downgrading anndata or changing package.

I did it like this,

Python -

Write the anndata metadata to CSV format

peri.write_csvs("peri", skip_data=False)

Write the anndata to h5ad

peri.write_h5ad("~/peri.h5ad")

R (Seurat)

Convert h5ad to h5seurat

Convert("~/Data/scRNA/merged/peri.h5ad", dest="h5seurat", overwrite = TRUE)

peri_meta <- read.csv("~/peri/obs.csv")

Set the metadata row names to the cell barcodes

rownames(peri_meta) <- peri_meta$X

Select all columns from the second to the last, i.e. everything except the cell barcodes

peri_meta <- peri_meta[,c(2:146)]

then... load Seurat object with previous suggestion by @gleb-gavrish

peri = LoadH5Seurat("~/peri.h5seurat", meta.data = FALSE, misc = FALSE)

This still does not work

peri[["Condition"]] <- h5read("~/peri.h5ad", "/obs/Condition")

Finally add the metadata to the object

peri <- AddMetaData(
  object = peri,
  metadata = peri_meta)

Cheers, Sonik

JZL commented 1 year ago

I also had trouble like this. I recommend downloading the GitHub repository, adding browser() calls where the error comes from so you can debug in place, and using devtools::load_all to reload your modifications and see whether they work.

For me, the main cause of this error was scanpy categories. I don't understand why some categorical variables use categories+codes and others use levels+values as the HDF5 dataset names, but it was fixable. Additionally, paga networks weren't supported, and since I didn't need them I disabled that part:

diff --git a/R/ReadH5.R b/R/ReadH5.R
index 4c169de..6075020 100644
--- a/R/ReadH5.R
+++ b/R/ReadH5.R
@@ -145,6 +145,14 @@ setMethod(
   f = 'as.factor',
   signature = c('x' = 'H5Group'),
   definition = function(x) {
+    if (x$exists(name = 'categories') && x$exists(name = 'codes')) {
+      # stop("Missing required datasets 'levels' and 'codes'", call. = FALSE)
+      ret = as.factor(x[["codes"]][])
+      levels(ret) = x[["categories"]][]
+      print(length(ret))
+      return(ret)
+      # arguments imply differing number of rows: 770951, 5965
+    }
     if (!x$exists(name = 'levels') || !x$exists(name = 'values')) {
       stop("Missing required datasets 'levels' and 'values'", call. = FALSE)
     }
@@ -245,7 +253,11 @@ setMethod(
         } else if (IsMatrix(x = x[[i]])) {
           as.matrix(x = x[[i]], ...)
         } else {
-          as.list(x = x[[i]], ...)
+          if(i == "paga"){
+            list("nada")
+          }else{
+            as.list(x = x[[i]], ...)
+          }
         }
       }
     }
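
For readers unfamiliar with the two naming schemes mentioned above, here is a minimal sketch in plain Python of how a categories+codes pair relates to a levels+values pair. It assumes, as the thread suggests, that codes are 0-based positions into categories with -1 marking a missing value, and that values are 1-based positions into levels as in an R factor. The function names are hypothetical and this is not SeuratDisk or anndata code.

```python
def decode_categorical(codes, categories):
    """Map 0-based integer codes to their category labels.

    A code of -1 is treated as a missing value and becomes None.
    """
    return [categories[c] if c >= 0 else None for c in codes]


def to_levels_values(codes, categories):
    """Re-express a codes/categories pair as levels plus 1-based values.

    Missing entries (-1) are returned as None rather than shifted to 0.
    """
    values = [c + 1 if c >= 0 else None for c in codes]
    return list(categories), values


# Toy data (made up for illustration)
codes = [0, 2, 1, -1, 0]
categories = ["B cell", "NK", "T cell"]

labels = decode_categorical(codes, categories)
levels, values = to_levels_values(codes, categories)
print(labels)
print(values)
```

The explicit handling of -1 matters because a plain "add one to every code" shift would turn a missing value into index 0, which is not a valid 1-based position.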
cchd0001 commented 1 year ago

I found that any string column in the obs DataFrame causes this issue. I dropped all string columns and re-added them in Seurat using R.

denvercal1234GitHub commented 1 year ago

@UboCA @rockdeme and @AmirAlavi -- the t in the CreateSeuratObject()? This step threw an error saying my adata$X is not a matrix. Do I need to convert the dgRMatrix to matrix?

Thank you

adata_Seurat <- CreateSeuratObject(counts = t(adata$X), meta.data = cadata$obs)

Error in t.default(count302_Ton230240246_CD8R5posneg_chrMTGTF_concat_adata$X) : 
  argument is not a matrix
denvercal1234GitHub commented 1 year ago

@ghar1821 - Did you mean that in Python you downgraded anndata to 0.7.6 and then ran adata.T.T.write_h5ad before using Convert and LoadH5Seurat in R?

I did that, and in R, when I LoadH5Seurat, it said "Warning: Invalid name supplied, making object name syntactically valid. New object name is ClustersX_XX_Ybatch; see ?make.names for more details on syntax validityAdding miscellaneous information. Adding tool-specific results." Do you know if this is normal?

BTW, just doing adata.write_h5ad appeared to produce the same result as with T.T.

> adata_Seurat <- LoadH5Seurat("........_Objects/concat_adata.h5seurat")

Validating h5Seurat file
Initializing RNA with data
Adding counts for RNA
Adding feature-level metadata for RNA
Initializing ambiguous with data
Adding counts for ambiguous
Initializing matrix with data
Adding counts for matrix
Initializing spliced with data
Adding counts for spliced
Initializing unspliced with data
Adding counts for unspliced
Adding command information
Adding cell-level metadata
Warning: Invalid name supplied, making object name syntactically valid. New object name is ClustersX_XX_Ybatch; see ?make.names for more details on syntax validityAdding miscellaneous information
Adding tool-specific results

I'm having the same issue converting an anndata h5ad that came from version 0.7.8. @JBreunig what version of anndata is your postdoc using?

0.7.6 and 1.8.1

Had the same problem, and only this solution works (downgrade anndata to 0.7.6 and rewrite the h5ad file). Actually, I have had to do 2 lots of transformation before re-writing to get h5ad file that seurat likes - following the solution for this thread: satijalab/seurat#1689

adata.T.T.write_h5ad("test.h5ad")

denvercal1234GitHub commented 1 year ago

hi @pinunQ - peri = LoadH5Seurat("~/peri.h5seurat", meta.data = FALSE, misc = FALSE) still gave the same error. Do you know why?

shuailinli commented 1 year ago

Ok, I had the same issue, but managed to load the file. My solution was just adding two "FALSE" to some flags:

my_obj <- LoadH5Seurat("my_obj.h5seurat", meta.data = FALSE, misc = FALSE)

The initial error was this:

Validating h5Seurat file
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing RNA with data
Adding counts for RNA
Adding feature-level metadata for RNA
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing combat_corrected with data
Adding counts for combat_corrected
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing exon_reads with data
Adding counts for exon_reads
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing exon_umis with data
Adding counts for exon_umis
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing intron_reads with data
Adding counts for intron_reads
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing intron_umis with data
Adding counts for intron_umis
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing starcounts with data
Adding counts for starcounts
Warning: Feature names cannot have underscores ('_'), replacing with dashes ('-')
Initializing umi_merge with data
Adding counts for umi_merge
Adding reduction pca
Adding cell embeddings for pca
Adding feature loadings for pca
Adding miscellaneous information for pca
Adding reduction umap
Adding cell embeddings for umap
Adding miscellaneous information for umap
Adding command information
Adding cell-level metadata
Error: Missing required datasets 'levels' and 'values'

When I set meta.data to FALSE, the error "Missing required datasets 'levels' and 'values'" was "solved", but another appeared: "Error in if (!x[[i]]$dims) { : argument is of length zero". That one was solved by setting misc to FALSE. Honestly, I don't know what it was about, but at least I got my object. Hope this helps someone :)

--Update--

OK, obviously when you set meta.data to FALSE, some metadata will not load. But I figured out that it may be only the "pure" metadata stored in "obs". So here is an addition for loading it with a different package:

library(rhdf5)
my_obj[["mised_meta_value"]] <- h5read("my_obj.h5ad", "/obs/mised_meta_value")

And if you want to see the structure of the h5ad file (with R), you can use ls-like command, also from this package:

h5ls("my_obj.h5ad")

So it seems that from this point only the "misc" information is missing, but in my case there is nothing stored there.

Hi, thanks for your solution. I think this is an inherent incompatibility between Seurat and anndata. When anndata writes an h5ad file using write_h5ad, it converts all string variables into categorical variables, which are processed in Seurat as factors. LoadH5Seurat calls an internal as.Seurat function in R, which in turn calls as.factor. as.factor expects to work on a vector, but as.Seurat passes it an HDF5 group instead, which causes the problem. I think the team should fix this.
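
To make the string-to-categorical step described above concrete, here is a small plain-Python sketch of that kind of encoding. It assumes the categories are the sorted unique values and that missing entries get a code of -1; the function name is hypothetical and this only illustrates the idea, it is not the code anndata actually runs.

```python
def encode_strings(values):
    """Encode a list of strings as (categories, codes).

    Categories are the sorted unique non-missing values; codes are
    0-based positions into that list, with -1 for missing (None).
    """
    categories = sorted({v for v in values if v is not None})
    index = {cat: i for i, cat in enumerate(categories)}
    codes = [index[v] if v is not None else -1 for v in values]
    return categories, codes


# Toy obs column (made up for illustration)
column = ["T cell", "B cell", None, "T cell", "NK"]
categories, codes = encode_strings(column)
print(categories)
print(codes)
```

After this step the original strings no longer exist as a flat vector: a reader has to combine the two arrays again, which is why a loader that expects differently named arrays cannot rebuild the column.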

st-tky commented 1 year ago

For someone's information, installing anndata==0.7.5 worked for me, as mentioned before.

bobermayer commented 1 year ago

I was able to use @gleb-gavrish's solution and then add meta data like so (accounting for NA's in some of my categorical obs columns):

library(rhdf5)
sobj <- LoadH5Seurat(file="my_object.h5seurat",  meta.data = FALSE, misc = FALSE)
obs <- h5read("my_object.h5seurat", "/meta.data")

meta <- data.frame(lapply(names(obs), function(x) { 
  if (length(obs[[x]])==2) 
    obs[[x]][['categories']][ifelse(obs[[x]][['codes']] >= 0, obs[[x]][['codes']] + 1, NA)]
  else 
    as.numeric(obs[[x]])
}
), row.names=Cells(sobj))
colnames(meta) <- names(obs)

sobj <- AddMetaData(sobj,meta)
george-hall-ucl commented 1 year ago

Worked around this with @rockdeme's suggestion. Just use anndata, you'll have to get reticulate and other dependencies but it's worth it.

Example:

library("Seurat")
library("anndata")
print("Convert from Scanpy to Seurat...")
data <- read_h5ad("example.hd5ad")
data <- CreateSeuratObject(counts = t(data$X), meta.data = data$obs)
print(str(data))

This didn't work for me (anndata v0.7.5.2, Seurat v4.1.0, R v4.0.3), but the solution of using zellkonverter -- suggested by @tomthun earlier in this thread (https://github.com/mojaveazure/seurat-disk/issues/109#issuecomment-1115959604) -- does the job nicely.

cchd0001 commented 1 year ago
library("Seurat")
library("anndata")
print("Convert from Scanpy to Seurat...")
data <- read_h5ad("example.hd5ad")
data <- CreateSeuratObject(counts = t(data$X), meta.data = data$obs)
print(str(data))

Thanks, this works for me. For h5ad with a sparse matrix, I modify this line: data <- CreateSeuratObject(counts = t(as.matrix(data$X)), meta.data = data$obs)

Huahuatii commented 1 year ago
library("Seurat")
library("anndata")
print("Convert from Scanpy to Seurat...")
data <- read_h5ad("example.hd5ad")
data <- CreateSeuratObject(counts = t(data$X), meta.data = data$obs)
print(str(data))

Thanks, this works for me. For h5ad with a sparse matrix, I modify this line: data <- CreateSeuratObject(counts = t(as.matrix(data$X)), meta.data = data$obs)

I struggled with this problem for several days and kept seeing the error: Error in t.default(data$X): argument is not a matrix. THANK YOU FOR as.matrix

mxposed commented 10 months ago

Here's the code to fix h5seurat object after Convert

library(hdf5r)

# Open the converted file in read/write mode
f <- H5File$new("<YOUR_FILE>.h5seurat", "r+")
groups <- f$ls(recursive = TRUE)

# Copy every 'categories' dataset to a sibling named 'levels'
for (name in groups$name[grepl("categories", groups$name)]) {
  names <- strsplit(name, "/")[[1]]
  names <- c(head(names, -1), "levels")
  new_name <- paste(names, collapse = "/")
  f[[new_name]] <- f[[name]]
}

# Copy every 'codes' dataset to a sibling named 'values',
# then add 1 so the 0-based codes become 1-based indices
for (name in groups$name[grepl("codes", groups$name)]) {
  names <- strsplit(name, "/")[[1]]
  names <- c(head(names, -1), "values")
  new_name <- paste(names, collapse = "/")
  f[[new_name]] <- f[[name]]
  grp <- f[[new_name]]
  grp$write(args = list(1:grp$dims), value = grp$read() + 1)
}

f$close_all()
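
The renaming part of this fix can be sketched separately from the HDF5 I/O. The plain-Python snippet below only computes which dataset paths would get a renamed sibling copy; the example paths and the function name are hypothetical, and nothing here touches a real file.

```python
import posixpath

# Assumed mapping from the anndata-style dataset names to the names
# that the error message says are required.
RENAMES = {"categories": "levels", "codes": "values"}


def sibling_targets(paths):
    """Return {old_path: new_path} for datasets that need a renamed copy."""
    targets = {}
    for path in paths:
        parent, leaf = posixpath.split(path)
        if leaf in RENAMES:
            targets[path] = posixpath.join(parent, RENAMES[leaf])
    return targets


# Made-up paths resembling an h5seurat metadata layout
paths = [
    "meta.data/louvain/categories",
    "meta.data/louvain/codes",
    "meta.data/n_genes",
]
targets = sibling_targets(paths)
for old, new in targets.items():
    print(old, "->", new)
```

One design difference from the R loop: this sketch matches only when the last path component is exactly categories or codes, whereas a substring match would also pick up any path that merely contains those words.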
huidongchen commented 10 months ago

@UboCA @rockdeme and @AmirAlavi -- the t in the CreateSeuratObject()? This step threw an error saying my adata$X is not a matrix. Do I need to convert the dgRMatrix to matrix?

Thank you

adata_Seurat <- CreateSeuratObject(counts = t(adata$X), meta.data = cadata$obs)

Error in t.default(count302_Ton230240246_CD8R5posneg_chrMTGTF_concat_adata$X) : 
  argument is not a matrix

You simply need to add library(Matrix)

This works for me.

library(Seurat)
library(anndata)
library(Matrix)
print("Convert from Scanpy to Seurat...")
data <- read_h5ad("example.hd5ad")
data <- CreateSeuratObject(counts = t(data$X), meta.data = data$obs)
print(str(data))
teryyoung commented 9 months ago

Here's the code to fix h5seurat object after Convert

f <- H5File$new("<YOUR_FILE>.h5seurat", "r+")
groups <- f$ls(recursive = TRUE)

for (name in groups$name[grepl("categories", groups$name)]) {
  names <- strsplit(name, "/")[[1]]
  names <- c(names[1:length(names) - 1], "levels")
  new_name <- paste(names, collapse = "/")
  f[[new_name]] <- f[[name]]
}

for (name in groups$name[grepl("codes", groups$name)]) {
  names <- strsplit(name, "/")[[1]]
  names <- c(names[1:length(names) - 1], "values")
  new_name <- paste(names, collapse = "/")
  f[[new_name]] <- f[[name]]
  grp <- f[[new_name]]
  grp$write(args = list(1:grp$dims), value = grp$read() + 1)
}

f$close_all()

@mxposed could you please tell me where the function H5File comes from? When I run your code, I get this: Error in eval(expr, envir, enclos): object 'H5File' not found

mxposed commented 9 months ago

@mxposed could you please tell me where the function H5File comes from? When I run your code, I get this: Error in eval(expr, envir, enclos): object 'H5File' not found

@teryyoung this is from package hdf5r, so it's library(hdf5r)

JessicaKLL commented 8 months ago

Here's the code to fix h5seurat object after Convert

library(hdf5r)

f <- H5File$new("<YOUR_FILE>.h5seurat", "r+")
groups <- f$ls(recursive = TRUE)

for (name in groups$name[grepl("categories", groups$name)]) {
  names <- strsplit(name, "/")[[1]]
  names <- c(names[1:length(names) - 1], "levels")
  new_name <- paste(names, collapse = "/")
  f[[new_name]] <- f[[name]]
}

for (name in groups$name[grepl("codes", groups$name)]) {
  names <- strsplit(name, "/")[[1]]
  names <- c(names[1:length(names) - 1], "values")
  new_name <- paste(names, collapse = "/")
  f[[new_name]] <- f[[name]]
  grp <- f[[new_name]]
  grp$write(args = list(1:grp$dims), value = grp$read() + 1)
}

f$close_all()

Hi, thanks for the code but now I have this error...

Error: 'levels' must be a one-dimensional string dataset

mxposed commented 8 months ago

@JessicaKLL

Error: 'levels' must be a one-dimensional string dataset

When you're saving the h5ad, please make sure your categorical fields are categories over strings, not over numbers.
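
As a rough illustration of that check, the plain-Python sketch below finds categorical columns whose categories are not all strings and converts them. The columns dictionary and both function names are hypothetical stand-ins for whatever holds your obs categories; this is not anndata code.

```python
def non_string_categoricals(columns):
    """Return names of columns whose categories are not all strings.

    `columns` maps a column name to its list of categories.
    """
    return [
        name
        for name, categories in columns.items()
        if not all(isinstance(c, str) for c in categories)
    ]


def stringify(categories):
    """Convert every category to its string form, keeping the order."""
    return [str(c) for c in categories]


# Made-up example: 'batch' has numeric categories, the others are strings
columns = {
    "louvain": ["0", "1", "2"],
    "batch": [1, 2, 3],
    "cell_type": ["B cell", "T cell"],
}
bad = non_string_categoricals(columns)
fixed = {name: stringify(columns[name]) for name in bad}
print(bad)
print(fixed)
```

On a real AnnData object the equivalent step would presumably be done with pandas before writing, for example adata.obs["batch"] = adata.obs["batch"].astype(str).astype("category"); that line is untested here and the column name is only an example.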

dennishamrick commented 3 months ago

Worked around this with @rockdeme's suggestion. Just use anndata, you'll have to get reticulate and other dependencies but it's worth it.

Example:


library("Seurat")
library("anndata")
print("Convert from Scanpy to Seurat...")
data <- read_h5ad("example.hd5ad")
data <- CreateSeuratObject(counts = t(data$X), meta.data = data$obs)
print(str(data))

This worked yesterday for me, then today when I attempted to do it again it is not working. Doesn't even work on the same h5ad that worked yesterday! The error I get is:

Warning: Data is of class dgRMatrix. Coercing to dgCMatrix.

Error: Expected a python object, received a character

The Warning message was present yesterday but didn't affect it, but the error is new. Any ideas?

sidwekhande commented 3 months ago

Worked around this with @rockdeme's suggestion. Just use anndata, you'll have to get reticulate and other dependencies but it's worth it. Example:

library("Seurat")
library("anndata")
print("Convert from Scanpy to Seurat...")
data <- read_h5ad("example.hd5ad")
data <- CreateSeuratObject(counts = t(data$X), meta.data = data$obs)
print(str(data))

This worked yesterday for me, then today when I attempted to do it again it is not working. Doesn't even work on the same h5ad that worked yesterday! The error I get is:

Warning: Data is of class dgRMatrix. Coercing to dgCMatrix.

Error: Expected a python object, received a character

The Warning message was present yesterday but didn't affect it, but the error is new. Any ideas?

@dennishamrick I faced a similar issue - I think there's a change in the way data$obs is stored. I was able to create the SeuratObject by not passing the meta.data parameter in CreateSeuratObject.

dennishamrick commented 2 months ago

https://github.com/scverse/anndataR worked for me to convert h5ad -> Seurat object. It was very slow, however (it took a few hours for an 8 GB h5ad), but it worked eventually. The code I used was:

library(anndataR)

adata <- read_h5ad("YOURH5AD.h5ad", to = "InMemoryAnnData")
SeuratObject <- adata$to_Seurat()