theislab / zellkonverter

Conversion between scRNA-seq objects
https://theislab.github.io/zellkonverter/

h5ad duplicate rownames #51

Closed kbrulois closed 3 years ago

kbrulois commented 3 years ago

Is it possible to convert an h5ad file with duplicate row names using this package?

I am trying to convert a single-cell dataset available here:

https://zenodo.org/record/3711134#.YKQ8RS2cZTY

using the code below:


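  local.dir <- "~/Desktop/Teichmann"
  h5ad_file <- "HTA08.v01.A05.Science_human_fig1.h5ad"
  zip_file <- "thymus_annotated_matrix_files.zip"
  dir.create(local.dir, showWarnings = FALSE)
  setwd(local.dir)
  # -L follows any redirects from the Zenodo record URL
  system(paste0("curl -L -O https://zenodo.org/record/3711134/files/", zip_file))
  # The download is a zip archive, so extract it with unzip() rather than untar()
  unzip(zip_file)
  Teichmann_data <- zellkonverter::readH5AD(h5ad_file)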

  Variable names are not unique. To make them unique, call `.var_names_make_unique`.
  Warning messages:
  1: In .extract_or_skip_assay(skip_assays = skip_assays, hdf5_backed = hdf5_backed, :
    'X' matrix does not support transposition and has been skipped
  2: In py_to_r.pandas.core.frame.DataFrame(x) :
    index contains duplicated values: row names not set

The code runs with warnings (but no error), and the resulting SingleCellExperiment (sce) has an assay with 0 entries.
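
For readers hitting the second warning, here is a minimal base R sketch of what "index contains duplicated values" means and one way duplicated names could be made unique on the R side. The gene names and the `counts` matrix are hypothetical stand-ins, not taken from the thymus dataset, and this is not necessarily how zellkonverter handles the problem internally.

```r
# Hypothetical gene symbols with one duplicate; not taken from the thymus dataset.
gene_names <- c("GeneA", "GeneB", "GeneA", "GeneC")

# Duplicated names cannot serve as row names, which is why the warning
# reports that row names were not set.
has_duplicates <- anyDuplicated(gene_names) > 0

# make.unique() appends ".1", ".2", ... to repeated entries.
unique_names <- make.unique(gene_names)

# A small matrix standing in for an assay, using the deduplicated row names.
counts <- matrix(0, nrow = length(unique_names), ncol = 2,
                 dimnames = list(unique_names, c("cell1", "cell2")))
```

After this, `rownames(counts)` contains no repeats, so it could be used as feature names without triggering the duplicate-index problem.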

lazappi commented 3 years ago

Hi @kbrulois

Thanks for the report! Can you please let us know which version of zellkonverter you are using? I think there are actually a few things going on here but I will see if we can look into this file and work it out.

kbrulois commented 3 years ago

Thank you!

I'm using zellkonverter_1.0.3

lazappi commented 3 years ago

Sorry for the lack of progress on this @kbrulois. I will try to get to it soon. If you are able to try with the latest version of zellkonverter (v1.3.1) that would be a big help.
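
A quick way to check whether an installed copy predates v1.3.1 might look like the sketch below. The helper `needs_update` is a hypothetical name introduced for illustration; only `packageVersion()`, `package_version()` and `as.package_version()` are base R functions.

```r
# Version mentioned in this thread as the latest one to try.
required <- package_version("1.3.1")

# Hypothetical helper: TRUE when the given version is older than the required one.
needs_update <- function(version, minimum = required) {
  as.package_version(version) < minimum
}

# packageVersion() errors if the package is not installed, so fall back to NULL.
installed <- tryCatch(utils::packageVersion("zellkonverter"),
                      error = function(e) NULL)

if (is.null(installed)) {
  message("zellkonverter is not installed")
} else if (needs_update(installed)) {
  message("zellkonverter ", format(installed), " is older than ", format(required))
}
```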

kbrulois commented 3 years ago

Hi @lazappi,

I was able to load this file successfully using v1.3.1.

Thank you!