Closed johnscrn closed 2 years ago
Hi @johnscrn
This seems a bit weird, not sure exactly what was going on. I was able to read and convert the file fine (big thanks for providing the file by the way).
> url <- "https://storage.googleapis.com/gtex_analysis_v9/snrna_seq_data/GTEx_8_tissues_snRNAseq_atlas_071421.public_obs.h5ad"
> temp <- tempfile(fileext = ".h5ad")
> curl::curl_download(url, temp)
> zellkonverter::readH5AD(temp, verbose=TRUE, layers=FALSE, varm=FALSE, obsm=FALSE, varp=FALSE, obsp=FALSE, uns=FALSE)
ℹ Using the Python reader
✓ Read /.../.../rj/.../T/.../file497b6e4aab4c.h5ad [28.2s]
ℹ Skipping conversion of uns
✓ X matrix converted to assay [57.1s]
ℹ Skipping conversion of layers
ℹ Skipping conversion of varm
ℹ Skipping conversion of obsm
ℹ Skipping conversion of varp
ℹ Skipping conversion of obsp
✓ SingleCellExperiment constructed [3.5s]
ℹ Skipping conversion of raw
✓ Converting AnnData to SingleCellExperiment ... done
class: SingleCellExperiment
dim: 17695 209126
metadata(0):
assays(1): X
rownames(17695): FO538757.2 SAMD11 ... S100B PRMT2
rowData names(18): gene_ids Chromosome ... gene_include n_cells
colnames(209126): CST01_TAGGCATGTAAATACG-skeletalmuscle
CST01_CCTTACGTCCGTCAAA-skeletalmuscle ... TST03_CACAGGCGTACATCCA-skin
TST03_GACCAATTCCAGTATG-skin
colData names(47): n_genes fpr ... Tissue channel
reducedDimNames(0):
mainExpName: NULL
altExpNames(0):
It seems like maybe {reticulate} is trying to convert something that it thinks is 'integer' but is actually 'double'. Not really sure why though. I wonder if maybe it is a platform thing? Would you be able to either a) confirm the same error on another Windows machine or b) try the same file on Linux/macOS and see if that works?
Thank you for looking into this. I get the same error on my personal computer. I would have tried Linux but ran into R update issues (that I cannot deal with right now). I was able to convert the file after removing all the unnecessary annotation from the AnnData object. If I get some time I'll add things back and see if I can figure out which annotation was the issue.
Since you cannot recreate it, you can close this issue. If anyone else comes here after downloading the new GTEx tissue atlas https://www.gtexportal.org/home/datasets ... just delete everything except what you absolutely need from the AnnData object and resave it as h5ad.
Thanks again.
Thanks, I'm guessing it might be a Windows thing then. Just confirming that this is a public dataset? If so we can add it to our set of tests to a) confirm the issue and b) hopefully come up with a fix.
Yes it is public. The link I gave in the last comment will take you to their project page. Here is their data use statement: "All datasets from phs00424.v5.p1 forward will follow the NIH GDS policy. This means that once released through dbGaP, there are no restrictions on use or publication. This document and an accompanying table of dataset releases can be found at http://www.gtexportal.org/home/documentationPage ."
This is now included in the test suite from the latest release and there don't seem to have been any issues. I'm going to close this but if you are still having issues with the latest {zellkonverter} version please reopen.
I have an .h5ad file that I can load into Python, where it seems to work with no issues. I want to read it into R instead. Here is the code to reproduce my issue:
Output:
.... I'd like to keep the obs and var but tried removing them in case one of them was the issue. Can anyone point me in the right direction?
Thank you!
Session info
```r
sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] zellkonverter_1.4.0

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.7                  XVector_0.34.0
 [3] GenomicRanges_1.46.1        BiocGenerics_0.40.0
 [5] zlibbioc_1.40.0             IRanges_2.28.0
 [7] here_1.0.1                  lattice_0.20-45
 [9] GenomeInfoDb_1.30.0         tools_4.1.2
[11] parallel_4.1.2              SummarizedExperiment_1.24.0
[13] grid_4.1.2                  Biobase_2.54.0
[15] png_0.1-7                   cli_3.1.0
[17] basilisk_1.6.0              matrixStats_0.61.0
[19] rprojroot_2.0.2             Matrix_1.3-4
[21] dir.expiry_1.2.0            GenomeInfoDbData_1.2.7
[23] BiocManager_1.30.16         S4Vectors_0.32.3
[25] bitops_1.0-7                basilisk.utils_1.6.0
[27] RCurl_1.98-1.5              SingleCellExperiment_1.16.0
[29] glue_1.5.1                  DelayedArray_0.20.0
[31] compiler_4.1.2              filelock_1.0.2
[33] MatrixGenerics_1.6.0        stats4_4.1.2
[35] jsonlite_1.7.2              reticulate_1.22
```