theislab / zellkonverter

Conversion between scRNA-seq objects
https://theislab.github.io/zellkonverter/

Error reading `raw` with `use_hdf5 = TRUE` #123

Closed GabrielHoffman closed 4 weeks ago

GabrielHoffman commented 2 months ago

When trying to read the `raw` entry in an h5ad file, this call to `.extract_or_skip_assay()` does not specify a value for `filepath`, so it throws the error `argument "filepath" is missing, with no default`:

https://github.com/theislab/zellkonverter/blob/328e7ea895196dbac6229ad3076e9b6ed00b7afe/R/AnnData2SCE.R#L302

It seems like adding

filepath = as.character(py_to_r(adata$file$filename))

would solve the issue.
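To illustrate the failure mode, here is a minimal base-R sketch. It assumes the helper only touches `filepath` on the HDF5-backed branch, which would explain why the error shows up only with `use_hdf5 = TRUE`. The function `extract_assay_sketch()` and the file name `example.h5ad` are hypothetical stand-ins, not zellkonverter's actual code.

```r
# Hypothetical stand-in for an internal helper; NOT zellkonverter's real code.
# `filepath` has no default, and R evaluates arguments lazily, so a missing
# `filepath` only raises an error when the argument is actually used.
extract_assay_sketch <- function(mat, hdf5_backed = FALSE, filepath) {
    if (hdf5_backed) {
        # `filepath` is evaluated only on this branch
        return(paste0("HDF5-backed matrix from ", filepath))
    }
    mat
}

m <- matrix(1:4, nrow = 2)

# In-memory path: succeeds even though `filepath` was never supplied.
in_memory <- extract_assay_sketch(m)

# HDF5-backed path without `filepath`: capture the error message.
err <- tryCatch(
    extract_assay_sketch(m, hdf5_backed = TRUE),
    error = function(e) conditionMessage(e)
)
print(err)

# Supplying `filepath` explicitly (the kind of fix proposed above) avoids it.
fixed <- extract_assay_sketch(m, hdf5_backed = TRUE, filepath = "example.h5ad")
print(fixed)
```

If the real helper behaves like this sketch, passing `filepath` at the call site for `raw` should be enough, with no change needed to the helper itself.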

sessionInfo()
R version 4.3.3 (2024-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /hpc/packages/minerva-centos7/oneAPI/p_2024.1.0.560/toolkits/mkl/2024.1/lib/libmkl_gf_lp64.so.2;  LAPACK version 3.11.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: America/New_York
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] zellkonverter_1.13.3

loaded via a namespace (and not attached):
 [1] crayon_1.5.2                cli_3.6.2
 [3] rlang_1.1.4                 png_0.1-8
 [5] jsonlite_1.8.8              DelayedArray_0.28.0
 [7] dir.expiry_1.10.0           SummarizedExperiment_1.32.0
 [9] S4Vectors_0.40.2            RCurl_1.98-1.14
[11] stats4_4.3.3                MatrixGenerics_1.14.0
[13] Biobase_2.62.0              grid_4.3.3
[15] filelock_1.0.3              abind_1.4-5
[17] bitops_1.0-7                SingleCellExperiment_1.24.0
[19] IRanges_2.36.0              basilisk_1.14.3
[21] GenomeInfoDb_1.38.8         compiler_4.3.3
[23] Rcpp_1.0.12                 XVector_0.42.0
[25] lattice_0.22-5              reticulate_1.37.0
[27] SparseArray_1.2.4           parallel_4.3.3
[29] GenomeInfoDbData_1.2.11     GenomicRanges_1.54.1
[31] Matrix_1.6-5                tools_4.3.3
[33] matrixStats_1.3.0           zlibbioc_1.48.2
[35] S4Arrays_1.2.1              basilisk.utils_1.14.1
[37] BiocGenerics_0.48.1

lazappi commented 2 months ago

I think you are right, and I probably made a mistake the last time I updated that function, but for completeness, can you please post the command you ran along with the output/error?

If you are interested in submitting a PR for this that would also be great.

GabrielHoffman commented 2 months ago

Here is a reproducible example of the error. I'm working on a fix now.

library(zellkonverter)

# wget https://datasets.cellxgene.cziscience.com/4e6932db-5a78-40e4-b961-f87f66ba139a.h5ad

file = "4e6932db-5a78-40e4-b961-f87f66ba139a.h5ad"
sce = readH5AD(file, use_hdf5=TRUE, raw=TRUE)

Error in .extract_or_skip_assay(skip_assays = skip_assays, hdf5_backed = hdf5_backed,  : 
  argument "filepath" is missing, with no default
sessionInfo()
R version 4.4.0 (2024-04-24)
Platform: aarch64-apple-darwin23.5.0
Running under: macOS Sonoma 14.5

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Users/gabrielhoffman/prog/R-4.4.0/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] zellkonverter_1.15.1

loaded via a namespace (and not attached):
 [1] Matrix_1.7-0                jsonlite_1.8.8             
 [3] compiler_4.4.0              crayon_1.5.3               
 [5] filelock_1.0.3              Rcpp_1.0.12                
 [7] SummarizedExperiment_1.34.0 Biobase_2.64.0             
 [9] GenomicRanges_1.56.1        rhdf5filters_1.16.0        
[11] parallel_4.4.0              IRanges_2.38.1             
[13] png_0.1-8                   reticulate_1.38.0          
[15] lattice_0.22-6              R6_2.5.1                   
[17] XVector_0.44.0              S4Arrays_1.4.1             
[19] GenomeInfoDb_1.40.1         BiocGenerics_0.50.0        
[21] DelayedArray_0.30.1         MatrixGenerics_1.16.0      
[23] GenomeInfoDbData_1.2.12     rlang_1.1.4                
[25] HDF5Array_1.32.0            dir.expiry_1.12.0          
[27] SparseArray_1.4.8           cli_3.6.3                  
[29] withr_3.0.0                 Rhdf5lib_1.26.0            
[31] zlibbioc_1.50.0             grid_4.4.0                 
[33] basilisk_1.16.0             rhdf5_2.48.0               
[35] S4Vectors_0.42.1            SingleCellExperiment_1.26.0
[37] abind_1.4-5                 stats4_4.4.0               
[39] httr_1.4.7                  basilisk.utils_1.16.0      
[41] matrixStats_1.3.0           tools_4.4.0                
[43] UCSC.utils_1.0.0

GabrielHoffman commented 2 months ago

Could you confirm the pull request? https://github.com/theislab/zellkonverter/pull/124

lazappi commented 2 months ago

Thanks. Did you mean to close the PR?

GabrielHoffman commented 2 months ago

I didn't mean to. I have re-opened it now. Let me know if you run into any other issues.

Best, Gabriel

GabrielHoffman commented 1 month ago

Were you able to resolve this?

Best, Gabriel

lazappi commented 1 month ago

Sorry, I haven't had a chance to look at it yet with other things going on. I will try to get to it.

lazappi commented 1 month ago

I was worried this would be specific to more recent anndata versions, but I have checked with the different environments and the behavior is consistent, so I will review the PR now.