Problems reading in the output npz file

aakrosh commented 4 years ago

When I read the output npz file (sumo_results.npz) using reticulate, I am unable to print the contents of the array named "clusters". I am able to print the contents of all the other arrays, including "quality", "consensus", "cophenet", "unfiltered_consensus" and "summary". To reproduce this issue, you should be able to run the following:

library(survival)
library(survminer)
library(reticulate)
np <- import("numpy")
data <- np$load("sumo_results.npz")
data$files
data$f["clusters"]

The last line should fail with the error:

Error: Python object has no '__getitem__' method

I am using Python 3.7.0, and here is the output of sessionInfo() in R:

R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.2.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] reticulate_1.13   survminer_0.4.6   ggpubr_0.2.3      magrittr_1.5     
[5] ggplot2_3.2.1     survival_2.44-1.1

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.2        pillar_1.4.2      compiler_3.6.1    tools_3.6.1      
 [5] zeallot_0.1.0     jsonlite_1.6      tibble_2.1.3      lifecycle_0.1.0  
 [9] gtable_0.3.0      nlme_3.1-141      lattice_0.20-38   pkgconfig_2.0.3  
[13] rlang_0.4.0       Matrix_1.2-17     xfun_0.10         gridExtra_2.3    
[17] withr_2.1.2       dplyr_0.8.3       knitr_1.25        generics_0.0.2   
[21] vctrs_0.2.0       survMisc_0.5.5    grid_3.6.1        tidyselect_0.2.5 
[25] data.table_1.12.4 glue_1.3.1        R6_2.4.0          KMsurv_0.1-5     
[29] km.ci_0.5-2       purrr_0.3.2       tidyr_1.0.0       scales_1.0.0     
[33] backports_1.1.5   splines_3.6.1     assertthat_0.2.1  xtable_1.8-4     
[37] colorspace_1.4-1  ggsignif_0.6.0    lazyeval_0.2.2    munsell_0.5.0    
[41] broom_0.5.2       crayon_1.3.4      zoo_1.8-6

sienkie commented 4 years ago

I have encountered this issue before. reticulate has some problems with files pickled with Python 3. To work around it, select python3 before attaching the reticulate package, and allow pickled data when loading the npz file by passing allow_pickle = TRUE:

reticulate::use_python(Sys.which('python3'), required = TRUE)
library(reticulate)
np <- import("numpy")
data <- np$load("sumo_results.npz", allow_pickle = TRUE)
data$files
data$f["clusters"]
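
For anyone who wants to check the behaviour outside R, here is a minimal Python sketch of what is probably going on. It assumes that "clusters" is stored as an object-dtype array, which this thread does not confirm; the file name example_results.npz, the array contents, and the helper try_read are made up for illustration and are not part of sumo. Recent NumPy versions refuse to unpickle object arrays unless allow_pickle is set, while plain numeric arrays load either way.

```python
import os
import tempfile

import numpy as np


def try_read(path, key, allow_pickle):
    """Hypothetical helper: return (True, array) or (False, error message)."""
    try:
        with np.load(path, allow_pickle=allow_pickle) as data:
            return True, data[key]
    except ValueError as err:
        return False, str(err)


# Build a small stand-in archive (NOT the real sumo output).
# Assumption: "clusters" mixes labels and integers, so it has object dtype.
path = os.path.join(tempfile.mkdtemp(), "example_results.npz")
clusters = np.array([["sample1", 0], ["sample2", 1]], dtype=object)
quality = np.array([0.5, 0.75])
np.savez(path, clusters=clusters, quality=quality)

# Numeric arrays can be read without pickle support.
quality_ok, _ = try_read(path, "quality", allow_pickle=False)

# Object arrays need pickle, so this read is rejected with a ValueError.
clusters_ok_default, message = try_read(path, "clusters", allow_pickle=False)

# Opting in to pickle makes the same array readable.
clusters_ok_pickle, loaded = try_read(path, "clusters", allow_pickle=True)

print("quality without pickle:", quality_ok)
print("clusters without pickle:", clusters_ok_default, "|", message)
print("clusters with pickle:", clusters_ok_pickle)
```

If the real file behaves the same way, that would explain why only "clusters" fails while the other arrays print normally. Note that allow_pickle should only be enabled for files you trust, since unpickling can execute arbitrary code.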

aakrosh commented 4 years ago

Thanks. That works perfectly.

ratan-lab / sumo

Problems reading in the output npz file #6