gagneurlab / FRASER

FRASER - Find RAre Splicing Events in RNA-seq
MIT License

loadFraserDataset failing if RNASeq data counts generated and loaded on different workstations #16

Closed h-joshi closed 3 years ago

h-joshi commented 3 years ago

Steps to reproduce:

Expected: fds dataset is successfully loaded

Actual: The following error is thrown: Error: no slot of name "seed" for this object of class "HDF5ArraySeed"

Code issue: https://github.com/c-mertes/FRASER/blob/faa0a207156e9d575dcbb61d774d16e6d24f9371/R/saveHDF5Objects.R#L114

From what I understand, the logic just preceding the line above checks if the HDF5 datasets are available (as per the fds object); otherwise they are loaded from the specified 'dir' parameter. The bug surfaces in the branch for R major versions above 3.

The slot( .... "seed") call doesn't need another @seed accessor after it. That's what is causing this scenario to fail.

My crude fix:

    else if (afile != path(assay(fds, aname, withDimnames = FALSE))) {
      if (R.Version()$major == "3") {
        # R 3.x branch: update the path through the path() setter
        path(assay(fds, aname, withDimnames = FALSE)) <- afile
      }
      else {
        if ("seed" %in% slotNames(assay(fds, aname, withDimnames = FALSE))) {
          # Case 1: the assay's seed carries the file path directly
          if ("filepath" %in% slotNames(slot(assay(fds, aname, withDimnames = FALSE), "seed"))) {
            slot(assay(fds, aname, withDimnames = FALSE), "seed")@filepath <- afile
          }

          # Case 2: the file path sits one level deeper (the seed of the seed)
          if ("seed" %in% slotNames(slot(assay(fds, aname, withDimnames = FALSE), "seed"))) {
            if ("filepath" %in% slotNames(slot(slot(assay(fds, aname, withDimnames = FALSE), "seed"), "seed"))) {
              slot(assay(fds, aname, withDimnames = FALSE), "seed")@seed@filepath <- afile
            }
          }
        }
      }
    }
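
The nested checks above could also be collapsed into one recursive helper. The following is only a sketch under stated assumptions: ToySeed and ToyWrapper are hypothetical stand-in classes (not the real HDF5Array or DelayedArray classes), and setSeedFilepath is a made-up helper name, so the snippet runs without any Bioconductor packages installed.

```r
library(methods)

# Hypothetical stand-ins: a leaf object carrying the file path, and a
# wrapper that holds another object in its "seed" slot.
setClass("ToySeed", representation(filepath = "character"))
setClass("ToyWrapper", representation(seed = "ANY"))

# Hypothetical helper: follow nested "seed" slots until an object with a
# "filepath" slot is found, update it, and return the modified copy.
setSeedFilepath <- function(x, newPath) {
  slots <- if (isS4(x)) slotNames(x) else character()
  if ("filepath" %in% slots) {
    x@filepath <- newPath
  } else if ("seed" %in% slots) {
    x@seed <- setSeedFilepath(x@seed, newPath)
  } else {
    stop("no 'filepath' slot found while following 'seed' slots")
  }
  x
}

# One level of nesting and two levels of nesting are handled alike.
one <- new("ToyWrapper", seed = new("ToySeed", filepath = "/old/a.h5"))
two <- new("ToyWrapper", seed = one)
one <- setSeedFilepath(one, "/new/a.h5")
two <- setSeedFilepath(two, "/new/a.h5")
```

Because S4 objects are copied on modification, the helper returns the updated object. With a real fds object the result would still have to be written back into the assay, which the replacement calls in the fix above do implicitly.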

Sessioninfo:

sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ggpubr_0.4.0                ggplot2_3.3.2               FRASER_1.0.2                SummarizedExperiment_1.18.2
 [5] DelayedArray_0.14.1         matrixStats_0.57.0          Biobase_2.48.0              Rsamtools_2.4.0            
 [9] Biostrings_2.56.0           XVector_0.28.0              GenomicRanges_1.40.0        GenomeInfoDb_1.24.2        
[13] IRanges_2.22.2              S4Vectors_0.26.1            BiocGenerics_0.34.0         data.table_1.13.0          
[17] BiocParallel_1.22.0        

loaded via a namespace (and not attached):
  [1] VGAM_1.1-3                colorspace_1.4-1          ggsignif_0.6.0            ellipsis_0.3.1            rio_0.5.16               
  [6] rstudioapi_0.11           ggrepel_0.8.2             bit64_4.0.5               AnnotationDbi_1.50.3      fansi_0.4.1              
 [11] splines_4.0.2             R.methodsS3_1.8.1         PRROC_1.3.1               knitr_1.30                jsonlite_1.7.1           
 [16] broom_0.7.1               dbplyr_1.4.4              R.oo_1.24.0               pheatmap_1.0.12           HDF5Array_1.16.1         
 [21] compiler_4.0.2            httr_1.4.2                backports_1.1.10          assertthat_0.2.1          Matrix_1.2-18            
 [26] lazyeval_0.2.2            cli_2.0.2                 htmltools_0.5.0           prettyunits_1.1.1         tools_4.0.2              
 [31] gtable_0.3.0              glue_1.4.2                GenomeInfoDbData_1.2.3    dplyr_1.0.2               rappdirs_0.3.1           
 [36] Rcpp_1.0.5                carData_3.0-4             cellranger_1.1.0          vctrs_0.3.4               rtracklayer_1.48.0       
 [41] DelayedMatrixStats_1.10.1 xfun_0.18                 stringr_1.4.0             openxlsx_4.2.2            lifecycle_0.2.0          
 [46] rstatix_0.6.0             XML_3.99-0.5              zlibbioc_1.34.0           scales_1.1.1              BSgenome_1.56.0          
 [51] pcaMethods_1.80.0         hms_0.5.3                 rhdf5_2.32.4              RColorBrewer_1.1-2        BBmisc_1.11              
 [56] yaml_2.2.1                curl_4.3                  memoise_1.1.0             biomaRt_2.44.1            stringi_1.5.3            
 [61] RSQLite_2.2.1             checkmate_2.0.0           GenomicFeatures_1.40.1    zip_2.1.1                 Rsubread_2.2.6           
 [66] rlang_0.4.8               pkgconfig_2.0.3           bitops_1.0-6              evaluate_0.14             lattice_0.20-41          
 [71] purrr_0.3.4               Rhdf5lib_1.10.1           GenomicAlignments_1.24.0  htmlwidgets_1.5.2         cowplot_1.1.0            
 [76] bit_4.0.4                 tidyselect_1.1.0          magrittr_1.5              R6_2.4.1                  generics_0.0.2           
 [81] DBI_1.1.0                 pillar_1.4.6              haven_2.3.1               foreign_0.8-80            withr_2.3.0              
 [86] abind_1.4-5               RCurl_1.98-1.2            tibble_3.0.3              crayon_1.3.4              car_3.0-10               
 [91] utf8_1.1.4                BiocFileCache_1.12.1      plotly_4.9.2.1            rmarkdown_2.4             progress_1.2.2           
 [96] grid_4.0.2                readxl_1.3.1              blob_1.2.1                forcats_0.5.0             digest_0.6.25            
[101] tidyr_1.1.2               extraDistr_1.9.1          R.utils_2.10.1            openssl_1.4.3             munsell_0.5.0            
[106] viridisLite_0.3.0         askpass_1.1

c-mertes commented 3 years ago

Dear @h-joshi, thanks for reporting this. I think this is related to #11. I'm not sure whether this is a code problem or rather a version problem, as we test exactly this case here: https://github.com/c-mertes/FRASER/blob/a450eb65/tests/testthat/test_write_read_objects.R#L25

Since you switch computers, do you have the same package versions on both machines? It could be a difference in HDF5Array, since its internal representation changed at some point. Can you please check whether the HDF5-related packages have the same version on both?
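
One way to make that comparison is to print the relevant versions on each machine and diff the output. This is a minimal sketch: the package list is taken from the sessionInfo() output above and may be incomplete, and packages that are not installed are reported as NA.

```r
# Packages named in the sessionInfo() output above; extend as needed.
pkgs <- c("HDF5Array", "DelayedArray", "rhdf5", "Rhdf5lib")

# Report the installed version of each package, or NA if it is missing.
versions <- vapply(pkgs, function(p) {
  if (requireNamespace(p, quietly = TRUE)) {
    as.character(utils::packageVersion(p))
  } else {
    NA_character_
  }
}, character(1))

print(versions)
```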

If it turns out to be a package version issue, I would rather check the package version than check for an extra slot within the package's objects.
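
Such a version check might look like the sketch below. hasMinVersion is a hypothetical helper name, and the thread does not say in which HDF5Array version the representation changed, so the example is demonstrated on the stats package (which ships with R) with an arbitrary threshold rather than a guessed HDF5Array version.

```r
# Hypothetical helper: TRUE if `pkg` is installed at `minVersion` or newer.
hasMinVersion <- function(pkg, minVersion) {
  requireNamespace(pkg, quietly = TRUE) &&
    utils::packageVersion(pkg) >= minVersion
}

# Demonstration only: "stats" and "3.0.0" are placeholders for the real
# package and for the (unknown) version where the layout changed.
if (hasMinVersion("stats", "3.0.0")) {
  message("use the code path for the newer layout")
} else {
  message("use the code path for the older layout")
}
```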

c-mertes commented 3 years ago

In the last build this popped up here too. I still have not gotten my head around it, but it looks like the HDF5 arrays are sometimes saved as a DelayedMatrix object and sometimes as an HDF5Array. Since those are different classes, their slots differ, hence the different behavior.

@h-joshi can you please check if the new code fixes your problem? Be aware that this branch has changed a bit since the last version, but the API should be almost the same.

h-joshi commented 3 years ago

Hi - Is a pre-built binary available for this patch? I'm running this in an RStudio Server environment, so unfortunately I can't install the build dependencies, in particular gfortran.

c-mertes commented 3 years ago

@h-joshi I pushed it into the new Bioconductor release, so you should find a Windows binary there: http://bioconductor.org/packages/release/bioc/html/FRASER.html

If you are running on a Linux system, I do not have a prebuilt version. Maybe bioconda could be of help here.