Bioconductor / RaggedExperiment

Matrix-like representations of mutation and CN data
https://bioconductor.org/packages/RaggedExperiment

no method or default for coercing “SimpleGRangesList” to “CompressedGRangesList” #24

Closed biobenkj closed 4 years ago

biobenkj commented 4 years ago

Thanks so much for RaggedExperiment! It is truly amazing for multi-omics and genomics in general.

I'm encountering a new error that occurs only in the Bioc-devel branch of Bioconductor packages when converting a GRangesList object to a RaggedExperiment.

A reproducible example can be shown using the existing Bioconductor release (3.9) and upcoming release (3.10 - currently devel).

Passing in 3.9

BiocManager::install("biobenkj/compartmap")
#should be version 1.65.7

library(compartmap)
library(minfi)

data(meth_array_450k_chr14)

array_compartments <- getArrayABsignal(array.data.chr14, parallel=FALSE, chr="chr14", bootstrap=FALSE, genome="hg19", array.type="hm450")

Filtering to open sea CpG loci...
Converting to squeezed M-values.
Imputing missing values.
Dropping samples with >80% NAs.
Imputing missing data with kNN.
Cluster size 3384 broken into 2771 613 
Cluster size 2771 broken into 1052 1719 
Done cluster 1052 
Cluster size 1719 broken into 476 1243 
Done cluster 476 
Done cluster 1243 
Done cluster 1719 
Done cluster 2771 
Done cluster 613 
Working on naive.1
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on rTreg.2
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on act_naive.3
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on naive.4
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on act_naive.5
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on act_rTreg.6
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on naive.7
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on rTreg.8
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on act_naive.9
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on act_rTreg.10
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on birth.11
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
> array_compartments
class: RaggedExperiment 
dim: 968 11 
assays(2): pc compartments
rownames: NULL
colnames(11): naive.1 rTreg.2 ... act_rTreg.10 birth.11
colData names(10): Sample_Name Sample_Well ... Basename filenames

The session info:

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.5

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] minfi_1.30.0                             bumphunter_1.26.0                        locfit_1.5-9.1                          
 [4] iterators_1.0.12                         foreach_1.4.7                            compartmap_1.65.6                       
 [7] bsseq_1.20.0                             BiocSingular_1.0.0                       BSgenome.Mmusculus.UCSC.mm9_1.4.0       
[10] Mus.musculus_1.3.1                       TxDb.Mmusculus.UCSC.mm10.knownGene_3.4.7 org.Mm.eg.db_3.8.2                      
[13] BSgenome.Hsapiens.UCSC.hg38_1.4.1        BSgenome_1.52.0                          rtracklayer_1.44.3                      
[16] Biostrings_2.52.0                        XVector_0.24.0                           Homo.sapiens_1.3.1                      
[19] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2  org.Hs.eg.db_3.8.2                       GO.db_3.8.2                             
[22] OrganismDbi_1.26.0                       GenomicFeatures_1.36.4                   AnnotationDbi_1.46.1                    
[25] RaggedExperiment_1.8.0                   SummarizedExperiment_1.14.1              DelayedArray_0.10.0                     
[28] BiocParallel_1.18.1                      matrixStats_0.54.0                       Biobase_2.44.0                          
[31] GenomicRanges_1.36.0                     GenomeInfoDb_1.20.0                      IRanges_2.18.2                          
[34] S4Vectors_0.22.0                         BiocGenerics_0.30.0                     

loaded via a namespace (and not attached):
  [1] backports_1.1.4          plyr_1.8.4               igraph_1.2.4.1           lazyeval_0.2.2           splines_3.6.1           
  [6] ggplot2_3.2.1            digest_0.6.20            viridis_0.5.1            magrittr_1.5             memoise_1.1.0           
 [11] limma_3.40.6             readr_1.3.1              annotate_1.62.0          R.utils_2.9.0            askpass_1.1             
 [16] siggenes_1.58.0          prettyunits_1.0.2        colorspace_1.4-1         blob_1.2.0               dplyr_0.8.3             
 [21] crayon_1.3.4             RCurl_1.95-4.12          graph_1.62.0             genefilter_1.66.0        GEOquery_2.52.0         
 [26] zeallot_0.1.0            impute_1.58.0            survival_2.44-1.1        glue_1.3.1               registry_0.5-1          
 [31] gtable_0.3.0             zlibbioc_1.30.0          Rhdf5lib_1.6.0           HDF5Array_1.12.2         scales_1.0.0            
 [36] DBI_1.0.0                rngtools_1.4             bibtex_0.4.2             Rcpp_1.0.2               viridisLite_0.3.0       
 [41] xtable_1.8-4             progress_1.2.2           bit_1.1-14               rsvd_1.0.2               mclust_5.4.5            
 [46] preprocessCore_1.46.0    httr_1.4.1               RColorBrewer_1.1-2       pkgconfig_2.0.2          reshape_0.8.8           
 [51] XML_3.98-1.20            R.methodsS3_1.7.1        tidyselect_0.2.5         rlang_0.4.0              munsell_0.5.0           
 [56] tools_3.6.1              RSQLite_2.1.2            stringr_1.4.0            bit64_0.9-7              beanplot_1.2            
 [61] scrime_1.3.5             purrr_0.3.2              RANN_2.6.1               nlme_3.1-141             pbapply_1.4-2           
 [66] RBGL_1.60.0              doRNG_1.7.1              nor1mix_1.3-0            R.oo_1.22.0              xml2_1.2.2              
 [71] biomaRt_2.40.4           compiler_3.6.1           rstudioapi_0.10          tibble_2.1.3             stringi_1.4.3           
 [76] lattice_0.20-38          Matrix_1.2-17            permute_0.9-5            multtest_2.40.0          vctrs_0.2.0             
 [81] pillar_1.4.2             BiocManager_1.30.4       data.table_1.12.2        bitops_1.0-6             irlba_2.3.3             
 [86] R6_2.4.0                 gridExtra_2.3            codetools_0.2-16         MASS_7.3-51.4            gtools_3.8.1            
 [91] assertthat_0.2.1         rhdf5_2.28.0             openssl_1.4.1            pkgmaker_0.27            withr_2.1.2             
 [96] GenomicAlignments_1.20.1 Rsamtools_2.0.0          GenomeInfoDbData_1.2.1   hms_0.5.1                quadprog_1.5-7          
[101] grid_3.6.1               tidyr_0.8.3              base64_2.0               DelayedMatrixStats_1.6.0 illuminaio_0.26.0

The failing version for 3.10

Filtering to open sea CpG loci...
Converting to squeezed M-values.
Imputing missing values.
Dropping samples with >80% NAs.
Imputing missing data with kNN.
Cluster size 3384 broken into 2771 613
Cluster size 2771 broken into 1052 1719
Done cluster 1052
Cluster size 1719 broken into 476 1243
Done cluster 476
Done cluster 1243
Done cluster 1719
Done cluster 2771
Done cluster 613
Working on naive.1
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on rTreg.2
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on act_naive.3
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on naive.4
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on act_naive.5
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on act_rTreg.6
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on naive.7
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on rTreg.8
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on act_naive.9
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on act_rTreg.10
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Working on birth.11
Computing compartments for chr14
Calculating correlations...
Done...
Calculating eigenvectors.
Smoothing eigenvector.
Done smoothing.
Error in as(objects[[1L]], "CompressedGRangesList", strict = FALSE) :
  no method or default for coercing “SimpleGRangesList” to “CompressedGRangesList”

The session info:

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] minfi_1.31.1
 [2] bumphunter_1.27.0
 [3] locfit_1.5-9.1
 [4] iterators_1.0.12
 [5] foreach_1.4.7
 [6] compartmap_1.65.7
 [7] bsseq_1.21.1
 [8] BiocSingular_1.1.7
 [9] BSgenome.Mmusculus.UCSC.mm9_1.4.0
[10] Mus.musculus_1.3.1
[11] TxDb.Mmusculus.UCSC.mm10.knownGene_3.4.7
[12] org.Mm.eg.db_3.8.2
[13] BSgenome.Hsapiens.UCSC.hg38_1.4.1
[14] BSgenome_1.53.2
[15] rtracklayer_1.45.6
[16] Biostrings_2.53.2
[17] XVector_0.25.0
[18] Homo.sapiens_1.3.1
[19] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[20] org.Hs.eg.db_3.8.2
[21] GO.db_3.8.2
[22] OrganismDbi_1.27.1
[23] GenomicFeatures_1.37.4
[24] AnnotationDbi_1.47.1
[25] RaggedExperiment_1.9.1
[26] SummarizedExperiment_1.15.9
[27] DelayedArray_0.11.7
[28] BiocParallel_1.19.3
[29] matrixStats_0.55.0
[30] Biobase_2.45.1
[31] GenomicRanges_1.37.16
[32] GenomeInfoDb_1.21.2
[33] IRanges_2.19.16
[34] S4Vectors_0.23.25
[35] BiocGenerics_0.31.6

loaded via a namespace (and not attached):
  [1] backports_1.1.5          BiocFileCache_1.9.1      plyr_1.8.4
  [4] igraph_1.2.4.1           lazyeval_0.2.2           splines_3.6.1
  [7] ggplot2_3.2.1            digest_0.6.21            viridis_0.5.1
 [10] magrittr_1.5             memoise_1.1.0            limma_3.41.17
 [13] readr_1.3.1              annotate_1.63.0          R.utils_2.9.0
 [16] askpass_1.1              siggenes_1.59.0          prettyunits_1.0.2
 [19] colorspace_1.4-1         blob_1.2.0               rappdirs_0.3.1
 [22] dplyr_0.8.3              crayon_1.3.4             RCurl_1.95-4.12
 [25] graph_1.63.0             genefilter_1.67.1        GEOquery_2.53.0
 [28] zeallot_0.1.0            impute_1.59.0            survival_2.44-1.1
 [31] glue_1.3.1               registry_0.5-1           gtable_0.3.0
 [34] zlibbioc_1.31.0          Rhdf5lib_1.7.5           HDF5Array_1.13.9
 [37] scales_1.0.0             DBI_1.0.0                rngtools_1.4
 [40] bibtex_0.4.2             Rcpp_1.0.2               viridisLite_0.3.0
 [43] xtable_1.8-4             progress_1.2.2           bit_1.1-14
 [46] rsvd_1.0.2               mclust_5.4.5             preprocessCore_1.47.1
 [49] httr_1.4.1               RColorBrewer_1.1-2       pkgconfig_2.0.3
 [52] reshape_0.8.8            XML_3.98-1.20            R.methodsS3_1.7.1
 [55] dbplyr_1.4.2             tidyselect_0.2.5         rlang_0.4.0
 [58] munsell_0.5.0            tools_3.6.1              RSQLite_2.1.2
 [61] stringr_1.4.0            bit64_0.9-7              beanplot_1.2
 [64] scrime_1.3.5             purrr_0.3.2              RANN_2.6.1
 [67] nlme_3.1-141             pbapply_1.4-2            RBGL_1.61.0
 [70] doRNG_1.7.1              nor1mix_1.3-0            R.oo_1.22.0
 [73] xml2_1.2.2               biomaRt_2.41.9           compiler_3.6.1
 [76] curl_4.2                 tibble_2.1.3             stringi_1.4.3
 [79] lattice_0.20-38          Matrix_1.2-17            permute_0.9-5
 [82] multtest_2.41.0          vctrs_0.2.0              lifecycle_0.1.0
 [85] pillar_1.4.2             BiocManager_1.30.4       data.table_1.12.4
 [88] bitops_1.0-6             irlba_2.3.3              R6_2.4.0
 [91] gridExtra_2.3            codetools_0.2-16         MASS_7.3-51.4
 [94] gtools_3.8.1             assertthat_0.2.1         rhdf5_2.29.3
 [97] openssl_1.4.1            pkgmaker_0.27            withr_2.1.2
[100] GenomicAlignments_1.21.7 Rsamtools_2.1.6          GenomeInfoDbData_1.2.1
[103] hms_0.5.1                quadprog_1.5-7           grid_3.6.1
[106] tidyr_1.0.0              base64_2.0               DelayedMatrixStats_1.7.2
[109] illuminaio_0.27.1

The code in question takes a list returned by lapply and converts it to a GRangesList object using:

as(my_list, "GRangesList")

It then converts that GRangesList to a RaggedExperiment, which is the step that fails in 3.10 (and passes in 3.9).

Any idea what might have changed, or what the issue is?

Thanks so much!

LiNk-NY commented 4 years ago

Hi Ben, @biobenkj

I suspect that the issue may not stem from RaggedExperiment. Can you provide a minimal reproducible example? This would make it easier to see where the issue is coming from.

Best, Marcel

LiNk-NY commented 4 years ago

Any updates? @biobenkj

biobenkj commented 4 years ago

Yes, here is a minimal reproducible example using the RaggedExperiment example.

library(GenomicRanges)
library(RaggedExperiment)

## Create a couple of GRanges objects with row ranges names
sample1 <- GRanges(
    c(a = "chr1:1-10:-", b = "chr1:11-18:+"),
    score = 1:2)
sample2 <- GRanges(
    c(c = "chr2:1-10:-", d = "chr2:11-18:+"),
    score = 3:4)

## Include column data
colDat <- DataFrame(id = 1:2)

## Make a GRangesList
grl <- as(list(sample1, sample2), "GRangesList")

## Convert to RaggedExperiment
## Works
re1 <- RaggedExperiment(sample1=sample1, sample2=sample2, colData = colDat)
re1
#class: RaggedExperiment
#dim: 4 2
#assays(1): score
#rownames(4): a b c d
#colnames(2): sample1 sample2
#colData names(1): id

## Breaks
re2 <- RaggedExperiment(grl, colData = colDat)
Error in as(objects[[1L]], "CompressedGRangesList", strict = FALSE) :
  no method or default for coercing “SimpleGRangesList” to “CompressedGRangesList”

Session info

> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 18.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.7.1
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.7.1

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats4    stats     graphics  grDevices utils     datasets
[8] methods   base

other attached packages:
 [1] compartmap_1.65.7
 [2] bsseq_1.21.1
 [3] BiocSingular_1.1.7
 [4] BSgenome.Mmusculus.UCSC.mm9_1.4.0
 [5] Mus.musculus_1.3.1
 [6] TxDb.Mmusculus.UCSC.mm10.knownGene_3.4.7
 [7] org.Mm.eg.db_3.8.2
 [8] BSgenome.Hsapiens.UCSC.hg38_1.4.1
 [9] BSgenome_1.53.2
[10] rtracklayer_1.45.6
[11] Biostrings_2.53.2
[12] XVector_0.25.0
[13] Homo.sapiens_1.3.1
[14] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[15] org.Hs.eg.db_3.8.2
[16] GO.db_3.8.2
[17] OrganismDbi_1.27.1
[18] GenomicFeatures_1.37.4
[19] AnnotationDbi_1.47.1
[20] RaggedExperiment_1.9.1
[21] SummarizedExperiment_1.15.9
[22] DelayedArray_0.11.7
[23] BiocParallel_1.19.3
[24] matrixStats_0.55.0
[25] Biobase_2.45.1
[26] GenomicRanges_1.37.16
[27] GenomeInfoDb_1.21.2
[28] IRanges_2.19.16
[29] S4Vectors_0.23.25
[30] BiocGenerics_0.31.6

loaded via a namespace (and not attached):
 [1] bitops_1.0-6             bit64_0.9-7              RColorBrewer_1.1-2
 [4] progress_1.2.2           httr_1.4.1               tools_3.6.1
 [7] backports_1.1.5          R6_2.4.0                 irlba_2.3.3
[10] HDF5Array_1.13.9         lazyeval_0.2.2           DBI_1.0.0
[13] colorspace_1.4-1         permute_0.9-5            gridExtra_2.3
[16] tidyselect_0.2.5         prettyunits_1.0.2        bit_1.1-14
[19] curl_4.2                 compiler_3.6.1           graph_1.63.0
[22] scales_1.0.0             pbapply_1.4-2            RBGL_1.61.0
[25] askpass_1.1              rappdirs_0.3.1           stringr_1.4.0
[28] digest_0.6.21            Rsamtools_2.1.6          R.utils_2.9.0
[31] pkgconfig_2.0.3          dbplyr_1.4.2             limma_3.41.17
[34] rlang_0.4.0              impute_1.59.0            RSQLite_2.1.2
[37] DelayedMatrixStats_1.7.2 gtools_3.8.1             R.oo_1.22.0
[40] dplyr_0.8.3              RCurl_1.95-4.12          magrittr_1.5
[43] GenomeInfoDbData_1.2.1   Matrix_1.2-17            Rcpp_1.0.2
[46] munsell_0.5.0            Rhdf5lib_1.7.5           viridis_0.5.1
[49] R.methodsS3_1.7.1        stringi_1.4.3            zlibbioc_1.31.0
[52] rhdf5_2.29.3             BiocFileCache_1.9.1      grid_3.6.1
[55] blob_1.2.0               crayon_1.3.4             lattice_0.20-38
[58] hms_0.5.1                locfit_1.5-9.1           zeallot_0.1.0
[61] pillar_1.4.2             igraph_1.2.4.1           biomaRt_2.41.9
[64] XML_3.98-1.20            glue_1.3.1               data.table_1.12.4
[67] BiocManager_1.30.4       vctrs_0.2.0              gtable_0.3.0
[70] RANN_2.6.1               openssl_1.4.1            purrr_0.3.2
[73] assertthat_0.2.1         ggplot2_3.2.1            rsvd_1.0.2
[76] viridisLite_0.3.0        tibble_2.1.3             GenomicAlignments_1.21.7
[79] memoise_1.1.0

biobenkj commented 4 years ago

The above example works for both re1 and re2 using Bioc 3.9.

LiNk-NY commented 4 years ago

Thanks Ben! @biobenkj I've filed an issue in GenomicRanges. I will wait for Hervé's response. Please see the issue linked above for more details. Thanks.

biobenkj commented 4 years ago

Awesome! Thanks so much @LiNk-NY!

biobenkj commented 4 years ago

Seems like Hervé made the fix. I've checked with the patched version of GenomicRanges, and explicitly coercing with as(list(sample1, sample2), "CompressedGRangesList") works. I had no idea there was such a thing as a "CompressedGRangesList" object. Thanks again for all your help! Feel free to close.

hpages commented 4 years ago

@biobenkj Note that you should not be required to know about the existence of CompressedGRangesList, and doing as(list(sample1, sample2), "GRangesList") should just work. It did not, but with GenomicRanges 1.37.17 it now should. Please let me know if you still run into problems with this new version of GenomicRanges. Thanks!
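
For anyone wanting to check this on their own installation, a minimal sketch follows. It assumes GenomicRanges 1.37.17 or later is installed and reuses the toy ranges from the minimal example earlier in this thread. Naming the list elements is an assumption made here so that the resulting columns have names.

```r
library(GenomicRanges)
library(RaggedExperiment)

## Same toy ranges as in the minimal example in this thread
sample1 <- GRanges(
    c(a = "chr1:1-10:-", b = "chr1:11-18:+"),
    score = 1:2)
sample2 <- GRanges(
    c(c = "chr2:1-10:-", d = "chr2:11-18:+"),
    score = 3:4)
colDat <- DataFrame(id = 1:2)

## Plain coercion to the virtual class; no concrete subclass is named
grl <- as(list(sample1 = sample1, sample2 = sample2), "GRangesList")

## Inspect which concrete class the coercion actually produced
class(grl)

## The step that failed in the report
re2 <- RaggedExperiment(grl, colData = colDat)
re2
```

If the last step still raises the coercion error on an older installation, passing "CompressedGRangesList" as the target class in as() is the workaround reported earlier in this thread.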

LiNk-NY commented 4 years ago

Hi Ben, @biobenkj I've made the constructor function more robust to list inputs (commit d8b1ae5309f2afa53dd36935a344f850905fcc5e).

Thank you for reporting and providing a reproducible example. Best, Marcel