NathanSkene / EWCE

Expression Weighted Celltype Enrichment. See the package website for up-to-date instructions on usage.
https://nathanskene.github.io/EWCE/index.html

`standardise_ctd`: matrices not getting converted to sparse #73

Closed bschilder closed 2 years ago

bschilder commented 2 years ago

1. Bug description

Only the `specificity_quantiles` matrices are getting converted to sparse format; `mean_exp` and `specificity` stay dense, making CTDs larger than they need to be.

Expected behaviour

Make all matrices in CTD sparse.

2. Reproducible example

Code

ctd <- ewceData::ctd()
ctd2 <- EWCE::standardise_ctd(ctd)
EWCE:::is_sparse_matrix(ctd2[[1]]$mean_exp) # FALSE
EWCE:::is_sparse_matrix(ctd2[[1]]$specificity) # FALSE
EWCE:::is_sparse_matrix(ctd2[[1]]$specificity_quantiles) # TRUE

3. Session info


```
R Under development (unstable) (2022-02-25 r81808)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.8.so

locale:
 [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8
 [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
 [1] ewceData_1.5.0 ExperimentHub_2.5.0 AnnotationHub_3.5.0 BiocFileCache_2.5.0 dbplyr_2.2.1
 [6] BiocGenerics_0.43.0 sp_1.5-0 SeuratObject_4.1.0 Seurat_4.1.1 echoconda_0.99.6
[11] scKirby_0.1.0 EWCE_1.5.3 RNOmni_1.0.0 dplyr_1.0.9

loaded via a namespace (and not attached):
  [1] pbapply_1.5-0 lattice_0.20-45 vctrs_0.4.1
  [4] expm_0.999-6 fastICA_1.2-3 usethis_2.1.6
  [7] mgcv_1.8-40 blob_1.2.3 survival_3.3-1
 [10] spatstat.data_2.2-0 later_1.3.0 nloptr_2.0.3
 [13] DBI_1.1.3 R.utils_2.12.0 SingleCellExperiment_1.19.0
 [16] rappdirs_0.3.3 uwot_0.1.11 zlibbioc_1.43.0
 [19] rgeos_0.5-9 htmlwidgets_1.5.4 mvtnorm_1.1-3
 [22] GlobalOptions_0.1.2 future_1.26.1 leiden_0.4.2
 [25] parallel_4.2.0 irlba_2.3.5 Rcpp_1.0.9
 [28] readr_2.1.2 KernSmooth_2.23-20 promises_1.2.0.1
 [31] gdata_2.18.0.1 DDRTree_0.1.5 DelayedArray_0.23.0
 [34] limma_3.53.4 pkgload_1.3.0 clusterGeneration_1.3.7
 [37] fs_1.5.2 googleAuthR_2.0.0 fastmatch_1.1-3
 [40] mnormt_2.1.0 basilisk_1.9.2 digest_0.6.29
 [43] png_0.1-7 qlcMatrix_0.9.7 sctransform_0.3.3
 [46] cowplot_1.1.1 here_1.0.1 pkgconfig_2.0.3
 [49] docopt_0.7.1 spatstat.random_2.2-0 iterators_1.0.14
 [52] minqa_1.2.4 reticulate_1.25 SummarizedExperiment_1.27.1
 [55] circlize_0.4.15 GetoptLong_1.0.5 xfun_0.31
 [58] zoo_1.8-10 tidyselect_1.1.2 reshape2_1.4.4
 [61] purrr_0.3.4 ica_1.0-3 gprofiler2_0.2.1
 [64] viridisLite_0.4.0 rtracklayer_1.57.0 pkgbuild_1.3.1
 [67] rlang_1.0.4 glue_1.6.2 RColorBrewer_1.1-3
 [70] orthogene_1.3.1 pals_1.7 registry_0.5-1
 [73] matrixStats_0.62.0 MatrixGenerics_1.9.1 stringr_1.4.0
 [76] ggsignif_0.6.3 labeling_0.4.2 httpuv_1.6.5
 [79] class_7.3-20 webshot_0.5.3 jsonlite_1.8.0
 [82] XVector_0.37.0 sceasy_0.0.6 bit_4.0.4
 [85] mime_0.12 gridExtra_2.3 gplots_3.1.3
 [88] Rsamtools_2.13.3 Exact_3.1 stringi_1.7.8
 [91] processx_3.7.0 spatstat.sparse_2.1-1 scattermore_0.8
 [94] yulab.utils_0.0.5 quadprog_1.5-8 bitops_1.0-7
 [97] cli_3.3.0 rhdf5filters_1.9.0 maps_3.4.0
[100] RSQLite_2.2.15 tidyr_1.2.0 heatmaply_1.3.0
[103] pheatmap_1.0.12 homologene_1.4.68.19.3.27 data.table_1.14.2
[106] HGNChelper_0.8.1 rstudioapi_0.13 TSP_1.2-1
[109] GenomicAlignments_1.33.0 nlme_3.1-158 phangorn_2.9.0
[112] VariantAnnotation_1.43.2 listenv_0.8.0 miniUI_0.1.1.1
[115] gridGraphics_0.5-1 leidenbase_0.1.11 R.oo_1.25.0
[118] urlchecker_1.0.1 sessioninfo_1.2.2 readxl_1.4.0
[121] lifecycle_1.0.1 munsell_0.5.0 cellranger_1.1.0
[124] R.methodsS3_1.8.2 mapproj_1.2.8 caTools_1.18.2
[127] codetools_0.2-18 coda_0.19-4 Biobase_2.57.1
[130] GenomeInfoDb_1.33.3 lmtest_0.9-40 ontologyIndex_2.7
[133] xtable_1.8-4 ROCR_1.0-11 BiocManager_1.30.18
[136] scatterplot3d_0.3-41 abind_1.4-5 farver_2.1.1
[139] parallelly_1.32.1 RANN_2.6.1 aplot_0.1.6
[142] sparsesvd_0.2 ggtree_3.5.1 GenomicRanges_1.49.0
[145] BiocIO_1.7.1 GEOquery_2.65.2 RcppAnnoy_0.0.19
[148] goftest_1.2-3 patchwork_1.1.1 tibble_3.1.7
[151] ggdendro_0.1.23 profvis_0.3.7 dichromat_2.0-0.1
[154] cluster_2.1.3 future.apply_1.9.0 dendextend_1.16.0
[157] GeneOverlap_1.33.0 Matrix_1.4-1 tidytree_0.3.9
[160] ellipsis_0.3.2 prettyunits_1.1.1 ggridges_0.5.3
[163] igraph_1.3.4 remotes_2.4.2 downloadR_0.99.3
[166] slam_0.1-50 gargle_1.2.0 basilisk.utils_1.9.1
[169] phytools_1.0-3 spatstat.utils_2.3-1 htmltools_0.5.3
[172] piggyback_0.1.4 yaml_2.3.5 GenomicFeatures_1.49.5
[175] utf8_1.2.2 plotly_4.10.0 interactiveDisplayBase_1.35.0
[178] XML_3.99-0.10 e1071_1.7-11 ggpubr_0.4.0
[181] fitdistrplus_1.1-8 BiocParallel_1.31.10 bit64_4.0.5
[184] rootSolve_1.8.2.3 foreach_1.5.2 Biostrings_2.65.1
[187] spatstat.core_2.4-4 combinat_0.0-8 progressr_0.10.1
[190] MAGMA.Celltyping_2.0.4 devtools_2.4.4 evaluate_0.15
[193] memoise_2.0.1 VGAM_1.1-7 tzdb_0.3.0
[196] callr_3.7.1 lmom_2.9 ps_1.7.1
[199] curl_4.3.2 fansi_1.0.3 tensor_1.5
[202] cachem_1.0.6 deldir_1.0-6 babelgene_22.3
[205] dir.expiry_1.5.0 ggplot2_3.3.6 rjson_0.2.21
[208] rstatix_0.7.0 ggrepel_0.9.1 clue_0.3-61
[211] rprojroot_2.0.3 tools_4.2.0 magrittr_2.0.3
[214] RCurl_1.98-1.7 proxy_0.4-27 car_3.1-0
[217] ape_5.6-2 ggplotify_0.1.0 xml2_1.3.3
[220] httr_1.4.3 assertthat_0.2.1 rmarkdown_2.14
[223] boot_1.3-28 globals_0.15.1 R6_2.5.1
[226] Rhdf5lib_1.19.2 progress_1.2.2 KEGGREST_1.37.3
[229] treeio_1.21.0 gtools_3.9.3 shape_1.4.6
[232] corrplot_0.92 BiocVersion_3.16.0 HDF5Array_1.25.1
[235] rhdf5_2.41.1 splines_4.2.0 carData_3.0-5
[238] ggfun_0.0.6 colorspace_2.0-3 generics_0.1.3
[241] stats4_4.2.0 pillar_1.8.0 anndata_0.7.5.3
[244] HSMMSingleCell_1.17.0 GenomeInfoDbData_1.2.8 plyr_1.8.7
[247] gtable_0.3.0 monocle_2.25.1 restfulr_0.0.15
[250] knitr_1.39 ComplexHeatmap_2.13.0 biomaRt_2.53.2
[253] IRanges_2.31.0 fastmap_1.1.0 seriation_1.3.6
[256] doParallel_1.0.17 AnnotationDbi_1.59.1 broom_1.0.0
[259] BSgenome_1.65.2 scales_1.2.0 filelock_1.0.2
[262] backports_1.4.1 plotrix_3.8-2 S4Vectors_0.35.1
[265] lme4_1.1-30 gld_2.6.5 hms_1.1.1
[268] Rtsne_0.16 shiny_1.7.2 MungeSumstats_1.5.5
[271] polyclip_1.10-0 grid_4.2.0 numDeriv_2016.8-1.1
[274] DescTools_0.99.45 lazyeval_0.2.2 crayon_1.5.1
[277] MASS_7.3-58 viridis_0.6.2 rpart_4.1.16
[280] compiler_4.2.0 spatstat.geom_2.4-0
```

bschilder commented 2 years ago

Also, make `standardise_ctd` generalizable to all matrices stored in a CTD, not just those I've hard-coded into the function.
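
For illustration, a generalized conversion might look something like the sketch below. This is not the fix that went into EWCE: `to_sparse_all` and `toy_ctd` are hypothetical names, and the sketch assumes a CTD is a list of levels, each level being a named list whose dense matrices are base R numeric matrices. It walks every element instead of relying on hard-coded names and leaves non-matrix elements untouched.

```r
library(Matrix)

# Hypothetical helper (not part of EWCE): convert every dense numeric matrix
# in each level of a CTD-like list to a sparse Matrix; keep everything else.
to_sparse_all <- function(ctd) {
  lapply(ctd, function(level) {
    lapply(level, function(x) {
      if (is.matrix(x) && is.numeric(x)) {
        Matrix::Matrix(x, sparse = TRUE)
      } else {
        x
      }
    })
  })
}

# Toy CTD-like object with a single level (made-up values, not ewceData).
m <- matrix(c(0, 1, 0, 0, 2, 0), nrow = 2,
            dimnames = list(c("g1", "g2"), c("a", "b", "c")))
toy_ctd <- list(list(mean_exp = m,
                     specificity = m / sum(m),
                     annot = c("a", "b", "c")))

toy_sparse <- to_sparse_all(toy_ctd)

# Report which elements of level 1 are now sparse.
vapply(toy_sparse[[1]], function(x) methods::is(x, "sparseMatrix"), logical(1))
```

Dispatching on the element type rather than on names like `mean_exp` means any extra matrices a user stores in a level would be handled as well. Already-sparse elements fail the `is.matrix()` check and pass through unchanged.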