Closed rrydbirk closed 5 years ago
At what stage does it get stuck? Could you make this con$graph object available for us to test on? Thanks, -peter.
On Apr 16, 2019, at 05:20, rrydbirk notifications@github.com wrote:
This takes 20+ min to run.
Code:
con$embedGraph(method="UMAP")
Object:
lapply(con$samples,function(x) str(x$counts))
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:14427418] 13 58 96 117 119 134 141 148 180 198 ...
..@ p : int [1:18657] 0 317 1322 1363 1686 1717 1749 1804 2204 3250 ...
..@ Dim : int [1:2] 6255 18656
..@ Dimnames:List of 2
.. ..$ : chr [1:6255] "S1_AAACCCAAGATCGGTG" "S1_AAACCCAAGGCAGGGA" "S1_AAACCCACAAATAGCA" "S1_AAACCCACATGGAATA" ...
.. ..$ : chr [1:18656] "AL627309.1" "AL669831.5" "LINC00115" "NOC2L" ...
..@ x : Named num [1:14427418] 0.0471 0.1322 0.0632 0.0369 0.0526 ...
.. ..- attr(*, "names")= chr [1:14427418] "S1_AAACGAAGTGTGGTCC" "S1_AAAGTCCTCGGCCAAC" "S1_AACAGGGAGACATATG" "S1_AACCATGCAGAGAAAG" ...
..@ factors : list()
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:16225403] 5 28 63 75 99 118 130 142 175 220 ...
..@ p : int [1:18205] 0 389 1510 1589 1653 2184 2259 2366 3172 4162 ...
..@ Dim : int [1:2] 6470 18204
..@ Dimnames:List of 2
.. ..$ : chr [1:6470] "S2_AAACCCAAGATGGCAC" "S2_AAACCCAAGTGCGCTC" "S2_AAACCCACAACCAGAG" "ctrl_039_AAACCCACACTGCGTG" ...
.. ..$ : chr [1:18204] "AL627309.1" "AL669831.5" "LINC00115" "SAMD11" ...
..@ x : Named num [1:16225403] 0.2572 0.0987 0.0631 0.1405 0.0882 ...
.. ..- attr(*, "names")= chr [1:16225403] "S2_AAACCCAGTGAAGCTG" "S2_AAAGAACGTTTGATCG" "S2_AAAGTCCTCGCCATAA" "S2_AAATGGAGTATACGGG" ...
..@ factors : list()
Formal class 'dgCMatrix' [package "Matrix"] with 6 slots
..@ i : int [1:9283290] 16 18 23 26 50 62 65 75 100 109 ...
..@ p : int [1:17626] 0 271 313 1093 1169 1234 1549 1603 1698 1797 ...
..@ Dim : int [1:2] 3500 17625
..@ Dimnames:List of 2
.. ..$ : chr [1:3500] "S3_AAACCCACACTACACA" "S3_AAACCCAGTACTAAGA" "S3_AAACCCATCGTTTACT" "S3_AAACCCATCTTGGAAC" ...
.. ..$ : chr [1:17625] "AL627309.1" "AC114498.1" "AL669831.5" "LINC00115" ...
..@ x : Named num [1:9283290] 0.1539 0.039 0.0366 0.2743 0.0311 ...
.. ..- attr(*, "names")= chr [1:9283290] "S3_AAAGGATCACGCAAAG" "S3_AAAGGATGTAGCTCGC" "S3_AAAGGGCTCGGTTGTA" "S3_AAAGGTACACGCACCA" ...
..@ factors : list()
$S1
NULL
$S2
NULL
$S3
NULL
sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)
Matrix products: default
BLAS/LAPACK: /cm/shared/apps/intel/parallel_studio_xe/2018_update2/compilers_and_libraries_2018.2.199/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so
locale: [1] C
attached base packages: [1] stats graphics grDevices utils datasets methods base
other attached packages: [1] conos_0.0.0.9002 igraph_1.2.4 Matrix_1.2-15 RLinuxModules_0.2
loaded via a namespace (and not attached): [1] mclust_5.4.3 Rcpp_1.0.1 mvtnorm_1.0-10 [4] lattice_0.20-38 GO.db_3.7.0 class_7.3-14 [7] assertthat_0.2.1 digest_0.6.18 mime_0.6 [10] R6_2.4.0 plyr_1.8.4 stats4_3.5.0 [13] RSQLite_2.1.1 ggplot2_3.1.1 pillar_1.3.1 [16] rlang_0.3.4 lazyeval_0.2.2 diptest_0.75-7 [19] irlba_2.3.3 whisker_0.3-2 blob_1.1.1 [22] kernlab_0.9-27 S4Vectors_0.20.1 urltools_1.7.2 [25] triebeard_0.3.0 bit_1.1-14 munsell_0.5.0 [28] shiny_1.3.0 compiler_3.5.0 httpuv_1.5.1 [31] pkgconfig_2.0.2 BiocGenerics_0.28.0 base64enc_0.1-3 [34] pcaMethods_1.74.0 htmltools_0.3.6 nnet_7.3-12 [37] tidyselect_0.2.5 tibble_2.1.1 gridExtra_2.3 [40] pagoda2_0.0.0.9002 IRanges_2.16.0 dendextend_1.10.0 [43] viridisLite_0.3.0 crayon_1.3.4 dplyr_0.8.0.1 [46] later_0.8.0 MASS_7.3-51.1 grid_3.5.0 [49] xtable_1.8-3 gtable_0.3.0 DBI_1.0.0 [52] magrittr_1.5 scales_1.0.0 dendsort_0.3.3 [55] viridis_0.5.1 promises_1.0.1 flexmix_2.3-15 [58] robustbase_0.93-4 brew_1.0-6 rjson_0.2.20 [61] tools_3.5.0 fpc_2.1-11.1 bit64_0.9-7 [64] Biobase_2.42.0 glue_1.3.1 trimcluster_0.1-2.1 [67] DEoptimR_1.0-8 purrr_0.3.2 Rook_1.1-1 [70] parallel_3.5.0 AnnotationDbi_1.44.0 colorspace_1.4-1 [73] cluster_2.0.7-1 prabclus_2.2-7 memoise_1.1.0 [76] modeltools_0.2-22
con$embedGraph(method="UMAP")
Convert graph to adjacency list... Done
Estimate nearest neighbors and commute times...
Estimating hitting distances: 14:30:47.
0% 10 20 30 40 50 60 70 80 90 100%
[----|----|----|----|----|----|----|----|----|----|
(This step easily takes 20+ min regardless of n.cores.)
Please let me know how I should forward my conos object to you.
Probably the easiest way to share it is to do saveRDS(con$graph, file='graph.rds') and put the file on Dropbox/Google Drive somewhere. It shouldn't be too large. Thanks, -peter.
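For reference, the save-and-share step suggested above could look roughly like the sketch below. To keep it runnable without a conos object, it uses a small stand-in list named graph (hypothetical, not from the thread) in place of con$graph, and writes to a temporary directory rather than a shared folder.

```r
# Stand-in for con$graph (an assumption for illustration only);
# with a real conos object you would pass con$graph instead.
graph <- list(
  edges   = data.frame(from = c("a", "b"), to = c("b", "c"),
                       stringsAsFactors = FALSE),
  weights = c(0.5, 1.2)
)

# Write the object to a single .rds file
path <- file.path(tempdir(), "graph.rds")
saveRDS(graph, file = path)

# Check the file size before uploading it anywhere
size_mb <- file.size(path) / 1024^2
cat(sprintf("graph.rds is %.4f MB\n", size_mb))

# The recipient restores the object with readRDS()
restored <- readRDS(path)
```

readRDS() returns the object itself, so the recipient can assign it to any name they like.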
I'm surprised that it takes that long for 3 datasets of a few thousand cells each. Moreover, the number of cores should matter here. In general, though, for datasets of around a hundred thousand cells, twenty minutes is completely fine even with a large number of cores (~30).
Sorry for not getting back to you before. The graph object can be downloaded here: https://www.dropbox.com/s/5djfr90twdmkzf6/graph.rds?dl=1
Indeed, processing time scales linearly with the number of cores. However, the tutorial used 12k cells and 4 cores and took ~1 min to run, so I'm surprised that my example with ~16k cells takes 26 min with 10 cores.
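To put rough numbers on that comparison, here is a back-of-the-envelope sketch. It assumes runtime scales linearly with cell count and inversely with core count; the linear-in-cells part is a simplifying assumption, not something stated in the thread, and the variable names are illustrative.

```r
# Figures quoted in the thread
tutorial_cells <- 12000; tutorial_cores <- 4;  tutorial_min <- 1
my_cells       <- 16000; my_cores       <- 10; observed_min <- 26

# Expected runtime under the (assumed) linear scaling in cells and 1/cores
expected_min <- tutorial_min * (my_cells / tutorial_cells) *
  (tutorial_cores / my_cores)

# How much slower the observed run is than this naive expectation
slowdown <- observed_min / expected_min

cat(sprintf("expected ~%.2f min, observed %d min (~%.0fx slower)\n",
            expected_min, observed_min, slowdown))
```

A gap of this size is hard to explain by dataset size alone, which points towards the parallel setup rather than the data.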
The problem was caused by a faulty installation of the parallel package.
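For anyone hitting similar symptoms, a quick sanity check of the parallel package might look like the sketch below. It is not taken from the thread: it times a deliberately slow toy task (slow_task, a hypothetical helper) serially and with mclapply(), which relies on forking and so is meant for Linux/macOS, as in the session above.

```r
library(parallel)

# How many cores R believes are available
cat("detectCores():", detectCores(), "\n")

# Toy task that just sleeps, so any speed-up comes from parallelism alone
slow_task <- function(i) {
  Sys.sleep(0.2)
  i
}

n_cores <- 2

# Serial baseline: 4 tasks one after another
t_serial <- system.time(lapply(1:4, slow_task))[["elapsed"]]

# Forked workers: 4 tasks spread over n_cores processes
t_parallel <- system.time(
  res <- mclapply(1:4, slow_task, mc.cores = n_cores)
)[["elapsed"]]

cat(sprintf("serial: %.2f s, parallel (%d cores): %.2f s\n",
            t_serial, n_cores, t_parallel))
```

If the parallel timing is not noticeably shorter than the serial one, the installation or environment is probably preventing R from using multiple cores.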