farrellja / URD

URD - Reconstruction of Branching Developmental Trajectories
GNU General Public License v3.0

Error from force-directed layout generation #25

Closed sobjb7osx2 closed 5 years ago

sobjb7osx2 commented 5 years ago

Hi,

I was working through the quick start tutorial, and at the force-directed layout generation step I came across the following error:

Error in RANN::nn2(object@tree$walks.force.layout[, c("x", "y", "telescope.pt")], : NA/NaN/Inf in foreign function call (arg 1)

The following is the verbose message I received when "treeForceDirectedLayout" was run:

[1] "2018-10-11 11:07:27 : Starting with parameters fr 120 NN 2 D 20387 cells"
[1] "Removing 0 cells that are not assigned a pseudotime or a segment in the tree."
[1] "2018-10-11 11:07:27: Preparing walk data."
[1] "2018-10-11 11:07:27: Calculating nearest neighbor graph."
[1] "2018-10-11 11:07:32: Preparing edge list."
[1] "2018-10-11 11:07:42: Removing 6.61% of edges that are between segments with distance > 2"
[1] "2018-10-11 11:07:43: Trimming cells that are no longer well connected."
[1] "2018-10-11 11:07:45: 99.93% of starting cells preserved."
[1] "2018-10-11 11:07:45: Preparing igraph object."
[1] "2018-10-11 11:07:46: Doing force-directed layout."
[1] "2018-10-11 11:13:09: Calculating Z."
[1] "2018-10-11 11:13:25: Calculating local density."

I would appreciate any input in overcoming this error. Thanks!

ksr2018 commented 5 years ago

Hi, I have the same error and I'm stuck here. Does anyone know why this is happening?

`> combined.tree <- treeForceDirectedLayout(combined.tree, num.nn = 130, method = "fr", cut.unconnected.segments = 2, min.final.neighbors = 4, verbose = T)
[1] "2019-01-29 02:56:57 : Starting with parameters fr 130 NN 2 D 27837 cells"
[1] "Removing 0 cells that are not assigned a pseudotime or a segment in the tree."
[1] "2019-01-29 02:56:57: Preparing walk data."
[1] "2019-01-29 02:56:57: Calculating nearest neighbor graph."
[1] "2019-01-29 02:57:06: Preparing edge list."
[1] "2019-01-29 02:57:22: Removing 3.72% of edges that are between segments with distance > 2"
[1] "2019-01-29 02:57:22: Trimming cells that are no longer well connected."
[1] "2019-01-29 02:57:25: 99.75% of starting cells preserved."
[1] "2019-01-29 02:57:25: Preparing igraph object."
[1] "2019-01-29 02:57:27: Doing force-directed layout."
[1] "2019-01-29 03:01:41: Calculating Z."
[1] "2019-01-29 03:02:07: Calculating local density."
Error in RANN::nn2(object@tree$walks.force.layout[, c("x", "y", "telescope.pt")], : NA/NaN/Inf in foreign function call (arg 1)

sessionInfo()
R version 3.5.2 (2018-12-20)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] URD_1.0.2 Seurat_2.3.4 Matrix_1.2-15 cowplot_0.9.4 ggplot2_3.1.0

loaded via a namespace (and not attached):
[1] readxl_1.2.0 snow_0.4-3 backports_1.1.3 Hmisc_4.1-1 RcppEigen_0.3.3.5.0
[6] plyr_1.8.4 igraph_1.2.2 lazyeval_0.2.1 sp_1.3-1 splines_3.5.2
[11] BiocParallel_1.14.2 GenomeInfoDb_1.16.0 digest_0.6.18 foreach_1.4.4 htmltools_0.3.6
[16] viridis_0.5.1 lars_1.2 gdata_2.18.0 magrittr_1.5 checkmate_1.9.1
[21] cluster_2.0.7-1 mixtools_1.1.0 ROCR_1.0-7 openxlsx_4.1.0 gmodels_2.18.1
[26] matrixStats_0.54.0 R.utils_2.7.0 xts_0.11-2 colorspace_1.4-0 ggrepel_0.8.0
[31] haven_2.0.0 xfun_0.4 dplyr_0.7.8 crayon_1.3.4 RCurl_1.95-4.11
[36] jsonlite_1.6 bindr_0.1.1 survival_2.43-3 zoo_1.8-4 iterators_1.0.10
[41] ape_5.2 glue_1.3.0 gtable_0.2.0 zlibbioc_1.26.0 XVector_0.20.0
[46] DelayedArray_0.6.6 car_3.0-2 kernlab_0.9-27 prabclus_2.2-7 BiocGenerics_0.26.0
[51] DEoptimR_1.0-8 abind_1.4-5 VIM_4.7.0 scales_1.0.0 mvtnorm_1.0-8
[56] ggthemes_4.0.1 bibtex_0.4.2 Rcpp_1.0.0 metap_1.0 dtw_1.20-1
[61] viridisLite_0.3.0 laeken_0.5.0 htmlTable_1.13.1 units_0.6-2 reticulate_1.10
[66] foreign_0.8-71 bit_1.1-14 proxy_0.4-22 mclust_5.4.2 SDMTools_1.1-221
[71] Formula_1.2-3 stats4_3.5.2 tsne_0.1-3 vcd_1.4-4 htmlwidgets_1.3
[76] httr_1.4.0 gplots_3.0.1 RColorBrewer_1.1-2 fpc_2.1-11.1 acepack_1.4.1
[81] modeltools_0.2-22 ica_1.0-2 farver_1.1.0 pkgconfig_2.0.2 R.methodsS3_1.7.1
[86] flexmix_2.3-14 nnet_7.3-12 labeling_0.3 tidyselect_0.2.5 rlang_0.3.1
[91] reshape2_1.4.3 munsell_0.5.0 cellranger_1.1.0 tools_3.5.2 ggridges_0.5.1
[96] stringr_1.3.1 npsurv_0.4-0 knitr_1.21 bit64_0.9-7 fitdistrplus_1.0-11
[101] zip_1.0.0 robustbase_0.93-3 caTools_1.17.1.1 purrr_0.2.5 RANN_2.6.1
[106] ggraph_1.0.2 bindrcpp_0.2.2 pbapply_1.3-4 nlme_3.1-137 R.oo_1.22.0
[111] hdf5r_1.0.1 compiler_3.5.2 rstudioapi_0.9.0 curl_3.3 png_0.1-7
[116] e1071_1.7-0.1 lsei_1.2-0 tweenr_1.0.1 smoother_1.1 tibble_2.0.1
[121] stringi_1.2.4 forcats_0.3.0 lattice_0.20-38 trimcluster_0.1-2.1 pillar_1.3.1
[126] Rdpack_0.10-1 lmtest_0.9-36 data.table_1.12.0 bitops_1.0-6 irlba_2.3.2
[131] gbRd_0.4-11 GenomicRanges_1.32.7 R6_2.3.0 latticeExtra_0.6-28 KernSmooth_2.23-15
[136] gridExtra_2.3 rio_0.5.16 IRanges_2.14.12 codetools_0.2-16 boot_1.3-20
[141] MASS_7.3-51.1 gtools_3.8.1 assertthat_0.2.0 destiny_2.10.2 SummarizedExperiment_1.10.1
[146] minpack.lm_1.2-1 withr_2.1.2 S4Vectors_0.18.3 GenomeInfoDbData_1.1.0 diptest_0.75-7
[151] parallel_3.5.2 doSNOW_1.0.16 hms_0.4.2 grid_3.5.2 rpart_4.1-13
[156] tidyr_0.8.2 class_7.3-15 carData_3.0-2 segmented_0.5-3.0 Rtsne_0.15
[161] TTR_0.23-4 ggforce_0.1.3 scatterplot3d_0.3-41 Biobase_2.40.0 base64enc_0.1-3 `

decarlin commented 5 years ago

Hmm, the first thing I would check is object@tree$walks.force.layout[, c("x", "y", "telescope.pt")], where "object" is replaced by the name of your URD object, and look for NaNs. Perhaps there was an earlier step to calculate the tree? Have you successfully run plotTree?
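A minimal sketch of that check, using a toy data frame in place of the real layout slot. The data and the helper name count_non_finite are made up for illustration; on a real object you would pass object@tree$walks.force.layout[, c("x", "y", "telescope.pt")] instead.

```r
# Toy stand-in for the layout coordinates; values are invented for illustration.
layout.coords <- data.frame(
  x = c(0.1, NaN, 0.3, 0.4),
  y = c(1.0, 2.0, Inf, 4.0),
  telescope.pt = c(0.5, 0.6, 0.7, NA)
)

# Hypothetical helper: count entries per column that are NA, NaN, or +/-Inf.
count_non_finite <- function(df) {
  colSums(!is.finite(as.matrix(df)))
}

bad.per.column <- count_non_finite(layout.coords)
print(bad.per.column)

# Rows (cells) with at least one non-finite coordinate.
bad.rows <- which(rowSums(!is.finite(as.matrix(layout.coords))) > 0)
print(bad.rows)
```

Any non-zero count means the matrix handed to RANN::nn2 contains values it cannot process, which matches the "NA/NaN/Inf in foreign function call" message.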

ksr2018 commented 5 years ago

Hi Dan, Thanks for your suggestion. I am able to run plotTree successfully. When I check myobjectname@tree$walks.force.layout[, c("x", "y", "telescope.pt")], it returns NULL. Not sure how to proceed.

farrellja commented 5 years ago

Hi. This is a pernicious bug that I've had trouble finding what I consider an optimal solution for.

I think it results when there is a group of cells that all have the same visitation by the random walks. The force-directed layout builds a k-NN graph based on random walk visitation; if there are enough cells with exactly the same visitation parameters, then there can be cells with 0 distance to all of their nearest neighbors, which results in bizarre behavior (divide-by-0s and other such events). In the debug branch, I've modified the treeForceDirectedLayout function to discard those cells. I would like to preserve them and treat them as a special case, but my previous ideas for doing that have not produced nice layouts.
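To make the failure mode concrete, here is a small sketch with invented data (this is not URD's actual code). Three cells share an identical visitation profile, so their pairwise distances are exactly 0, and any step that normalizes by such a distance yields NaN. The matrix `visitation` and its values are assumptions for illustration only.

```r
# Invented visitation matrix: rows are cells, columns are visitation frequencies.
visitation <- rbind(
  cellA = c(0.2, 0.8),
  cellB = c(0.5, 0.5),
  cellC = c(0.5, 0.5),
  cellD = c(0.5, 0.5),
  cellE = c(0.9, 0.1)
)

# Pairwise Euclidean distances between cells.
d <- as.matrix(dist(visitation))

# Distances from cellB to its two nearest neighbors (excluding itself).
nn.dist <- sort(d["cellB", colnames(d) != "cellB"])[1:2]
print(nn.dist)

# Normalizing by the summed neighbor distance divides 0 by 0.
weights <- nn.dist / sum(nn.dist)
print(weights)

# Flag every cell whose profile is shared with at least one other cell.
dup <- duplicated(visitation) | duplicated(visitation, fromLast = TRUE)
print(rownames(visitation)[dup])
```

Dropping the flagged cells before building the neighbor graph is the simple option described above; keeping them would need special handling of the zero distances.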

Try reinstalling URD from the debug branch (devtools::install_github(repo="farrellja/URD", ref = "debug")). Let me know if the function now generates a layout successfully. The expectation is that some problematic cells are now discarded, which will be reported in the cells preserved output (i.e. 99.75% of starting cells preserved should now be a lower number compared to previously.) Let me know if that allows the function to proceed and how many extra cells are being thrown out.

ksr2018 commented 5 years ago

Hi, Thanks so much for looking into this. I removed the version of URD I had and reinstalled the debug version, but I still get the same error. The number of cells preserved also remains the same as before (99.75%).
Also, FYI I'm not removing poorly visited cells as suggested in the supplementary analysis, as I'm facing errors there too. Do you think this could be adversely affecting the force layout?

farrellja commented 5 years ago

Hi, ksr2018.

That's pretty surprising. I really suspected that was the issue; will you confirm that the newest version installed? Type grep("na.rm", body(treeForceDirectedLayout)). If it is the updated version, it should return 55; otherwise it will return integer(0).
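For anyone curious why this check works: body() returns the function's code as a language object, and grep() coerces each top-level statement to text, so the result is the position of the statement that mentions na.rm. A toy illustration follows (the two functions are made up, not URD code):

```r
# Two made-up versions of a function; only the second one uses na.rm.
layout_old <- function(x) {
  y <- x + 1
  mean(y)
}

layout_new <- function(x) {
  y <- x + 1
  mean(y, na.rm = TRUE)
}

# No statement mentions na.rm, so nothing matches.
print(grep("na.rm", body(layout_old)))

# Element 1 of the body is the opening `{`, so the mean() call is element 3.
print(grep("na.rm", body(layout_new)))
```

The index itself only identifies which statement matched; what matters for the version check is whether anything matches at all.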

If the newest version didn't install correctly, try this instead: devtools::install_github(repo="farrellja/URD@debug") (To save you time: you shouldn't have to uninstall the old version before running that.)

If the newest version is installed, then can you send me a reproducible example so that I can try to debug? (i.e. save your workspace and email me a Dropbox/Google link to it and tell me the exact command that you're running that's failing)

ksr2018 commented 5 years ago

@farrellja I have installed the newest version. grep("na.rm", body(treeForceDirectedLayout)) returns 55. I have emailed you a Dropbox link to my workspace. Let me know if you have any issues accessing it. Thanks again for helping me.

ksr2018 commented 5 years ago

Hi @farrellja, just checking if you had the chance to look into this issue?

farrellja commented 5 years ago

This bug is (I think) fixed in 1.0.3.

decarlin commented 5 years ago

"This bug is (I think) fixed in 1.0.3."

Just ran into this old bug again!

decarlin commented 5 years ago

Ah, never mind, it went away when I restarted the session. Probably something conflicting with a previous run. Please ignore.

ZhaoxiangSimonCai commented 4 years ago

Hi @farrellja, I now encounter this error at an earlier place in the code. https://github.com/farrellja/URD/blob/master/R/tree-force-layout.R#L113

axial.tree <- treeForceDirectedLayout(axial.tree, num.nn=100, cut.unconnected.segments=2, verbose=T)
[1] "2019-09-16 13:36:43 : Starting with parameters fr 100 NN 2 D 8115 cells"
[1] "Removing 0 cells that are not assigned a pseudotime or a segment in the tree."
[1] "2019-09-16 13:36:43: Preparing walk data."
[1] "2019-09-16 13:36:43: Calculating nearest neighbor graph."
Error in RANN::nn2(data = walk.data, query = walk.data, k = max(num.nn) + : NA/NaN/Inf in foreign function call (arg 1)
In addition: Warning message:
In treeForceDirectedLayout(axial.tree, num.nn = 100, cut.unconnected.segments = 2, : 17 cells have duplicate random walk coordinates and are being removed from the layout.

I am already using your dev branch. Any idea on how I can fix this one? Any help is appreciated!

======================================================================

OK, I think I just figured it out. For some reason, I have a few cells with walk.total = 0; however, they have non-zero pseudotime and hence are not filtered out at https://github.com/farrellja/URD/blob/master/R/tree-force-layout.R#L86. I pretty much followed the tutorial to build my tree with a different dataset.

I was able to get the layout working with the cells.to.do parameter. This is just FYI in case anything needs to be fixed. Thanks.
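Roughly, the workaround looks like the sketch below. The per-cell table `cell.info` and its values are invented for illustration, and where walk.total actually lives inside a URD object is not shown in this thread, so treat that lookup as an assumption. Only cells.to.do and the treeForceDirectedLayout call come from the discussion above.

```r
# Invented per-cell table: one row per cell (hypothetical stand-in for the real data).
cell.info <- data.frame(
  pseudotime = c(0.10, 0.35, 0.60, 0.80),
  walk.total = c(12, 0, 7, 0),
  row.names = c("cell1", "cell2", "cell3", "cell4")
)

# Keep cells that have a pseudotime AND were visited by at least one walk.
keep <- rownames(cell.info)[!is.na(cell.info$pseudotime) & cell.info$walk.total > 0]
print(keep)

# Then pass the surviving cell names to the layout
# (not run here; it needs URD and a built tree):
# axial.tree <- treeForceDirectedLayout(axial.tree, num.nn = 100,
#   cut.unconnected.segments = 2, cells.to.do = keep, verbose = TRUE)
```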

farrellja commented 4 years ago

@Olimon660 I added a parameter cells.min.walked to treeForceDirectedLayout in debug/1.1.0.9004, with a default of 1, that automatically removes any cells with walk.total = 0. So, it should prevent this problem in the future. It could also save people the step of filtering out poorly visited cells (which I did in the zebrafish layout), because now they can just set it to a higher number if they want. Thanks so much for letting me know what you found; I added a thanks to you to the documentation for that function.