Open missuse opened 3 years ago
Hi,
first I'd like to thank you for a great package. I have been using it for years.
To the problem:
A simple dataset:
library(phyloseq) data(enterotype) sub_enter <- subset_samples(enterotype, SampleID %in% c("DA.AD.4", "ES.AD.1", "ES.AD.2", "ES.AD.3", "FR.AD.1", "FR.AD.2"))
distance(sub_enter, "jaccard")
DA.AD.4 ES.AD.1 ES.AD.2 ES.AD.3 FR.AD.1 ES.AD.1 0.7697417 ES.AD.2 0.5515371 0.7112008 ES.AD.3 0.5972919 0.5868003 0.3124345 FR.AD.1 0.4585093 0.6707855 0.5268626 0.5223181 FR.AD.2 0.4301436 0.6708381 0.5342431 0.5480318 0.2894279
Here we see that the most similar samples are FR.AD.1 and FR.AD.2 ( 0.2894279) as well as ES.AD.2 and ES.AD.3 (0.3124345)
set.seed(1) plot_net(sub_enter, "jaccard", point_label = "SampleID")
On the graph ES.AD.2 and ES.AD.3 are not connected at all. On the other hand FR.AD.1 is connected to ES.AD.2 (distance 0.5268626)
Could you explain why?
EDIT:
I trust this would be the correct graph:
The problem I believe is in lines:
link_layout = function(LinksData, vertexDT) { linkstart = copy(vertexDT[LinksData$v1, x, y]) linkend = copy(vertexDT[LinksData$v2, x, y]) setnames(linkend, old = c("y", "x"), new = c("yend", "xend")) LinksData <- copy(cbind(LinksData, linkstart, linkend)) return(LinksData) }
These should be changed to
link_layout = function(LinksData, vertexDT) { linkstart = copy(vertexDT[LinksData$v1, 1:2]) linkend = copy(vertexDT[LinksData$v2, 1:2]) setnames(linkend, old = c("y", "x"), new = c("yend", "xend")) LinksData <- copy(cbind(LinksData, linkstart, linkend)) return(LinksData) }
current function example
plot_net(sub_enter, "jaccard", point_label = "SampleID", maxdist = 0.5)`
changed function with the above mentioned lines:
I think the latter makes much more sense in the context of the distance matrix used
also
plot_network( make_network(sub_enter, type = "samples", "jaccard", max.dist = 0.5))
Seems to be coherent with the output of the modified plot_net() function.
plot_net()
sessionInfo()
R version 4.0.2 (2020-06-22) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64 (build 18363) Matrix products: default locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252 LC_NUMERIC=C LC_TIME=English_United States.1252 attached base packages: [1] stats4 parallel stats graphics grDevices utils datasets methods base other attached packages: [1] ggraph_2.0.4 gdtools_0.2.2 ggiraph_0.7.8 data.table_1.13.0 igraph_1.2.5 ggalt_0.4.0 plotly_4.9.2.1 textclean_0.9.3 ggrepel_0.8.2 [10] cowplot_1.1.0 phyloseq_1.32.0 vegan_2.5-6 lattice_0.20-41 permute_0.9-5 kableExtra_1.3.1 dada2_1.16.0 Rcpp_1.0.5 forcats_0.5.0 [19] stringr_1.4.0 dplyr_1.0.2 purrr_0.3.4 readr_1.3.1 tidyr_1.1.2 tibble_3.0.3 ggplot2_3.3.2 tidyverse_1.3.0 msa_1.20.1 [28] Biostrings_2.56.0 XVector_0.28.0 IRanges_2.22.2 S4Vectors_0.26.1 BiocGenerics_0.34.0 loaded via a namespace (and not attached): [1] uuid_0.1-4 readxl_1.3.1 backports_1.1.10 fastmatch_1.1-0 systemfonts_0.3.2 BiocFileCache_1.12.1 [7] plyr_1.8.6 lazyeval_0.2.2 splines_4.0.2 BiocParallel_1.22.0 GenomeInfoDb_1.24.2 digest_0.6.25 [13] foreach_1.5.0 htmltools_0.5.0 viridis_0.5.1 fansi_0.4.1 magrittr_1.5 memoise_1.1.0 [19] cluster_2.1.0 DECIPHER_2.16.1 graphlayouts_0.7.1 modelr_0.1.8 extrafont_0.17 RcppParallel_5.0.2 [25] matrixStats_0.57.0 extrafontdb_1.0 askpass_1.1 prettyunits_1.1.1 jpeg_0.1-8.1 colorspace_1.4-1 [31] rvest_0.3.6 blob_1.2.1 rappdirs_0.3.1 haven_2.3.1 xfun_0.18 crayon_1.3.4 [37] RCurl_1.98-1.2 jsonlite_1.7.1 survival_3.1-12 phangorn_2.5.5 iterators_1.0.12 ape_5.4-1 [43] glue_1.4.2 polyclip_1.10-0 gtable_0.3.0 zlibbioc_1.34.0 webshot_0.5.2 DelayedArray_0.14.1 [49] proj4_1.0-10 Rhdf5lib_1.10.1 Rttf2pt1_1.3.8 biomartr_0.9.2 maps_3.3.0 scales_1.1.1 [55] qdapRegex_0.7.2 DBI_1.1.0 viridisLite_0.3.0 progress_1.2.2 bit_4.0.4 htmlwidgets_1.5.2 [61] httr_1.4.2 RColorBrewer_1.1-2 ellipsis_0.3.1 farver_2.0.3 pkgconfig_2.0.3 XML_3.99-0.5 [67] dbplyr_1.4.4 utf8_1.1.4 labeling_0.3 tidyselect_1.1.0 rlang_0.4.7 reshape2_1.4.4 [73] AnnotationDbi_1.50.3 munsell_0.5.0 cellranger_1.1.0 tools_4.0.2 cli_2.0.2 generics_0.0.2 [79] RSQLite_2.2.1 ade4_1.7-15 broom_0.7.1 evaluate_0.14 biomformat_1.16.0 yaml_2.2.1 [85] fs_1.5.0 knitr_1.30 bit64_4.0.5 tidygraph_1.2.0 nlme_3.1-148 ash_1.0-15 [91] xml2_1.3.2 biomaRt_2.44.1 compiler_4.0.2 rstudioapi_0.11 curl_4.3 png_0.1-7 [97] reprex_0.3.0 tweenr_1.0.1 stringi_1.5.3 Matrix_1.2-18 multtest_2.44.0 vctrs_0.3.4 [103] pillar_1.4.6 lifecycle_0.2.0 bitops_1.0-6 GenomicRanges_1.40.0 R6_2.4.1 latticeExtra_0.6-29 [109] hwriter_1.3.2 ShortRead_1.46.0 gridExtra_2.3 KernSmooth_2.23-17 writexl_1.3.1 codetools_0.2-16 [115] MASS_7.3-51.6 assertthat_0.2.1 rhdf5_2.32.4 SummarizedExperiment_1.18.2 openssl_1.4.3 withr_2.3.0 [121] GenomicAlignments_1.24.0 Rsamtools_2.4.0 GenomeInfoDbData_1.2.3 mgcv_1.8-31 hms_0.5.3 quadprog_1.5-8 [127] grid_4.0.2 rmarkdown_2.4 ggforce_0.3.2 Biobase_2.48.0 lubridate_1.7.9 tinytex_0.26
All the best,
Milan
Thank you Milan! I was noticing a similar effect when using the plot_net function on my metabarcoding data, and your code fixed it!
Hi,
first I'd like to thank you for a great package. I have been using it for years.
To the problem:
A simple dataset:
distance(sub_enter, "jaccard")
output
Here we see that the most similar samples are FR.AD.1 and FR.AD.2 ( 0.2894279) as well as ES.AD.2 and ES.AD.3 (0.3124345)
On the graph ES.AD.2 and ES.AD.3 are not connected at all. On the other hand FR.AD.1 is connected to ES.AD.2 (distance 0.5268626)
Could you explain why?
EDIT:
I trust this would be the correct graph:
The problem I believe is in lines:
These should be changed to
current function example
changed function with the above mentioned lines:
I think the latter makes much more sense in the context of the distance matrix used
distance(sub_enter, "jaccard")
also
Seems to be coherent with the output of the modified
plot_net()
function.sessionInfo()
All the best,
Milan