egeulgen / pathfindR

pathfindR: Enrichment Analysis Utilizing Active Subnetworks
https://egeulgen.github.io/pathfindR/

Duplicated factor level when creating term_gene_heatmap #140

Closed Dane-Z closed 1 year ago

Dane-Z commented 2 years ago

Hello, I'm getting this bug when attempting to make the term_gene_heatmap for GO_BP results. Execution halts here for GO_BP but not for the other gene sets that I've been able to run so far (GO_MF and GO_CC). I checked the clustered term file and found a duplicate entry: one GO term is listed as a member of two clusters. Could this be causing the issue? I'm not sure what would lead to it.

Error in `levels<-`(`*tmp*`, value = as.character(levels)) : 
  factor level [8] is duplicated
Calls: pathfindR_enrichment -> run_pathf -> print -> term_gene_heatmap -> factor
Execution halted
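
For context, this message comes from base R's `factor()` and appears whenever the `levels` argument contains a repeated value. The sketch below is a minimal reproduction with made-up term names; it does not use pathfindR and only illustrates the mechanism behind the message, not where in `term_gene_heatmap` the duplicate originates.

```r
# Made-up term names; the third entry repeats the first
terms <- c("term A", "term B", "term A")

# Passing levels with a repeated value raises the "duplicated" error
result <- tryCatch(
  factor(terms, levels = terms),
  error = function(e) conditionMessage(e)
)
print(result)

# De-duplicating the levels first avoids the error
ok <- factor(terms, levels = unique(terms))
print(levels(ok))
```

The bracketed number in the message is the position of the repeated entry in the `levels` vector, so it can hint at which term is duplicated.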


egeulgen commented 2 years ago

Hello,

Would you mind sharing the data frame as an RDS file so I can try to reproduce the issue?

Dane-Z commented 2 years ago

> Hello,
>
> Would you mind sharing the data frame as an RDS file so I can try to reproduce the issue?

I'm trying to upload the file in the comment, but I'm getting a warning saying that the file type isn't supported.

Dane-Z commented 2 years ago

This also occurs sometimes for the output of other gene sets. For example, when re-running pathfindR, GO_BP didn't produce this error, but Reactome did. This is on the same dataset with the same parameters (n_iter = 10, min g_set = 10, max g_set = 300).

[1] "enrichment_Reactome"
Number of genes provided in input: 9027
Number of genes in input after p-value filtering: 104
Could not find any interactions for 2 (1.92%) genes in the PIN
Final number of genes in input: 102
Found 86 active subnetworks
Found 195 active subnetworks
Found 195 active subnetworks
Found 158 active subnetworks
.
.
Number of genes provided in input: 9027
Number of genes in input after p-value filtering: 104
Could not find any interactions for 2 (1.92%) genes in the PIN
Final number of genes in input: 102
Number of genes provided in input: 9027
Number of genes in input after p-value filtering: 104
Could not find any interactions for 2 (1.92%) genes in the PIN
Final number of genes in input: 102
Error in `levels<-`(`*tmp*`, value = as.character(levels)) : 
  factor level [9] is duplicated
Calls: pathfindR_enrichment -> run_pathf -> print -> term_gene_heatmap -> factor

egeulgen commented 2 years ago

Can you share the command that produced this?

Dane-Z commented 2 years ago

pdf(file = paste0("term_gene_heatmap_", enrichment_db, ".pdf"), width = 11.69, height = 8.27)
print(term_gene_heatmap(result_df = output_df_pathfindR, genes_df = input_df_pathfindR,
                        use_description = TRUE, num_terms = 10))
dev.off()

egeulgen commented 2 years ago

Can you zip together the RDS files for output_df_pathfindR and input_df_pathfindR and share the archive? Uploading a zip file should be allowed.
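
Since the call above passes `use_description = TRUE` and `num_terms = 10`, one quick check on the result data frame is whether any description is repeated among the top terms. The following is only a sketch on toy data: `find_duplicated_terms` is a hypothetical helper, the column names `ID`, `Term_Description` and `lowest_p` are assumed to match the pathfindR output, and ordering by `lowest_p` is an assumption about how the top terms are selected. Repeated descriptions are one plausible cause of the error, not a confirmed one.

```r
# Toy stand-in for a pathfindR result data frame (assumed column names)
result_df <- data.frame(
  ID = c("term_1", "term_2", "term_3"),
  Term_Description = c("Signaling by X", "Cell cycle", "Signaling by X"),
  lowest_p = c(0.001, 0.002, 0.003),
  stringsAsFactors = FALSE
)

# Hypothetical helper: among the top `num_terms` rows (ordered by lowest_p),
# return every row whose value in `column` occurs more than once
find_duplicated_terms <- function(result_df, num_terms = 10,
                                  column = "Term_Description") {
  ordered_df <- result_df[order(result_df$lowest_p), , drop = FALSE]
  top_df <- utils::head(ordered_df, num_terms)
  values <- top_df[[column]]
  is_dup <- duplicated(values) | duplicated(values, fromLast = TRUE)
  top_df[is_dup, , drop = FALSE]
}

dups <- find_duplicated_terms(result_df)
print(dups)
```

With the real data, this could be run as `find_duplicated_terms(output_df_pathfindR)`. An empty result would suggest that the duplicate comes from somewhere else.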

Dane-Z commented 2 years ago

pathfindR_df.zip

egeulgen commented 2 years ago

I cannot reproduce this issue. Can you try downloading the latest development version and see if this resolves the issue?

install.packages("devtools") # if you have not installed "devtools" 
devtools::install_github("egeulgen/pathfindR")

Dane-Z commented 1 year ago

Sorry, I may have attached one that passed on a different pathway. Here is a zip folder with the input/output data frames: New Folder With Items.zip

In this example, the error occurs when printing the term gene heatmap for the cell markers gene sets. Here is the error:

Final number of genes in input: 102
Error in `levels<-`(`*tmp*`, value = as.character(levels)) : 
  factor level [7] is duplicated
Calls: pathfindR_enrichment -> run_pathf -> print -> term_gene_heatmap -> factor

In the meantime, I'll install the dev version locally and test again. This may take a lot more time, though, because my local workstation is very slow.

egeulgen commented 1 year ago

Hello again, I couldn't reproduce the issue with this data either. Please try the latest development version and let me know if it persists.

Dane-Z commented 1 year ago

Really strange. I am not able to replicate the errors locally either, but they appear on our server for multiple members of our team. Could this be a package issue? I've attached the results from session_info() below.

Also, I checked the version that we are using on our server, and it is up to date with the latest pathfindR version, 1.6.4.


Session info ---------------------------------------------------------------
 setting  value
 version  R version 4.1.2 (2021-11-01)
 os       CentOS Linux 7 (Core)
 system   x86_64, linux-gnu
 ui       X11
 language (EN)
 collate  C
 ctype    C
 tz       America/New_York
 date     2022-09-07
 pandoc   2.11.2 @ /programs/x86_64-linux/system/biogrids_bin/pandoc

- Packages -------------------------------------------------------------------
 package          * version  date (UTC) lib source
 AnnotationDbi    * 1.56.2   2021-11-09 [1] Bioconductor
 ape                5.6-2    2022-03-02 [1] CRAN (R 4.1.1)
 aplot              0.1.6    2022-06-03 [1] CRAN (R 4.1.2)
 assertthat         0.2.1    2019-03-21 [1] CRAN (R 4.1.1)
 backports          1.4.1    2021-12-13 [1] CRAN (R 4.1.1)
 Biobase          * 2.54.0   2021-10-26 [1] Bioconductor
 BiocGenerics     * 0.40.0   2021-10-26 [1] Bioconductor
 BiocParallel       1.28.3   2021-12-09 [1] Bioconductor
 Biostrings         2.62.0   2021-10-26 [1] Bioconductor
 bit                4.0.4    2020-08-04 [1] CRAN (R 4.1.1)
 bit64              4.0.5    2020-08-30 [1] CRAN (R 4.1.1)
 bitops             1.0-7    2021-04-24 [1] CRAN (R 4.1.1)
 blob               1.2.3    2022-04-10 [1] CRAN (R 4.1.1)
 brio               1.1.3    2021-11-30 [1] CRAN (R 4.1.1)
 broom              1.0.0    2022-07-01 [1] CRAN (R 4.1.2)
 cachem             1.0.6    2021-08-19 [1] CRAN (R 4.1.1)
 callr              3.7.2    2022-08-22 [1] CRAN (R 4.1.2)
 cellranger         1.1.0    2016-07-27 [1] CRAN (R 4.1.1)
 cli                3.3.0    2022-04-25 [1] CRAN (R 4.1.1)
 clusterProfiler  * 4.2.2    2022-01-13 [1] Bioconductor
 colorspace         2.0-3    2022-02-21 [1] CRAN (R 4.1.1)
 crayon             1.5.1    2022-03-26 [1] CRAN (R 4.1.1)
 data.table       * 1.14.2   2021-09-27 [1] CRAN (R 4.1.1)
 DBI                1.1.3    2022-06-18 [1] CRAN (R 4.1.2)
 dbplyr             2.2.1    2022-06-27 [1] CRAN (R 4.1.2)
 desc               1.4.1    2022-03-06 [1] CRAN (R 4.1.1)
 devtools         * 2.4.3    2021-11-30 [1] CRAN (R 4.1.1)
 digest             0.6.29   2021-12-01 [1] CRAN (R 4.1.1)
 DO.db              2.9      2020-08-05 [1] Bioconductor
 DOSE               3.20.1   2021-11-18 [1] Bioconductor
 downloader         0.4      2015-07-09 [1] CRAN (R 4.1.1)
 dplyr            * 1.0.9    2022-04-28 [1] CRAN (R 4.1.1)
 ellipsis           0.3.2    2021-04-29 [1] CRAN (R 4.1.1)
 enrichplot       * 1.14.2   2022-02-24 [1] Bioconductor
 fansi              1.0.3    2022-03-24 [1] CRAN (R 4.1.1)
 farver             2.1.1    2022-07-06 [1] CRAN (R 4.1.2)
 fastmap            1.1.0    2021-01-25 [1] CRAN (R 4.1.1)
 fastmatch          1.1-3    2021-07-23 [1] CRAN (R 4.1.1)
 fgsea              1.20.0   2021-10-26 [1] Bioconductor
 forcats          * 0.5.2    2022-08-19 [1] CRAN (R 4.1.2)
 fs                 1.5.2    2021-12-08 [1] CRAN (R 4.1.1)
 gargle             1.2.0    2021-07-02 [1] CRAN (R 4.1.1)
 generics           0.1.3    2022-07-05 [1] CRAN (R 4.1.2)
 GenomeInfoDb       1.30.1   2022-01-30 [1] Bioconductor
 GenomeInfoDbData   1.2.7    2022-05-09 [1] Bioconductor
 ggforce            0.3.4    2022-08-18 [1] CRAN (R 4.1.2)
 ggfun              0.0.6    2022-04-01 [1] CRAN (R 4.1.1)
 ggplot2          * 3.3.6    2022-05-03 [1] CRAN (R 4.1.1)
 ggplotify          0.1.0    2021-09-02 [1] CRAN (R 4.1.1)
 ggraph             2.0.6    2022-08-08 [1] CRAN (R 4.1.2)
 ggrepel            0.9.1    2021-01-15 [1] CRAN (R 4.1.1)
 ggtree             3.2.1    2021-11-16 [1] Bioconductor
 glue               1.6.2    2022-02-24 [1] CRAN (R 4.1.1)
 GO.db              3.14.0   2022-05-09 [1] Bioconductor
 googledrive        2.0.0    2021-07-08 [1] CRAN (R 4.1.1)
 googlesheets4      1.0.1    2022-08-13 [1] CRAN (R 4.1.2)
 GOSemSim           2.20.0   2021-10-26 [1] Bioconductor
 graphlayouts       0.8.1    2022-08-11 [1] CRAN (R 4.1.2)
 gridExtra          2.3      2017-09-09 [1] CRAN (R 4.1.1)
 gridGraphics       0.5-1    2020-12-13 [1] CRAN (R 4.1.1)
 gtable             0.3.0    2019-03-25 [1] CRAN (R 4.1.1)
 haven              2.5.1    2022-08-22 [1] CRAN (R 4.1.2)
 hms                1.1.2    2022-08-19 [1] CRAN (R 4.1.2)
 httr               1.4.4    2022-08-17 [1] CRAN (R 4.1.2)
 igraph             1.3.1    2022-04-20 [1] CRAN (R 4.1.1)
 IRanges          * 2.28.0   2021-10-26 [1] Bioconductor
 jsonlite           1.8.0    2022-02-22 [1] CRAN (R 4.1.1)
 KEGGREST           1.34.0   2021-10-26 [1] Bioconductor
 labeling           0.4.2    2020-10-20 [1] CRAN (R 4.1.1)
 lattice            0.20-45  2021-09-22 [1] CRAN (R 4.1.1)
 lazyeval           0.2.2    2019-03-15 [1] CRAN (R 4.1.1)
 lifecycle          1.0.1    2021-09-24 [1] CRAN (R 4.1.1)
 lubridate          1.8.0    2021-10-07 [1] CRAN (R 4.1.1)
 magrittr         * 2.0.3    2022-03-30 [1] CRAN (R 4.1.1)
 MASS               7.3-58.1 2022-08-03 [1] CRAN (R 4.1.2)
 Matrix             1.4-1    2022-03-23 [1] CRAN (R 4.1.1)
 memoise            2.0.1    2021-11-26 [1] CRAN (R 4.1.1)
 modelr             0.1.9    2022-08-19 [1] CRAN (R 4.1.2)
 munsell            0.5.0    2018-06-12 [1] CRAN (R 4.1.1)
 nlme               3.1-159  2022-08-09 [1] CRAN (R 4.1.2)
 org.Hs.eg.db     * 3.14.0   2022-05-09 [1] Bioconductor
 patchwork          1.1.2    2022-08-19 [1] CRAN (R 4.1.2)
 pillar             1.8.1    2022-08-19 [1] CRAN (R 4.1.2)
 pkgbuild           1.3.1    2021-12-20 [1] CRAN (R 4.1.1)
 pkgconfig          2.0.3    2019-09-22 [1] CRAN (R 4.1.1)
 pkgload            1.2.4    2021-11-30 [1] CRAN (R 4.1.1)
 plyr               1.8.7    2022-03-24 [1] CRAN (R 4.1.1)
 png                0.1-7    2013-12-03 [1] CRAN (R 4.1.1)
 polyclip           1.10-0   2019-03-14 [1] CRAN (R 4.1.1)
 prettyunits        1.1.1    2020-01-24 [1] CRAN (R 4.1.1)
 processx           3.7.0    2022-07-07 [1] CRAN (R 4.1.2)
 ps                 1.7.1    2022-06-18 [1] CRAN (R 4.1.2)
 purrr            * 0.3.4    2020-04-17 [1] CRAN (R 4.1.1)
 qvalue             2.26.0   2021-10-26 [1] Bioconductor
 R6                 2.5.1    2021-08-19 [1] CRAN (R 4.1.1)
 RColorBrewer       1.1-3    2022-04-03 [1] CRAN (R 4.1.1)
 Rcpp               1.0.8.3  2022-03-17 [1] CRAN (R 4.1.1)
 RCurl              1.98-1.8 2022-07-30 [1] CRAN (R 4.1.2)
 readr            * 2.1.2    2022-01-30 [1] CRAN (R 4.1.1)
 readxl             1.4.1    2022-08-17 [1] CRAN (R 4.1.2)
 remotes            2.4.2    2021-11-30 [1] CRAN (R 4.1.1)
 reprex             2.0.2    2022-08-17 [1] CRAN (R 4.1.2)
 reshape2           1.4.4    2020-04-09 [1] CRAN (R 4.1.1)
 rlang              1.0.2    2022-03-04 [1] CRAN (R 4.1.1)
 rprojroot          2.0.3    2022-04-02 [1] CRAN (R 4.1.1)
 RSQLite            2.2.16   2022-08-17 [1] CRAN (R 4.1.2)
 rvest              1.0.3    2022-08-19 [1] CRAN (R 4.1.2)
 S4Vectors        * 0.30.1   2021-09-26 [1] Bioconductor
 scales             1.2.1    2022-08-20 [1] CRAN (R 4.1.2)
 scatterpie         0.1.7    2021-08-20 [1] CRAN (R 4.1.1)
 sessioninfo        1.2.2    2021-12-06 [1] CRAN (R 4.1.1)
 shadowtext         0.1.2    2022-04-22 [1] CRAN (R 4.1.1)
 stringi            1.7.8    2022-07-11 [1] CRAN (R 4.1.2)
 stringr          * 1.4.1    2022-08-20 [1] CRAN (R 4.1.2)
 testthat           3.1.4    2022-04-26 [1] CRAN (R 4.1.1)
 tibble           * 3.1.7    2022-05-03 [1] CRAN (R 4.1.1)
 tidygraph          1.2.2    2022-08-22 [1] CRAN (R 4.1.2)
 tidyr            * 1.2.0    2022-02-01 [1] CRAN (R 4.1.1)
 tidyselect         1.1.2    2022-02-21 [1] CRAN (R 4.1.1)
 tidytree           0.4.0    2022-08-13 [1] CRAN (R 4.1.2)
 tidyverse        * 1.3.2    2022-07-18 [1] CRAN (R 4.1.2)
 treeio             1.18.1   2021-11-14 [1] Bioconductor
 tweenr             2.0.1    2022-08-22 [1] CRAN (R 4.1.2)
 tzdb               0.3.0    2022-03-28 [1] CRAN (R 4.1.1)
 usethis          * 2.1.6    2022-05-25 [1] CRAN (R 4.1.2)
 utf8               1.2.2    2021-07-24 [1] CRAN (R 4.1.1)
 vctrs              0.4.1    2022-04-13 [1] CRAN (R 4.1.1)
 viridis            0.6.2    2021-10-13 [1] CRAN (R 4.1.1)
 viridisLite        0.4.1    2022-08-22 [1] CRAN (R 4.1.2)
 withr              2.5.0    2022-03-03 [1] CRAN (R 4.1.1)
 xml2               1.3.3    2021-11-30 [1] CRAN (R 4.1.1)
 XVector            0.34.0   2021-10-26 [1] Bioconductor
 yulab.utils        0.0.5    2022-06-30 [1] CRAN (R 4.1.2)
 zlibbioc           1.40.0   2021-10-26 [1] Bioconductor

egeulgen commented 1 year ago

Can you debug by calling debugonce(term_gene_heatmap), then running term_gene_heatmap() and stepping through it, and tell me on which line the error occurs? This might help us pinpoint the issue.
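
If stepping through interactively is not practical on the server (for example, when the pipeline runs as a batch script), a non-interactive alternative is to record the call stack at the moment the error is raised. The sketch below uses only base R; `toy_heatmap` is a made-up stand-in for the failing function and `capture_error_calls` is a hypothetical helper, so neither is part of pathfindR.

```r
# Made-up stand-in for a function that fails inside factor()
toy_heatmap <- function(terms) {
  factor(terms, levels = terms)  # errors when `terms` contains duplicates
}

# Hypothetical helper: evaluate `expr`, and on error return the message
# together with the call stack captured where the error was raised
capture_error_calls <- function(expr) {
  calls <- NULL
  msg <- NULL
  tryCatch(
    withCallingHandlers(
      expr,
      error = function(e) calls <<- sys.calls()
    ),
    error = function(e) msg <<- conditionMessage(e)
  )
  list(message = msg, calls = calls)
}

res <- capture_error_calls(toy_heatmap(c("a", "b", "a")))

# Name of the function at the head of each recorded call
fn_names <- vapply(res$calls, function(cl) deparse(cl[[1]])[1], character(1))
print(res$message)
print(fn_names)
```

Wrapping the real `term_gene_heatmap(...)` call in `capture_error_calls()` and saving `res` with `saveRDS()` would give a record of the failing call chain that can be inspected later or attached to the issue.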