egeulgen / pathfindR

pathfindR: Enrichment Analysis Utilizing Active Subnetworks
I am having an issue with pin_name_path #157

DrJoshVandenbrink commented 1 year ago

When trying to load a custom PIN file, I receive the following error:

"The second column of the PIN file must all be "pp""


Steps to reproduce the behavior:

Download the PIN String file

URL <- ""
path2file <- file.path(tempdir(check = TRUE), "STRING.txt.gz")
download.file(URL, path2file)

Loading the string file

ath_string_df <- read.table(path2file, header = TRUE)

ath_string_df <- ath_string_df[ath_string_df$combined_score >= 400, ]

Removing the excess around the gene names

ath_string_pin <- data.frame(
  Interactor_A = sub("^3702\\.", "", ath_string_df$protein1),
  Interactor_B = sub("^3702\\.", "", ath_string_df$protein2)
)

ath_string_pin <- data.frame(
  Interactor_A = sub("\\..$", "", ath_string_pin$Interactor_A),
  Interactor_B = sub("\\..$", "", ath_string_pin$Interactor_B)
)
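
As a small check of the two substitutions above, the sketch below runs them on two made-up STRING-style identifiers (the IDs and their format are illustrative assumptions, not taken from the downloaded file). Note that inside an R string literal the backslash has to be doubled, so a literal dot is written as `"\\."`.

```r
# Illustrative STRING-style protein IDs (assumed format: <taxon>.<TAIR locus>.<isoform>)
ids <- c("3702.AT1G01010.1", "3702.AT5G67640.2")

# Drop the "3702." taxon prefix; "\\." matches a literal dot
no_prefix <- sub("^3702\\.", "", ids)

# Drop a trailing dot plus exactly one character (e.g. ".1")
tair_ids <- sub("\\..$", "", no_prefix)

print(tair_ids)
```

The second pattern only strips a one-character suffix. If suffixes with two or more digits can occur, `"\\.[0-9]+$"` would be a more forgiving alternative.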

Getting the Gene Symbols

library(org.At.tair.db) # provides org.At.tairSYMBOL
Kegg <- org.At.tairSYMBOL
mapped_genes <- mappedkeys(Kegg)

symbols <- as.data.frame(Kegg[mapped_genes])

Replacing TAIR IDs with Symbols

ath_string_pin$Interactor_A <- symbols$symbol[match(ath_string_pin$Interactor_A, symbols$gene_id)]
ath_string_pin$Interactor_B <- symbols$symbol[match(ath_string_pin$Interactor_B, symbols$gene_id)]
ath_string_pin <- ath_string_pin[!is.na(ath_string_pin$Interactor_A) & !is.na(ath_string_pin$Interactor_B), ]
ath_string_pin <- ath_string_pin[ath_string_pin$Interactor_A != "" & ath_string_pin$Interactor_B != "", ]
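
The lookup above relies on `match()` returning `NA` for IDs that have no symbol. A toy sketch with a made-up mapping table (the names `symbols_toy` and `pin_toy` and all of their contents are hypothetical) shows how rows with unmapped IDs can then be dropped:

```r
# Hypothetical mapping table with the same column names used above
symbols_toy <- data.frame(
  gene_id = c("AT1G01010", "AT1G01020"),
  symbol = c("SYM1", "SYM2"),
  stringsAsFactors = FALSE
)

# Toy interactions; "AT9G99999" is deliberately absent from the mapping table
pin_toy <- data.frame(
  Interactor_A = c("AT1G01010", "AT9G99999"),
  Interactor_B = c("AT1G01020", "AT1G01010"),
  stringsAsFactors = FALSE
)

# match() gives the row index in the mapping table, or NA when the ID is missing
pin_toy$Interactor_A <- symbols_toy$symbol[match(pin_toy$Interactor_A, symbols_toy$gene_id)]
pin_toy$Interactor_B <- symbols_toy$symbol[match(pin_toy$Interactor_B, symbols_toy$gene_id)]

# Keep only rows where both interactors were mapped
pin_toy <- pin_toy[!is.na(pin_toy$Interactor_A) & !is.na(pin_toy$Interactor_B), ]

print(pin_toy)
```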

self_intr_cond <- ath_string_pin$Interactor_A == ath_string_pin$Interactor_B
ath_string_pin <- ath_string_pin[!self_intr_cond, ]

ath_string_pin <- unique(t(apply(ath_string_pin, 1, sort))) # this will return a matrix object
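
To illustrate what the row-wise sort achieves, here is a toy sketch with made-up symbols: sorting each row puts both orientations of an undirected interaction into the same order, so `unique()` can collapse them.

```r
# Toy interactions: rows 1 and 2 are the same pair in opposite orientation
pin_toy <- data.frame(
  Interactor_A = c("B", "A", "C"),
  Interactor_B = c("A", "B", "D"),
  stringsAsFactors = FALSE
)

# Sort within each row, transpose back to one interaction per row, drop duplicates
dedup <- unique(t(apply(pin_toy, 1, sort)))
dedup <- unname(dedup) # remove the leftover dimnames carried over by sort()

print(dedup)
```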

Adding the "pp" in the center column

ath_string_pin <- data.frame(A = ath_string_pin[, 1], pp = "pp", B = ath_string_pin[, 2])

Saving the SIF file

path2SIF <- file.path(tempdir(), "PIN.sif")
write.table(ath_string_pin,
            file = path2SIF,
            col.names = FALSE,
            row.names = FALSE,
            sep = "\t",
            quote = FALSE)

path2SIF <- normalizePath(path2SIF)

Attachment: Error_Screenshot.pdf

Running PathfindR

output_df <- run_pathfindR(input = Ler,
                           convert2alias = FALSE,
                           gene_sets = "Custom",
                           custom_genes = ath_kegg_genes,
                           custom_descriptions = ath_kegg_descriptions,
                           pin_name_path = "/tmp/RtmpqdjAO7/PIN.sif")

Expected behavior: I expect pathfindR to run. Instead I get the "pp" error, even though every value in the center column of both my SIF file and the data.frame is "pp".
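
One way to double-check the file from R is to read it back and test the middle column directly. The sketch below does this on a toy PIN with made-up symbols (`pin_toy` and `toy_path` are hypothetical names). It mirrors what the error message describes and is not pathfindR's internal check; `quote = ""` is a choice made here so that symbols containing quote characters are read literally.

```r
# Toy PIN with made-up symbols, written the same way as the real one
pin_toy <- data.frame(A = c("GENE1", "GENE2"), pp = "pp", B = c("GENE3", "GENE4"))
toy_path <- file.path(tempdir(), "toy_PIN.sif")
write.table(pin_toy, file = toy_path,
            col.names = FALSE, row.names = FALSE,
            sep = "\t", quote = FALSE)

# Read the file back as plain tab-separated text
check <- read.delim(toy_path, header = FALSE, quote = "",
                    stringsAsFactors = FALSE)

has_three_cols <- ncol(check) == 3
all_pp <- all(check[, 2] == "pp")

print(c(has_three_cols = has_three_cols, all_pp = all_pp))
```

Running the same two checks on the real file (replacing `toy_path` with the path to `PIN.sif`) would show whether the problem lies in the file itself or in how it is parsed.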

Desktop (please complete the following information):

R Session Information:
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/


attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] rTRM_1.36.0 igraph_1.4.1 KEGGREST_1.38.0 org.At.tair.db_3.16.0 AnnotationDbi_1.60.0 IRanges_2.32.0
[7] S4Vectors_0.36.1 Biobase_2.58.0 BiocGenerics_0.44.0 readxl_1.4.2 lubridate_1.9.2 forcats_1.0.0
[13] stringr_1.5.0 dplyr_1.1.0 purrr_1.0.1 readr_2.1.4 tidyr_1.3.0 tibble_3.1.8
[19] ggplot2_3.4.1 tidyverse_2.0.0 pathfindR_1.6.4.9000 pathfindR.data_1.1.3

loaded via a namespace (and not attached):
[1] bitops_1.0-7 bit64_4.0.5 doParallel_1.0.17 httr_1.4.4 GenomeInfoDb_1.34.9 tools_4.2.2
[7] utf8_1.2.3 R6_2.5.1 DBI_1.1.3 colorspace_2.1-0 withr_2.5.0 tidyselect_1.2.0
[13] gridExtra_2.3 curl_5.0.0 bit_4.0.5 compiler_4.2.2 cli_3.6.0 scales_1.2.1
[19] digest_0.6.31 rmarkdown_2.20 XVector_0.38.0 pkgconfig_2.0.3 htmltools_0.5.4 fastmap_1.1.0
[25] rlang_1.0.6 rstudioapi_0.14 RSQLite_2.3.0 farver_2.1.1 generics_0.1.3 RCurl_1.98-1.10
[31] magrittr_2.0.3 GenomeInfoDbData_1.2.9 Rcpp_1.0.10 munsell_0.5.0 fansi_1.0.4 viridis_0.6.2
[37] lifecycle_1.0.3 stringi_1.7.12 ggraph_2.1.0 MASS_7.3-58.2 zlibbioc_1.44.0 grid_4.2.2
[43] blob_1.2.3 parallel_4.2.2 ggrepel_0.9.3 crayon_1.5.2 graphlayouts_0.8.4 Biostrings_2.66.0
[49] hms_1.1.2 knitr_1.42 pillar_1.8.1 codetools_0.2-19 glue_1.6.2 evaluate_0.20
[55] png_0.1-8 vctrs_0.5.2 tzdb_0.3.0 tweenr_2.0.2 foreach_1.5.2 cellranger_1.1.0
[61] gtable_0.3.1 polyclip_1.10-4 cachem_1.0.6 xfun_0.37 ggforce_0.4.1 tidygraph_1.2.3
[67] viridisLite_0.4.1 iterators_1.0.14 memoise_2.0.1 timechange_0.2.0 ellipsis_0.3.2

openjdk 11.0.17 2022-10-18
OpenJDK Runtime Environment (build 11.0.17+8-post-Ubuntu-1ubuntu222.04)
OpenJDK 64-Bit Server VM (build 11.0.17+8-post-Ubuntu-1ubuntu222.04, mixed mode, sharing)

Thanks in advance!

egeulgen commented 1 year ago

can you kindly share the SIF file?

DrJoshVandenbrink commented 1 year ago

Thanks for getting back to me so fast!

Here is the sif file, converted to txt so it would upload. athPIN.sif.txt

egeulgen commented 1 year ago

hello again, I just pushed a fix addressing this issue. You may install the latest dev version via:

install.packages("devtools") # if you have not installed "devtools" 
DrJoshVandenbrink commented 1 year ago

Works great now! Thanks for your help!