Closed DrJoshVandenbrink closed 1 year ago
Attached the same picture 3 times above!
Can you kindly share the SIF file?
Thanks for getting back to me so fast!
Here is the sif file, converted to txt so it would upload. athPIN.sif.txt
Hello again, I just pushed a fix addressing this issue. You can install the latest development version via:
install.packages("devtools") # if you have not installed "devtools"
devtools::install_github("egeulgen/pathfindR")
Works great now! Thanks for your help!
While trying to load a custom PIN file, I am receiving the error:
The second column of the PIN file must all be "pp"
However, every value in the second column of my SIF file is "pp".
Steps to reproduce the behavior:
Download the PIN String file
URL <- "https://stringdb-static.org/download/protein.links.v11.5/3702.protein.links.v11.5.txt.gz"
path2file <- file.path(tempdir(check = TRUE), "STRING.txt.gz")
download.file(URL, path2file)
Loading the string file
ath_string_df <- read.table(path2file, header = TRUE)
ath_string_df <- ath_string_df[ath_string_df$combined_score >= 400, ]
Removing the excess around the gene names
ath_string_pin <- data.frame(Interactor_A = sub("^3702\\.", "", ath_string_df$protein1), Interactor_B = sub("^3702\\.", "", ath_string_df$protein2))
ath_string_pin <- data.frame(Interactor_A = sub("\\..$", "", ath_string_pin$Interactor_A), Interactor_B = sub("\\..$", "", ath_string_pin$Interactor_B))
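To make the two `sub()` calls above concrete, here is a small sketch on made-up identifiers in the STRING style. The IDs are illustrative only and are not taken from the downloaded file. Note that inside an R string literal the backslash has to be doubled (`"\\."`), since `"\."` on its own is rejected as an unrecognized escape.

```r
# Toy STRING-style identifiers (made up for illustration)
ids <- c("3702.AT1G01010.1", "3702.AT5G67640.2")

# Drop the leading taxonomy prefix "3702."
no_prefix <- sub("^3702\\.", "", ids)

# Drop the trailing dot plus one character (the ".1" / ".2" suffix)
tair_ids <- sub("\\..$", "", no_prefix)

print(tair_ids)
# [1] "AT1G01010" "AT5G67640"
```

The second pattern only removes a single character after the final dot, so a suffix such as `.10` would not be stripped by it.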
Getting the Gene Symbols
Kegg <- org.At.tairSYMBOL
mapped_genes <- mappedkeys(Kegg)
symbols <- as.data.frame(Kegg[mapped_genes])
Replacing TAIR IDs with Symbols
ath_string_pin$Interactor_A <- symbols$symbol[match(ath_string_pin$Interactor_A, symbols$gene_id)]
ath_string_pin$Interactor_B <- symbols$symbol[match(ath_string_pin$Interactor_B, symbols$gene_id)]
ath_string_pin <- ath_string_pin[!is.na(ath_string_pin$Interactor_A) & !is.na(ath_string_pin$Interactor_B), ]
ath_string_pin <- ath_string_pin[ath_string_pin$Interactor_A != "" & ath_string_pin$Interactor_B != "", ]
self_intr_cond <- ath_string_pin$Interactor_A == ath_string_pin$Interactor_B
ath_string_pin <- ath_string_pin[!self_intr_cond, ]
ath_string_pin <- unique(t(apply(ath_string_pin, 1, sort))) # this will return a matrix object
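As a rough illustration of what the cleanup steps above do, the sketch below runs the same operations on a tiny toy table. All IDs and symbols here are hypothetical placeholders rather than real TAIR entries, and the final `data.frame()` call shows one way the resulting matrix could be turned into the three-column layout.

```r
# Toy interaction table with made-up IDs (not real TAIR identifiers)
pin <- data.frame(
  Interactor_A = c("AT1", "AT2", "AT3", "AT1", "AT9"),
  Interactor_B = c("AT2", "AT1", "AT3", "AT4", "AT1"),
  stringsAsFactors = FALSE
)

# Toy ID-to-symbol lookup (hypothetical symbols; "AT9" has no entry)
symbols <- data.frame(
  gene_id = c("AT1", "AT2", "AT3", "AT4"),
  symbol = c("GENEA", "GENEB", "GENEC", "GENED"),
  stringsAsFactors = FALSE
)

# Replace IDs with symbols; IDs without a match become NA
pin$Interactor_A <- symbols$symbol[match(pin$Interactor_A, symbols$gene_id)]
pin$Interactor_B <- symbols$symbol[match(pin$Interactor_B, symbols$gene_id)]

# Drop unmapped rows, then self-interactions
pin <- pin[!is.na(pin$Interactor_A) & !is.na(pin$Interactor_B), ]
pin <- pin[pin$Interactor_A != pin$Interactor_B, ]

# Sort each pair so A-B and B-A collapse to one row (result is a matrix)
pin_mat <- unique(t(apply(pin, 1, sort)))

# Build the three-column SIF layout from the matrix
sif <- data.frame(A = pin_mat[, 1], pp = "pp", B = pin_mat[, 2])
print(sif)
```

Of the five toy rows, one is dropped as unmapped, one as a self-interaction, and one as a reversed duplicate, leaving two interactions.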
Adding the "pp" in the center column
ath_string_pin <- data.frame(A = ath_string_pin[, 1], pp = "pp", B = ath_string_pin[, 2])
Saving the SIF file
path2SIF <- file.path(tempdir(), "PIN.sif")
write.table(ath_string_pin, file = path2SIF, col.names = FALSE, row.names = FALSE,
Error_Screenshot.pdf
path2SIF <- normalizePath(path2SIF)
Running PathfindR
output_df <- run_pathfindR(input = Ler, convert2alias = FALSE, gene_sets = "Custom", custom_genes = ath_kegg_genes, custom_descriptions = ath_kegg_descriptions, pin_name_path = "/tmp/RtmpqdjAO7/PIN.sif")
Expected behavior
I am expecting pathfindR to run. Instead I get the "pp" error, even though all values of the center column ARE "pp" when I look at my SIF file and data.frame.
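One way to double-check a SIF file independently of pathfindR is sketched below. `inspect_sif()` is a hypothetical helper written for this illustration; it is not a pathfindR function, and the way it parses the file (tab-separated, no quote handling) is an assumption rather than a description of how the package reads PIN files. The toy data is made up. The sketch also shows how the `sep` and `quote` arguments of `write.table()` change what such a reader sees.

```r
# Hypothetical helper (not part of pathfindR): summarise a SIF file's layout
inspect_sif <- function(path, sep = "\t") {
  sif <- read.table(path, sep = sep, header = FALSE, quote = "",
                    stringsAsFactors = FALSE, comment.char = "")
  list(
    n_columns = ncol(sif),
    second_column = if (ncol(sif) >= 2) unique(sif[[2]]) else character(0)
  )
}

# Toy three-column table (made-up symbols)
toy <- data.frame(A = c("GENEA", "GENEA"), pp = "pp", B = c("GENEB", "GENED"))

# Written with write.table() defaults: space-separated, strings quoted
default_path <- file.path(tempdir(), "toy_default.sif")
write.table(toy, file = default_path, col.names = FALSE, row.names = FALSE)

# Written tab-separated and unquoted
tab_path <- file.path(tempdir(), "toy_tab.sif")
write.table(toy, file = tab_path, col.names = FALSE, row.names = FALSE,
            sep = "\t", quote = FALSE)

res_default <- inspect_sif(default_path)
res_tab <- inspect_sif(tab_path)

print(res_default$n_columns) # 1: a tab-based reader sees one column
print(res_tab$n_columns)     # 3
print(res_tab$second_column) # "pp"
```

If a check like this reports three columns and only `pp` in the second one, the file itself is probably fine and the problem is more likely on the package side.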
Screenshots
Desktop (please complete the following information):
R Session Information:
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.1 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.10.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
[6] LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods base
other attached packages:
[1] rTRM_1.36.0 igraph_1.4.1 KEGGREST_1.38.0 org.At.tair.db_3.16.0 AnnotationDbi_1.60.0 IRanges_2.32.0
[7] S4Vectors_0.36.1 Biobase_2.58.0 BiocGenerics_0.44.0 readxl_1.4.2 lubridate_1.9.2 forcats_1.0.0
[13] stringr_1.5.0 dplyr_1.1.0 purrr_1.0.1 readr_2.1.4 tidyr_1.3.0 tibble_3.1.8
[19] ggplot2_3.4.1 tidyverse_2.0.0 pathfindR_1.6.4.9000 pathfindR.data_1.1.3
loaded via a namespace (and not attached):
[1] bitops_1.0-7 bit64_4.0.5 doParallel_1.0.17 httr_1.4.4 GenomeInfoDb_1.34.9 tools_4.2.2
[7] utf8_1.2.3 R6_2.5.1 DBI_1.1.3 colorspace_2.1-0 withr_2.5.0 tidyselect_1.2.0
[13] gridExtra_2.3 curl_5.0.0 bit_4.0.5 compiler_4.2.2 cli_3.6.0 scales_1.2.1
[19] digest_0.6.31 rmarkdown_2.20 XVector_0.38.0 pkgconfig_2.0.3 htmltools_0.5.4 fastmap_1.1.0
[25] rlang_1.0.6 rstudioapi_0.14 RSQLite_2.3.0 farver_2.1.1 generics_0.1.3 RCurl_1.98-1.10
[31] magrittr_2.0.3 GenomeInfoDbData_1.2.9 Rcpp_1.0.10 munsell_0.5.0 fansi_1.0.4 viridis_0.6.2
[37] lifecycle_1.0.3 stringi_1.7.12 ggraph_2.1.0 MASS_7.3-58.2 zlibbioc_1.44.0 grid_4.2.2
[43] blob_1.2.3 parallel_4.2.2 ggrepel_0.9.3 crayon_1.5.2 graphlayouts_0.8.4 Biostrings_2.66.0
[49] hms_1.1.2 knitr_1.42 pillar_1.8.1 codetools_0.2-19 glue_1.6.2 evaluate_0.20
[55] png_0.1-8 vctrs_0.5.2 tzdb_0.3.0 tweenr_2.0.2 foreach_1.5.2 cellranger_1.1.0
[61] gtable_0.3.1 polyclip_1.10-4 cachem_1.0.6 xfun_0.37 ggforce_0.4.1 tidygraph_1.2.3
[67] viridisLite_0.4.1 iterators_1.0.14 memoise_2.0.1 timechange_0.2.0 ellipsis_0.3.2
Additional context
openjdk 11.0.17 2022-10-18
OpenJDK Runtime Environment (build 11.0.17+8-post-Ubuntu-1ubuntu222.04)
OpenJDK 64-Bit Server VM (build 11.0.17+8-post-Ubuntu-1ubuntu222.04, mixed mode, sharing)
Thanks in advance!