Open emiliesecherre opened 4 years ago
Thanks for the reproducible bug report. It seems to be working for me. Here is what I ran. What versions of paxtoolsr and XML are you running?
> library(paxtoolsr)
Loading required package: rJava
Loading required package: XML
Consider citing this package: Luna A, et al. PaxtoolsR: pathway analysis in R using Pathway Commons. PMID: 26685306; citation("paxtoolsr")
> library(XML)
>
> uri <- "http://identifiers.org/reactome/R-HSA-1369062"
> xml <- getPc(uri, format = "BIOPAX")
>
> saveXML(xml, "del.xml")
[1] "del.xml"
>
> sessionInfo()
R version 3.6.2 (2019-12-12)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Catalina 10.15.3
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
Random number generation:
RNG: Mersenne-Twister
Normal: Inversion
Sample: Rounding
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] paxtoolsr_1.21.1 XML_3.98-1.20 rJava_0.9-11
loaded via a namespace (and not attached):
[1] igraph_1.2.4.2 Rcpp_1.0.3 rstudioapi_0.10 magrittr_1.5
[5] knitr_1.26 hms_0.5.3 rjson_0.2.20 R6_2.4.1
[9] rlang_0.4.2 plyr_1.8.5 httr_1.4.1.9000 tools_3.6.2
[13] xfun_0.11 R.oo_1.23.0 htmltools_0.4.0.9002 yaml_2.2.0
[17] digest_0.6.23 tibble_2.1.3 crayon_1.3.4 readr_1.3.1
[21] vctrs_0.2.1 R.utils_2.9.2 curl_4.3 zeallot_0.1.0
[25] evaluate_0.14 rmarkdown_2.0 compiler_3.6.2 pillar_1.4.3
[29] backports_1.1.5 R.methodsS3_1.7.1 jsonlite_1.6.9000 pkgconfig_2.0.3
I have this: `
R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)
Matrix products: default
locale:
[1] LC_COLLATE=French_France.1252 LC_CTYPE=French_France.1252 LC_MONETARY=French_France.1252
[4] LC_NUMERIC=C LC_TIME=French_France.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] dplyr_0.8.5 MPINet_1.0 mgcv_1.8-31 nlme_3.1-145
[5] BiasedUrn_1.07 lilikoi_0.1.0 pathfindR_1.4.2 stringr_1.4.0
[9] gprofiler2_0.1.8 plyr_1.8.6 paxtoolsr_1.20.0 XML_3.99-0.3
[13] rJava_0.9-12 clusterProfiler_3.14.3 rWikiPathways_1.6.1 graphite_1.32.0
loaded via a namespace (and not attached):
[1] backports_1.1.6 Hmisc_4.4-0 fastmatch_1.1-0 corrplot_0.84 igraph_1.2.5
[6] lazyeval_0.2.2 splines_3.6.3 BiocParallel_1.20.1 ggplot2_3.3.0 urltools_1.7.3
[11] digest_0.6.25 foreach_1.5.0 htmltools_0.4.0 GOSemSim_2.12.1 viridis_0.5.1
[16] GO.db_3.10.0 fansi_0.4.1 magrittr_1.5 checkmate_2.0.0 memoise_1.1.0
[21] cluster_2.1.0 doParallel_1.0.15 recipes_0.1.10 readr_1.3.1 graphlayouts_0.6.0
[26] gower_0.2.1 R.utils_2.9.2 enrichplot_1.6.1 prettyunits_1.1.1 jpeg_0.1-8.1
[31] princurve_2.1.4 colorspace_1.4-1 blob_1.2.1 rappdirs_0.3.1 ggrepel_0.8.2
[36] xfun_0.12 crayon_1.3.4 RCurl_1.98-1.1 RWeka_0.4-42 jsonlite_1.6.1
[41] graph_1.64.0 survival_3.1-12 iterators_1.0.12 glue_1.4.0 polyclip_1.10-0
[46] gtable_0.3.0 ipred_0.9-9 BiocGenerics_0.32.0 scales_1.1.0 DOSE_3.12.0
[51] infotheo_1.2.0 DBI_1.1.0 Rcpp_1.0.4 htmlTable_1.13.3 viridisLite_0.3.0
[56] progress_1.2.2 gridGraphics_0.5-0 foreign_0.8-76 bit_1.1-15.2 europepmc_0.3
[61] Formula_1.2-3 lava_1.6.7 prodlim_2019.11.13 stats4_3.6.3 htmlwidgets_1.5.1
[66] httr_1.4.1 fgsea_1.12.0 RColorBrewer_1.1-2 acepack_1.4.1 ellipsis_0.3.0
[71] pkgconfig_2.0.3 R.methodsS3_1.8.0 farver_2.0.3 nnet_7.3-13 utf8_1.1.4
[76] caret_6.0-86 ggplotify_0.0.5 tidyselect_1.0.0 rlang_0.4.5 reshape2_1.4.4
[81] AnnotationDbi_1.48.0 munsell_0.5.0 tools_3.6.3 cli_2.0.2 generics_0.0.2
[86] RSQLite_2.2.0 ggridges_0.5.2 evaluate_0.14 ModelMetrics_1.2.2.2 knitr_1.28
[91] bit64_0.9-7 tidygraph_1.1.2 caTools_1.18.0 purrr_0.3.3 ggraph_2.0.2
[96] R.oo_1.23.0 DO.db_2.9 xml2_1.3.1 compiler_3.6.3 rstudioapi_0.11
[101] png_0.1-7 plotly_4.9.2.1 curl_4.3 tibble_3.0.0 tweenr_1.0.1
[106] stringi_1.4.6 lattice_0.20-41 Matrix_1.2-18 gbm_2.1.5 RWekajars_3.9.3-2
[111] vctrs_0.2.4 pillar_1.4.3 lifecycle_0.2.0 BiocManager_1.30.10 triebeard_0.3.0
[116] data.table_1.12.8 cowplot_1.0.0 bitops_1.0-6 qvalue_2.18.0 latticeExtra_0.6-29
[121] R6_2.4.1 gridExtra_2.3 IRanges_2.20.2 codetools_0.2-16 MASS_7.3-51.5
[126] assertthat_0.2.1 rjson_0.2.20 withr_2.1.2 S4Vectors_0.24.3 parallel_3.6.3
[131] hms_0.5.3 grid_3.6.3 rpart_4.1-15 timeDate_3043.102 tidyr_1.0.2
[136] class_7.3-16 rmarkdown_2.1 rvcheck_0.1.8 pROC_1.16.2 ggforce_0.3.1
[141] base64enc_0.1-3 lubridate_1.7.8 Biobase_2.46.0
`
When I try your code, I get this: `
uri <- "http://identifiers.org/reactome/R-HSA-1369062"
xml <- getPc(uri, format = "BIOPAX")
Input is not proper UTF-8, indicate encoding !
Bytes: 0xE9 0x3C 0x2F 0x62
Erreur : 1: Input is not proper UTF-8, indicate encoding !
Bytes: 0xE9 0x3C 0x2F 0x62
saveXML(xml, "del.xml")
Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘saveXML’ for signature ‘"function"’
`
I also noticed that this error appears when I try to get BioPAX from Reactome and SMPDB; it works with Panther and Pathway Commons.
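The bytes in the error message can be decoded by hand to see what the parser tripped over. The following is a minimal sketch in base R, assuming the offending bytes are Latin-1/Windows-1252 text (an assumption, not something confirmed in this thread); the variable names are illustrative only.

```r
# Bytes reported in the parser error: 0xE9 0x3C 0x2F 0x62
bad <- rawToChar(as.raw(c(0xE9, 0x3C, 0x2F, 0x62)))

# 0xE9 followed by 0x3C is not a valid UTF-8 sequence,
# which is what the XML parser complains about
print(validUTF8(bad))    # FALSE

# Assumption: the bytes are Latin-1, where 0xE9 is an e with acute accent.
# Re-encode them as UTF-8.
fixed <- iconv(bad, from = "latin1", to = "UTF-8")
print(validUTF8(fixed))  # TRUE
print(fixed)
```

Read this way, the sequence is an "é" immediately followed by `</b`, that is, an accented character right before a closing tag.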
I updated my XML package (XML_3.99-0.3) without any new problems, and I don't think my small update to paxtoolsr would have affected XML. This is a strange error because getPc() should give you an XML package object, so saving it shouldn't be a problem. Can you run the following to see what is in the variable returned from getPc?
library(paxtoolsr)
library(XML)
library(xml2)
uri <- "http://identifiers.org/reactome/R-HSA-1369062"
xml <- getPc(uri, format = "BIOPAX")
saveXML(xml, "del.xml")
str(xml)
tmp <- XML::toString.XMLNode(xml)
writeLines(as.character(tmp), "del_xml.txt")
s <- xml2::read_xml(tmp)
writeLines(as.character(s), "del_xml2.txt")
sessionInfo()
The issue is the same: since getPc fails with an error, the xml variable is never created, so the rest of the code doesn't work either.
What about this? The package functions do various routine things to generate the link below and then read in the XML. The code below isolates the main step.
library(httr)
req <- GET('http://www.pathwaycommons.org/pc2/get?uri=http%3A%2F%2Fidentifiers.org%2Freactome%2FR-HSA-1369062&format=BIOPAX')
text <- content(req, "text")
str(text)
writeLines(text, "del.txt")
It worked! I think I understood where the issue was: the BioPAX file contained a lot of European characters (à, é, è), so I had to remove them to make the function toSif work. So I just have to change the Reactome identifier in the URL of the GET function if I want to do some analysis with another pathway?
Short answer: Okay. Yes, you can change the URL and only use the parts of paxtoolsr that you need. Ultimately, paxtoolsr functions are a set of "opinions" and "helpers" for how to best use the underlying API that you are accessing with that URL within R.
Longer answer: Can you paste the "bad" XML you get as a Gist (https://gist.github.com/)? There might be a simple solution for Windows, but I don't have easy access to a Windows machine. It might be as easy as an additional parameter I need to add to one function.
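Changing the pathway in the request URL can be sketched as below. `build_pc_get_url()` is a hypothetical helper, not part of paxtoolsr; it only reproduces the URL pattern from the `GET()` call earlier in this thread. The `encoding = "UTF-8"` argument in the commented-out fetch is an assumption about what the server sends.

```r
# Hypothetical helper (not part of paxtoolsr): build the Pathway Commons "get"
# URL for a Reactome identifier, following the URL pattern used with GET() above.
build_pc_get_url <- function(reactome_id, format = "BIOPAX") {
  uri <- paste0("http://identifiers.org/reactome/", reactome_id)
  paste0(
    "http://www.pathwaycommons.org/pc2/get?uri=",
    utils::URLencode(uri, reserved = TRUE),  # percent-encode ":" and "/"
    "&format=", format
  )
}

url <- build_pc_get_url("R-HSA-1369062")
print(url)

# Fetching then works as in the earlier snippet (needs httr and network access):
# req  <- httr::GET(url)
# text <- httr::content(req, "text", encoding = "UTF-8")  # assumed encoding
```

Only the identifier passed to the helper changes from one pathway to the next; the rest of the request stays the same.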
Hello, I would be glad to, but I'm really not familiar with how Gist works, so here is a Drive link instead: https://drive.google.com/file/d/1wOoqNXrZyZTzQ_3mI2YDQSwUO4MiZTME/view?usp=sharing . To be more precise, it was some characters in authors' names that caused the issues. Thank you for your help!
Thanks. What do you see if you run this:
Sys.getlocale('LC_CTYPE')
For me, the return value is "en_US.UTF-8".
Does this Stack Overflow post about parsing XML in R with non-Latin characters help with the original getPc command? https://stackoverflow.com/questions/38612603/encoding-issue-when-parsing-xml-in-r
Here is more information on locales: https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/locales
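A slightly fuller locale check can be sketched with base R alone. This is only an illustration of what to look at, not a fix; `l10n_info()` is a base R function, and the `codepage` element is only present on Windows.

```r
# Report the character-type locale and whether R considers it UTF-8
ctype <- Sys.getlocale("LC_CTYPE")
info <- l10n_info()

cat("LC_CTYPE:", ctype, "\n")
cat("UTF-8 locale:", info[["UTF-8"]], "\n")
cat("Latin-1 locale:", info[["Latin-1"]], "\n")

# On Windows, l10n_info() also reports the active code page (e.g. 1252)
if (!is.null(info$codepage)) cat("Code page:", info$codepage, "\n")
```

In a non-UTF-8 locale, text that is really UTF-8 may be misread unless its encoding is stated explicitly when it is read or parsed.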
I got this: `
Sys.getlocale('LC_CTYPE')
[1] "French_France.1252"`
The function "stringi::stri_conv()" didn't help, and when I try to use Sys.setlocale('LC_CTYPE', 'en_US.UTF-8') there's a warning...
If it helps, I used this: iconv(text, from = 'UTF-8', to = 'ASCII//TRANSLIT'), and it seems to fix most character issues! I also keep getting this message, though (unrelated, I think):
2020-04-15 19:12:44,932 343127 [main] INFO org.biopax.paxtools.PaxtoolsMain - toSif: not blacklisting ubiquitous molecules (no blacklist.txt found)
Is that a problem? I get SIF files even with this message. However, some BioPAX files don't produce SIF files, and I don't really understand why.
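The `iconv()` workaround mentioned above can be tried on a small string first. This is a minimal sketch with a made-up name as input; the transliterated result depends on the platform's iconv implementation (it may drop accents, substitute other characters, or return NA), so no particular output is claimed here.

```r
# Made-up example string with accented characters, written with \u escapes
# so the script itself stays pure ASCII
x <- "Andr\u00e9 M\u00fcller"

# Transliterate to ASCII; the exact result depends on the platform's iconv
ascii <- iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT")
print(ascii)

# Alternative check: keep the text as UTF-8 and only detect non-ASCII content
has_non_ascii <- any(utf8ToInt(x) > 127)
print(has_non_ascii)
```

Transliteration is lossy (the original spelling of names cannot be recovered afterwards), so it is worth keeping the untouched text alongside the converted copy.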
Hello, when trying to get the BioPAX file for the pathway R-HSA-1369062, I got this error:
`