lgatto / rpx

R Interface to the ProteomeXchange Repository
http://lgatto.github.io/rpx/

Problem to use pxget function #4

Closed kelgoncalves closed 5 years ago

kelgoncalves commented 6 years ago

Hi,

I am having the following problem when trying to follow the RforProteomics vignette:

Downloading the mzTab data

mztab <- pxget(px1, pxfiles(px1)[5]) # also tried with the file name, but it doesn't make any difference
Downloading 1 file
trying URL 'ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2012/03/PXD000001/PXD000001_mztab.txt'
Error in download.file(urls[i], toget[i], ...) :
  cannot open URL 'ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2012/03/PXD000001/PXD000001_mztab.txt'
In addition: Warning message:
In download.file(urls[i], toget[i], ...) :
  InternetOpenUrl failed: 'The login request was denied'

sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] C

attached base packages:
[1] stats4 parallel stats graphics grDevices utils datasets methods base

other attached packages:
[1] msdata_0.20.0 reshape2_1.4.3 ggplot2_3.0.0 RColorBrewer_1.1-2
[5] MSGFgui_1.14.0 xlsx_0.6.1 MSGFplus_1.14.0 rTANDEM_1.20.0
[9] data.table_1.11.4 Rdisop_1.40.0 RcppClassic_0.9.11 GO.db_3.6.0
[13] org.Hs.eg.db_3.6.0 AnnotationDbi_1.42.1 BRAIN_1.26.0 lattice_0.20-35
[17] Biostrings_2.48.0 XVector_0.20.0 IRanges_2.14.10 S4Vectors_0.18.3
[21] PolynomF_1.0-2 hpar_1.22.2 rols_2.8.1 IPPD_1.28.0
[25] bitops_1.0-6 digest_0.6.15 XML_3.98-1.12 Matrix_1.2-14
[29] MASS_7.3-50 MALDIquantForeign_0.11.1 MALDIquant_1.18 rpx_1.16.0
[33] mzID_1.18.0 MSnbase_2.6.1 ProtGenerics_1.12.0 BiocParallel_1.14.2
[37] Biobase_2.40.0 BiocGenerics_0.26.0 mzR_2.14.0 Rcpp_0.12.17
[41] BiocInstaller_1.30.0 DT_0.4

loaded via a namespace (and not attached):
[1] bit64_0.9-7 doParallel_1.0.11 progress_1.2.0 httr_1.3.1
[5] rprojroot_1.3-2 backports_1.1.2 tools_3.5.1 R6_2.2.2
[9] affyio_1.50.0 DBI_1.0.0 lazyeval_0.2.1 colorspace_1.3-2
[13] withr_2.1.2 prettyunits_1.0.2 curl_3.2 bit_1.1-14
[17] compiler_3.5.1 preprocessCore_1.42.0 readBrukerFlexData_1.8.5 xml2_1.2.0
[21] labeling_0.3 scales_0.5.0 affy_1.58.0 stringr_1.3.1
[25] rmarkdown_1.10 base64enc_0.1-3 pkgconfig_2.0.1 htmltools_0.3.6
[29] limma_3.36.2 htmlwidgets_1.2 rlang_0.2.1 RSQLite_2.1.1
[33] impute_1.54.0 shiny_1.1.0 bindr_0.1.1 jsonlite_1.5
[37] RCurl_1.95-4.11 magrittr_1.5 munsell_0.5.0 vsn_3.48.1
[41] stringi_1.1.7 yaml_2.1.19 zlibbioc_1.26.0 plyr_1.8.4
[45] shinyFiles_0.7.0 readMzXmlData_2.8.1 grid_3.5.1 blob_1.1.1
[49] promises_1.0.1 crayon_1.3.4 xlsxjars_0.6.1 hms_0.4.2
[53] knitr_1.20 pillar_1.3.0 codetools_0.2-15 evaluate_0.11
[57] pcaMethods_1.72.0 httpuv_1.4.4.2 foreach_1.4.4 gtable_0.2.0
[61] assertthat_0.2.0 mime_0.5 xtable_1.8-2 later_0.7.3
[65] tibble_1.4.2 rJava_0.9-10 iterators_1.0.10 memoise_1.1.0

lgatto commented 6 years ago

Could you try again, please? It might be an intermittent issue with the server at the EBI. I tried just now and it worked:

> library(rpx)
> ?PXDataset
> px1 <- PXDataset("PXD000001")
> pxfiles(px1)
 [1] "F063721.dat"                                                         
 [2] "F063721.dat-mztab.txt"                                               
 [3] "PRIDE_Exp_Complete_Ac_22134.xml.gz"                                  
 [4] "PRIDE_Exp_mzData_Ac_22134.xml.gz"                                    
 [5] "PXD000001_mztab.txt"                                                 
 [6] "README.txt"                                                          
 [7] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML" 
 [8] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzXML"
 [9] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01.mzXML"         
[10] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01.raw"           
[11] "erwinia_carotovora.fasta"                                            
[12] "generated"                                                           
> mztab <- pxget(px1, pxfiles(px1)[5])
Downloading 1 file
/home/lg390/tmp/PXD000001_mztab.txt already present.
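
If the failure really is intermittent, one way to check is to retry the download a few times before giving up. The sketch below is only an illustration: `with_retry` is a hypothetical helper, not part of rpx, and the attempt count and wait time are arbitrary assumptions.

```r
# Hypothetical helper (not part of rpx): call `f()` up to `times` times,
# pausing `wait` seconds after each failed attempt except the last.
with_retry <- function(f, times = 3, wait = 5) {
  stopifnot(times >= 1)
  for (attempt in seq_len(times)) {
    # Capture an error as a condition object instead of aborting.
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message("Attempt ", attempt, " failed: ", conditionMessage(result))
    if (attempt < times) Sys.sleep(wait)
  }
  stop("All ", times, " attempts failed; last error: ",
       conditionMessage(result))
}

# Usage with the objects from this thread (requires network access):
# mztab <- with_retry(function() pxget(px1, pxfiles(px1)[5]))
```

If every attempt fails with the same message, the cause is probably not intermittent.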

mamut343 commented 5 years ago

Hi lgatto,

I have the same problem as kelgoncalves. I can download the file myself by pasting the FTP address into the Chrome browser, but the same download fails with Internet Download Manager (IDM). I wonder whether the EBI server blocks requests from software such as IDM.

Please let me know. Cheers

lgatto commented 5 years ago

Hi @mamut343

I can't reproduce this:

> library("MSnbase")
> library("rpx")
> px1 <- PXDataset("PXD000001")
> pxfiles(px1)
 [1] "F063721.dat"                                                         
 [2] "F063721.dat-mztab.txt"                                               
 [3] "PRIDE_Exp_Complete_Ac_22134.xml.gz"                                  
 [4] "PRIDE_Exp_mzData_Ac_22134.xml.gz"                                    
 [5] "PXD000001_mztab.txt"                                                 
 [6] "README.txt"                                                          
 [7] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML" 
 [8] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzXML"
 [9] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01.mzXML"         
[10] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01.raw"           
[11] "erwinia_carotovora.fasta"                                            
[12] "generated"                                                           
> (f <- pxfiles(px1)[7])
[1] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML"
> pxget(px1, f)
Downloading 1 file
trying URL 'ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2012/03/PXD000001/TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML'
Content type 'unknown' length 450032788 bytes (429.2 MB)
==================================================
> readMSData(f, mode = "onDisk")
MSn experiment data ("OnDiskMSnExp")
Object size in memory: 3.02 Mb
- - - Spectra data - - -
 MS level(s): 1 2 
 Number of spectra: 7534 
 MSn retention times: 0:0 - 60:2 minutes
- - - Processing information - - -
Data loaded [Mon Apr 15 14:19:41 2019] 
 MSnbase version: 2.9.3 
- - - Meta data  - - -
phenoData
  rowNames:
    TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML
  varLabels: sampleNames
  varMetadata: labelDescription
Loaded from:
  TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML 
protocolData: none
featureData
  featureNames: F1.S0001 F1.S0002 ... F1.S7534 (7534 total)
  fvarLabels: fileIdx spIdx ... spectrum (30 total)
  fvarMetadata: labelDescription
experimentData: use 'experimentData(object)'

Could you try again, paste the full code that leads to the error, and provide the output of sessionInfo()?

mamut343 commented 5 years ago

Hi @lgatto

I pasted exactly the R code you provided. Here is the result:

> library("MSnbase")
> library("rpx")
> px1 <- PXDataset("PXD000001")
> pxfiles(px1)
 [1] "F063721.dat"                                                         
 [2] "F063721.dat-mztab.txt"                                               
 [3] "PRIDE_Exp_Complete_Ac_22134.xml.gz"                                  
 [4] "PRIDE_Exp_mzData_Ac_22134.xml.gz"                                    
 [5] "PXD000001_mztab.txt"                                                 
 [6] "README.txt"                                                          
 [7] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML" 
 [8] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzXML"
 [9] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01.mzXML"         
[10] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01.raw"           
[11] "erwinia_carotovora.fasta"                                            
[12] "generated"                                                           
> (f <- pxfiles(px1)[7])
[1] "TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML"
> pxget(px1, f)
Downloading 1 file
trying URL 'ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2012/03/PXD000001/TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML'
Error in download.file(urls[i], toget[i], ...) : 
  cannot open URL 'ftp://ftp.pride.ebi.ac.uk/pride/data/archive/2012/03/PXD000001/TMT_Erwinia_1uLSike_Top10HCD_isol2_45stepped_60min_01-20141210.mzML'
In addition: Warning message:
In download.file(urls[i], toget[i], ...) :
  InternetOpenUrl failed: 'The login request was denied'

And the sessionInfo() output:

> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18362)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods  
[9] base     

other attached packages:
 [1] rTANDEM_1.22.1      data.table_1.12.2   XML_3.98-1.19       MSnID_1.16.1       
 [5] shiny_1.3.1         MSGFgui_1.16.1      xlsx_0.6.1          rpx_1.18.1         
 [9] MSGFplus_1.16.1     MSnbase_2.8.3       ProtGenerics_1.14.0 S4Vectors_0.20.1   
[13] mzR_2.16.2          Rcpp_1.0.1          Biobase_2.42.0      BiocGenerics_0.28.0
[17] mzID_1.20.1        

loaded via a namespace (and not attached):
 [1] vsn_3.50.0            shinyFiles_0.7.2      jsonlite_1.6         
 [4] foreach_1.4.4         R.utils_2.8.0         assertthat_0.2.1     
 [7] BiocManager_1.30.4    affy_1.60.0           xlsxjars_0.6.1       
[10] impute_1.56.0         pillar_1.3.1          lattice_0.20-38      
[13] glue_1.3.1            limma_3.38.3          digest_0.6.18        
[16] promises_1.0.1        colorspace_1.4-1      htmltools_0.3.6      
[19] httpuv_1.5.1          preprocessCore_1.44.0 Matrix_1.2-17        
[22] R.oo_1.22.0           plyr_1.8.4            MALDIquant_1.19.2    
[25] pkgconfig_2.0.2       zlibbioc_1.28.0       purrr_0.3.2          
[28] xtable_1.8-3          scales_1.0.0          affyio_1.52.0        
[31] later_0.8.0           BiocParallel_1.16.6   tibble_2.1.1         
[34] IRanges_2.16.0        ggplot2_3.1.1         lazyeval_0.2.2       
[37] magrittr_1.5          crayon_1.3.4          mime_0.6             
[40] R.methodsS3_1.7.1     fs_1.2.7              ncdf4_1.16.1         
[43] R.cache_0.13.0        doParallel_1.0.14     MASS_7.3-51.3        
[46] xml2_1.2.0            tools_3.5.0           stringr_1.4.0        
[49] munsell_0.5.0         pcaMethods_1.74.0     compiler_3.5.0       
[52] rlang_0.3.4           grid_3.5.0            RCurl_1.95-4.12      
[55] iterators_1.0.10      rstudioapi_0.10       bitops_1.0-6         
[58] gtable_0.3.0          codetools_0.2-15      curl_3.3             
[61] reshape2_1.4.3        R6_2.4.0              dplyr_0.8.0.1        
[64] stringi_1.4.3         rJava_0.9-11          tidyselect_0.2.5 

lgatto commented 5 years ago

The issue doesn't lie with the EBI server but with our different OSes.

After reproducing the error on Windows, I was able to get this to work by passing a different download method through to download.file, like this:

pxget(px1, f, method = "libcurl")

This worked for me because I have

> capabilities("libcurl")
libcurl
   TRUE

which might not be the case on all Windows installations.

Another one that would be useful to test is method = "wininet".

See ?download.file for details.
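
The capability check and the method argument can be combined so that the same script runs on machines with and without libcurl support. This is a minimal sketch: `pick_download_method` and `pxget_with_method` are hypothetical names, not rpx functions, and falling back to "auto" (the download.file default) is an assumption about what is reasonable when libcurl is missing.

```r
# Hypothetical helper (not part of rpx): prefer libcurl when this R build
# supports it, otherwise let download.file() choose with "auto".
pick_download_method <- function() {
  if (capabilities("libcurl")[["libcurl"]]) "libcurl" else "auto"
}

# Hypothetical wrapper: pxget() passes extra arguments on to
# download.file(), as in the call shown above.
pxget_with_method <- function(px, files) {
  rpx::pxget(px, files, method = pick_download_method())
}

# Usage with the objects from this thread (requires rpx and network access):
# pxget_with_method(px1, f)
```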

mamut343 commented 5 years ago

Hi @lgatto

I can confirm that adding method = "libcurl" works. I also tried method = "wininet", but it does not work in my case. One last note: I also updated base R from 3.5.0 to 3.5.3.

Thanks for the quick reply. Cheers
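
For anyone who would rather not add the method argument to every call, base R also reads the "download.file.method" option when download.file is called without an explicit method. The sketch below sets it for the session. It assumes that pxget leaves the method unset unless one is passed, which the error trace above suggests but which has not been checked against every rpx version.

```r
# Set libcurl as the session-wide download method, keeping the previous
# value so it can be restored afterwards.
old <- options(download.file.method = "libcurl")
getOption("download.file.method")

# Calls that do not pass `method` explicitly should now use libcurl, e.g.
# pxget(px1, f)   # requires rpx and network access

# Restore the previous setting.
options(old)
```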