vitkl / PItools

Protein interaction data tools, retrieving data from IntAct and most other databases
https://vitkl.github.io/PItools/
Apache License 2.0

Error when trying to download interaction #2

Open Rohit-Satyam opened 6 months ago

Rohit-Satyam commented 6 months ago

Dear @vitkl

Thanks for developing PItools. I am a first-time user of PItools and was trying to fetch the interactions for Plasmodium from IntAct when I encountered the following error:

> pf = fullInteractome(taxid = 36329, database = "IntActFTP", # 36329 - pf taxid
+                         clean = TRUE,
+                         protein_only = TRUE,
+                         directory = "./data_files/", # NULL to keep data files inside R library - default
+                         releaseORdate = NULL) # useful to keep track of the release date e.g. 2019Mar23, but for the first download set to NULL 
... looking for the date of the latest IntAct release ...
... looking for the date of the latest IntAct release ...
... dowloading from IntAct ftp ...
Error in compressFile.default(filename = filename, ..., ext = ext, FUN = FUN) : 
  No such file: ./data_files/IntActRelease_2024Feb16/intact.txt
In addition: Warning messages:
1: In dir.create(dir_last_release) :
  cannot create dir './data_files/IntActRelease_2024Feb16', reason 'No such file or directory'
2: In download.file(url, ...) :
  URL ftp://ftp.ebi.ac.uk/pub/databases/intact/current/psimitab/intact.zip: cannot open destfile './data_files/IntActRelease_2024Feb16/intact2024Feb16.zip', reason 'No such file or directory'
3: In download.file(url, ...) : download had nonzero exit status
4: In unzip(file_name.zip, exdir = dir) :
  error 1 in extracting from zip file
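
Warnings 1 and 2 above both report 'No such file or directory' for paths under ./data_files/, which suggests that this directory did not exist in the working directory when the call was made. This is an inference from the log, not something confirmed in the thread. A minimal base R sketch under that assumption creates the directory before it is passed to fullInteractome; the helper name ensure_dir is hypothetical and not part of PItools, and a temporary directory stands in for "./data_files/".

```r
# Hypothetical helper (not part of PItools): make sure a download
# directory exists before handing it to a function that writes into it.
ensure_dir <- function(path) {
  if (!dir.exists(path)) {
    # recursive = TRUE also creates any missing parent directories
    dir.create(path, recursive = TRUE)
  }
  # Report whether the directory is available now
  dir.exists(path)
}

# Stand-in for "./data_files/" so the sketch does not touch the project folder
data_dir <- file.path(tempdir(), "data_files")
ensure_dir(data_dir)
```

With the directory in place, the same call with directory = "./data_files/" could be retried. Whether it then runs through depends on the later UniProt step as well.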

The above error disappears when I remove directory = "./data_files/" but then another error surfaces:

> pf = fullInteractome(taxid = 36329, database = "IntActFTP", # 36329 - pf taxid
+                         clean = TRUE,
+                         protein_only = TRUE,
+                         releaseORdate = NULL) # useful to keep track of the release date e.g. 2019Mar23, but for the first download set to NULL 
... looking for the date of the latest IntAct release ...
... looking for the date of the latest IntAct release ...
... dowloading from IntAct ftp ...
trying URL 'ftp://ftp.ebi.ac.uk/pub/databases/intact/current/psimitab/intact.zip'
Content type 'unknown' length 618183230 bytes (589.5 MB)
==================================================
|--------------------------------------------------|
|==================================================|
trying URL 'https://www.uniprot.org/taxonomy/?query=ancestor:36329&format=tab'
Error in download.file(url, method = method, ...) : 
  cannot open URL 'https://www.uniprot.org/taxonomy/?query=ancestor:36329&format=tab'
In addition: Warning messages:
1: In fread(file_name, header = T, stringsAsFactors = F) :
  Found and resolved improper quoting out-of-sample. First healed line 25417: <<uniprotkb:P16054    uniprotkb:Q05769    intact:EBI-298451|ensembl:ENSMUSP00000094873.3|ensembl:ENSMUSP00000094874.3 intact:EBI-298933|uniprotkb:Q543K3|ensembl:ENSMUSP00000035065.8 psi-mi:kpce_mouse(display_long)|uniprotkb:Prkce(gene name)|psi-mi:Prkce(display_short)|uniprotkb:Pkce(gene name synonym)|uniprotkb:Pkcea(gene name synonym)|uniprotkb:nPKC-epsilon(gene name synonym)   psi-mi:pgh2_mouse(display_long)|uniprotkb:Ptgs2(gene name)|psi-mi:Ptgs2(display_short)|uniprotkb:Cox-2(gene name synonym)|unipro>>. If the fields are not quoted (e.g. field separator does not appear within any field), try quote="" to avoid this warning.
2: In download.file(url, method = method, ...) :
  cannot open URL 'https://rest.uniprot.org/taxonomy/query=ancestor:36329&format=tab': HTTP status was '400 Bad Request'
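
In this second run the download of intact.zip itself completed, and the failure comes from the later UniProt taxonomy query. As a stopgap, the unpacked MI-TAB file could be filtered by taxon directly. The sketch below is base R and rests on two assumptions: that the file follows the PSI-MI TAB layout, in which columns 10 and 11 hold the taxid fields for interactors A and B, and that exact matching is acceptable. Unlike the ancestor:36329 query, which presumably also collects descendant taxa, it matches only the given taxid. The helper filter_mitab_by_taxid is hypothetical and not part of PItools, the toy file contains made-up values, and quote = "" follows the suggestion in the fread warning.

```r
# Hypothetical helper, not part of PItools: keep MI-TAB rows in which both
# interactors carry the given NCBI taxonomy ID.
filter_mitab_by_taxid <- function(file, taxid) {
  mitab <- read.delim(file, header = TRUE, sep = "\t", quote = "",
                      comment.char = "", check.names = FALSE,
                      stringsAsFactors = FALSE)
  # Taxid fields look like "taxid:36329(plafa)|taxid:36329(...)".
  # Require "(", "|" or end of string after the number, so that
  # 36329 does not also match 363290.
  pattern <- sprintf("taxid:%d(\\(|\\||$)", as.integer(taxid))
  # Assumption: columns 10 and 11 are the taxid columns for A and B
  keep <- grepl(pattern, mitab[[10]]) & grepl(pattern, mitab[[11]])
  mitab[keep, , drop = FALSE]
}

# Toy two-row file with 11 tab-separated columns (made-up values)
toy <- data.frame(matrix("-", nrow = 2, ncol = 11), stringsAsFactors = FALSE)
names(toy) <- c("#ID(s) interactor A", "ID(s) interactor B",
                paste0("col", 3:9),
                "Taxid interactor A", "Taxid interactor B")
toy[[10]] <- c("taxid:36329(plafa)", "taxid:10090(mouse)")
toy[[11]] <- c("taxid:36329(plafa)", "taxid:10090(mouse)")

tmp <- tempfile(fileext = ".txt")
write.table(toy, tmp, sep = "\t", quote = FALSE, row.names = FALSE)

pf_rows <- filter_mitab_by_taxid(tmp, 36329)
nrow(pf_rows)
```

For the full intact.txt, which is large, data.table::fread(file, quote = "") is likely the more practical reader; the filtering step stays the same.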

I see that IntAct provides the interactions at https://www.ebi.ac.uk/intact/interactomes, but in miXML 3.0 format, which I don't know how to read in R. Any help from your side would be appreciated.

vitkl commented 6 months ago

I see. I also don’t know how to read that format in R, except that there are packages that read XML as nested lists. You would need to work out how to convert the XML into the older MI-TAB format. You could also contact EBI/IntAct support and ask them how to convert the data.
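
As a rough illustration of the "XML as nested lists" route, here is a sketch using the xml2 package. It is not a tested converter for IntAct files. The toy document, its namespace URI, and the element names (entrySet, interaction, participantList, participant, interactorRef) are assumptions modelled on the PSI-MI XML schema and should be checked against a real file. It shows both the nested-list view and an XPath pass that pulls one row per interaction.

```r
library(xml2)

# Toy document: made-up content; element names and namespace are assumptions
# modelled on PSI-MI XML and may differ from real IntAct files.
toy_xml <- '<entrySet xmlns="http://psi.hupo.org/mi/mif300">
  <entry>
    <interactionList>
      <interaction id="1">
        <participantList>
          <participant id="10"><interactorRef>101</interactorRef></participant>
          <participant id="11"><interactorRef>102</interactorRef></participant>
        </participantList>
      </interaction>
    </interactionList>
  </entry>
</entrySet>'

doc <- read_xml(toy_xml)
xml_ns_strip(doc)  # drop the default namespace so plain XPath names match

# Option 1: the whole document as a nested R list
nested <- as_list(doc)

# Option 2: XPath, one row per interaction (first two participants only)
interactions <- xml_find_all(doc, ".//interaction")
pairs <- lapply(interactions, function(node) {
  refs <- xml_text(
    xml_find_all(node, "./participantList/participant/interactorRef")
  )
  data.frame(interaction_id = xml_attr(node, "id"),
             ref_A = refs[1], ref_B = refs[2],
             stringsAsFactors = FALSE)
})
pairs <- do.call(rbind, pairs)
pairs
```

The interactorRef values are internal references, so a real conversion towards MI-TAB would still need to resolve them to protein identifiers from the interactor entries and to handle interactions with more than two participants.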

(Sent by email in reply to Rohit Satyam's report of Thu, 21 Mar 2024 at 11:31.)