Proteomicslab57357 / UniprotR

Retrieving Information of Proteins from Uniprot
GNU General Public License v3.0

Uniprot FASTA database download failed #23

Closed: wardiam closed this issue 2 years ago

wardiam commented 2 years ago

With the update of the Uniprot interface, it is no longer possible to download databases as before.

See information here: https://www.uniprot.org/help/api_queries

Most importantly, the "direct" download can only be used for proteomes with fewer than 5 million sequences. For larger proteomes (e.g. Firmicutes, taxid = 1239, which has 21 million entries) you have to use the new paging system.
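The paging system mentioned above can be sketched roughly as follows. This is only a minimal sketch, not UniprotR's implementation: it assumes the httr package is installed and that each page of the search endpoint advertises the next page in a `Link` response header with `rel="next"`, as described on the help page linked above. The function names `next_page_url` and `download_fasta_paged` are hypothetical.

```r
# Extract the rel="next" URL from an HTTP Link header.
# Returns NA_character_ when there is no next page.
next_page_url <- function(link_header) {
  if (is.null(link_header) || is.na(link_header)) return(NA_character_)
  pattern <- '<([^>]+)>;[[:space:]]*rel="next"'
  m <- regmatches(link_header, regexec(pattern, link_header))[[1]]
  if (length(m) < 2) NA_character_ else m[2]
}

# Hypothetical helper: follow the "next" links page by page and append
# every page of FASTA text to dest_file. Requires the httr package.
download_fasta_paged <- function(first_url, dest_file) {
  con <- file(dest_file, open = "w")
  on.exit(close(con))
  url <- first_url
  while (!is.na(url)) {
    resp <- httr::GET(url)
    httr::stop_for_status(resp)
    cat(httr::content(resp, as = "text", encoding = "UTF-8"), file = con, sep = "")
    url <- next_page_url(httr::headers(resp)[["link"]])
  }
  invisible(dest_file)
}

# Example call (not run here; the query and page size are assumptions):
# download_fasta_paged(
#   "https://rest.uniprot.org/uniprotkb/search?query=taxonomy_id:1239&format=fasta&size=500",
#   "firmicutes.fasta"
# )
```

Keeping the header parsing in its own small function makes it possible to check the paging logic without touching the network.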

Could you implement the new Uniprot FASTA file download system in the next release, please?

Thank you very much. Wardiam

AliYoussef96 commented 2 years ago

Hi @wardiam,

Thank you for using our package. We are working on updates and bug fixes related to the new API of the Uniprot database. We will take this request into account in the upcoming version.

wardiam commented 2 years ago

Thank you very much @AliYoussef96,

I think we are all in the same situation. The change in the Uniprot interface and its new API has forced quite a few changes in the way we retrieve information. I hope you will be able to fix all the related bugs soon.

Could you let me know when it's all fixed, please?

Thank you very much for all your work and thanks for your UniprotR library, it is very useful for me.

Best regards, Wardiam

MohmedSoudy commented 2 years ago

Dear Dr. @wardiam, thank you for your interest in using our package. As my colleague Ali mentioned, the new update caused some technical issues. We have updated the package to resolve your problem, and a new version, UniprotR 2.2.1, is on its way to CRAN.

wardiam commented 2 years ago

Dear @MohmedSoudy,

Thank you very much for letting me know about the new release. I uninstalled the old version and tried to install the new UniprotR release, but I get this error and the package is not installed:

ERROR: dependency 'GenomicAlignments' is not available for package 'alakazam'.
* removing '/cloud/lib/x86_64-pc-linux-gnu-library/4.2/alakazam'.

I managed to install GenomicAlignments from Bioconductor, but the installation of alakazam still fails with an error.

Could you help me, please?

Thanks, Wardiam

wardiam commented 2 years ago

I have tried on another computer with Windows 10 and the same problem occurs:

* Installing 'UniprotR' package ...
** package 'UniprotR' successfully unpacked and MD5 sums checked
** using staged installation
** R
** Inst
** byte compilation and preparation of the package for lazy loading
Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]) : 
  no package named 'GenomicAlignments'.
Calls: <Anonymous> ... loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart
Interrupted execution
ERROR: lazy loading failed for package 'UniprotR'.
* removing 'C:/Users/Virgo/AppData/Local/R/win-library/4.2/UniprotR'.
Warning in install.packages :
  'UniprotR' package installation had a non-zero exit status.

Wardiam

MohmedSoudy commented 2 years ago

The issue is similar to #14. It is caused by the dependency alakazam: version 1.1.0 gets installed, while UniprotR requires alakazam version 1.0.0. This will be solved in the next version of UniprotR. For now, you can work around it by installing alakazam 1.0.0 before installing UniprotR: install.packages("https://cran.r-project.org/src/contrib/Archive/alakazam/alakazam_1.0.0.tar.gz", repos = NULL, type = "source")
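The workaround could be wrapped up roughly like this. It is a sketch under assumptions, not an official install script: the helper names are hypothetical, the Bioconductor step reflects the GenomicAlignments error reported earlier in this thread, and other dependencies of alakazam may still have to be installed by hand, because a source install with `repos = NULL` does not resolve them.

```r
# TRUE when the installed alakazam version differs from the pinned one,
# or when alakazam is not installed at all (signalled by NULL).
needs_pinned_install <- function(installed_version, required = "1.0.0") {
  is.null(installed_version) ||
    package_version(installed_version) != package_version(required)
}

# Hypothetical helper: install the Bioconductor dependency, then the archived
# alakazam 1.0.0 tarball, then UniprotR. Defined here but not called.
install_uniprotr_with_pinned_alakazam <- function() {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install("GenomicAlignments")

  # packageVersion() raises an error when the package is missing.
  installed <- tryCatch(
    as.character(utils::packageVersion("alakazam")),
    error = function(e) NULL
  )
  if (needs_pinned_install(installed)) {
    install.packages(
      "https://cran.r-project.org/src/contrib/Archive/alakazam/alakazam_1.0.0.tar.gz",
      repos = NULL, type = "source"
    )
  }
  install.packages("UniprotR")
}
```

Calling `install_uniprotr_with_pinned_alakazam()` in a fresh session would then run the three steps in order.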

wardiam commented 2 years ago

Thank you @MohmedSoudy

I have managed to install UniprotR. One more question: I am trying to download a small proteome, but I cannot get it to work. I use this code:

GetProteomeFasta("UP000032801", getwd())

And I get this error: "Internal server error. Most likely a temporary problem, but if the problem persists please contact us." Could you help me, please?

One last question: could the FASTA file be downloaded by taxon ID instead of by proteome ID? The second option would be more useful for me.

Thanks for everything. Wardiam

MohmedSoudy commented 2 years ago

Thank you @wardiam. Regarding the first problem, I tried the same code using UniprotR version 2.2.1 and it works fine, so I think it is a temporary problem. Regarding the second question, we will work on a feature to download the proteome using the taxonomy ID rather than the proteome ID, and it will be available in the next version. We hope this helps.
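Until that feature is available, the same request could be sketched directly against the UniProt REST API. This is an illustration only, not how UniprotR builds its requests: the function name `uniprot_fasta_url` is hypothetical, and the sketch assumes the `stream` endpoint of `rest.uniprot.org` with the `proteome` and `taxonomy_id` query fields.

```r
# Build a FASTA download URL for either a proteome ID or a taxonomy ID.
uniprot_fasta_url <- function(id, by = c("proteome", "taxonomy")) {
  by <- match.arg(by)
  id <- as.character(id)
  if (by == "proteome") {
    stopifnot(grepl("^UP[0-9]+$", id))   # e.g. "UP000032801"
    field <- "proteome"
  } else {
    stopifnot(grepl("^[0-9]+$", id))     # e.g. 1239
    field <- "taxonomy_id"
  }
  paste0("https://rest.uniprot.org/uniprotkb/stream?format=fasta&query=",
         field, ":", id)
}

# Example download (not run here); mode = "wb" avoids newline translation
# on Windows:
# utils::download.file(uniprot_fasta_url("UP000032801"),
#                      file.path(getwd(), "UP000032801.fasta"), mode = "wb")
```

For very large taxa such as Firmicutes, the size limit on direct downloads mentioned at the top of this issue would still apply, so a paged download would be needed instead.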

wardiam commented 2 years ago

Thank you @MohmedSoudy ,

I have tried again, but the "Internal server error" still occurs on my computer. I will try again tomorrow and get back to you. Regarding the inclusion of the taxon ID download, this is great news. I will wait for the next release.

Thanks for your help and for considering my request.

Best regards, Wardiam

MohmedSoudy commented 2 years ago

Sure. Please let us know whether the problem is resolved or the error persists.

wardiam commented 2 years ago

Good evening @MohmedSoudy ,

I have tested the proteome download feature. It works fine on my Windows 10 computer, but it does not work on my Linux server running Ubuntu 20.04.4 LTS. I have also tried it on an RStudio Cloud server (which also runs Ubuntu) and it does not work there either. Until you have checked it, at least I can use it on Windows for now ;-).

Thanks for everything. Wardiam

MohmedSoudy commented 2 years ago

Good evening @wardiam. It's great that the function is now working on Windows. We are working on solving the problem on Ubuntu.