debbiemarkslab / EVcouplings

Evolutionary couplings from protein and RNA sequence alignments
http://evcouplings.org

SIFTS #298

Open t9lex opened 1 year ago

t9lex commented 1 year ago

I have been downloading the SIFTS dataset for two days now in order to calculate the contact matrix through multiple sequence alignment. How long does it usually take to download this dataset?

thomashopf commented 1 year ago

It can take quite a while to download this (on the order of hours), as lots of sequences need to be downloaded using the UniProt API.

Is this using the latest code from the develop branch?

t9lex commented 1 year ago

I am able to download the CSV file, but I encounter an error when trying to download the FA file. I apologize for any confusion earlier.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/luoqichao/Liyingcan/evcouplings/EVcouplings-develop/databases/sifts/shuju.py", line 13, in <module>
    s = SIFTS("../../databases/sifts/pdb_chain_uniprot_plus.csv", "../../databases/sifts/pdb_chain_uniprot_plus.fa")
  File "/home/luoqichao/miniconda3/lib/python3.9/site-packages/evcouplings/compare/sifts.py", line 355, in __init__
    self.create_sequence_file(sequence_file)
  File "/home/luoqichao/miniconda3/lib/python3.9/site-packages/evcouplings/compare/sifts.py", line 539, in create_sequence_file
    raise ResourceError(
evcouplings.utils.system.ResourceError: Could not fetch sequences for SIFTS mapping tables from UniProt since maximum number of retries after connection errors was exceeded. Retry at a later time, or call SIFTS.create_sequence_file() with a higher value for max_retries.
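
The error message suggests calling SIFTS.create_sequence_file() with a higher max_retries. A minimal sketch of that, under stated assumptions: `retry` and `build_sifts_fasta` are hypothetical helpers, not part of EVcouplings; the value 500 is arbitrary; and it is assumed that constructing `SIFTS` with only the table path loads the existing CSV without fetching sequences. The `max_retries` keyword and the output path as first argument are taken from the error message and traceback above.

```python
import time


def retry(action, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call action() until it succeeds or attempts run out.

    Waits base_delay * 2**n seconds after the n-th failure (exponential
    backoff) and re-raises the last exception once attempts are exhausted.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return action()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


def build_sifts_fasta(table_file, fasta_file, max_retries=500):
    """Hypothetical helper: create the FASTA file with a raised retry limit."""
    # Imported lazily so the retry helper above works without EVcouplings.
    from evcouplings.compare.sifts import SIFTS

    # Assumption: passing only the table path loads the mapping table
    # and does not trigger the sequence download.
    sifts = SIFTS(table_file)
    sifts.create_sequence_file(fasta_file, max_retries=max_retries)
```

One possible use is to wrap the whole call so that a failed run is attempted again after a pause, for example `retry(lambda: build_sifts_fasta("pdb_chain_uniprot_plus.csv", "pdb_chain_uniprot_plus.fa"), attempts=3, base_delay=60)`.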

t9lex commented 1 year ago

Yesterday I used the nohup command to download the FASTA data. The download started, but it stopped at 30 MB and did not proceed further.
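
To judge how far a stalled download got, one option is to count the sequence records already written to the partial FASTA file. This is only a sketch: `count_fasta_records` is a hypothetical helper, and it assumes the partial file is plain FASTA text in which every record starts with a `>` header line.

```python
def count_fasta_records(path):
    """Return the number of records (lines starting with '>') in a FASTA file."""
    count = 0
    # errors="replace" tolerates a truncated final line in a partial download.
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if line.startswith(">"):
                count += 1
    return count
```

Running it twice a few minutes apart, for example `count_fasta_records("pdb_chain_uniprot_plus.fa")`, shows whether the count is still growing or the download has really stopped.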

t9lex commented 1 year ago

I saw a comment saying that the FASTA data can also be downloaded directly from somewhere. Could you please point me to the website? I would like to try this method.