Closed minhasbushra closed 2 years ago
Hi Bushra,
do you mean that the command
wget https://v100.orthodb.org/download/odb10_vertebrata_fasta.tar.gz
is not working, or that it is not working when you modify 10 to 10.1?
I've checked the difference between OrthoDB 10 and 10.1 before, and my conclusion was that the eukaryotic protein sequences are identical between v10 and v10.1 (prokaryotic proteins did change). Some of the eukaryotic proteins are formatted differently in v10.1, but this should not affect ProtHint. For this reason, it should be fine to keep using the v10 link. Obviously, I could be wrong in my analysis, so please double-check.
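For reference, turning the downloaded v10 archive into a single protein FASTA could look roughly like the sketch below. The helper name collect_orthodb_proteins and the output file name are made up for illustration, and the assumption that the FASTA files sit under a "Rawdata" directory inside the archive should be checked by listing the archive first (tar -tzf).

```shell
# Hypothetical helper: unpack an already-downloaded OrthoDB archive and
# concatenate every file found under a "Rawdata" directory into one FASTA.
# The "Rawdata" layout is an assumption; verify it with: tar -tzf ARCHIVE
collect_orthodb_proteins() {
    archive=$1
    output=$2
    workdir=$(mktemp -d) || return 1
    # Extract into a scratch directory so nothing in the current dir is touched.
    tar -xzf "$archive" -C "$workdir" || return 1
    # Concatenate all regular files below any Rawdata directory.
    find "$workdir" -type f -path '*/Rawdata/*' -exec cat {} + > "$output"
    rm -rf "$workdir"
}

# Example (output file name is a placeholder):
#   wget https://v100.orthodb.org/download/odb10_vertebrata_fasta.tar.gz
#   collect_orthodb_proteins odb10_vertebrata_fasta.tar.gz proteins.fasta
```

The resulting file would then be the protein input for ProtHint, assuming the archive layout matches.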
Best, Tomas
Thanks for your reply. The command that I wrote worked, but it didn't work with the new version 10.1. Also, I have a question: I am working on fish, so I am taking "Vertebrata" from OrthoDB. Would it be good to add protein sequences from a closely related fish species along with the OrthoDB vertebrate proteins? (The closely related species is also in the OrthoDB list.)
Thanks
Would it be good to add protein sequences from a closely related fish species along with the OrthoDB vertebrate proteins?
Yes, that would make sense, if they are not already covered by OrthoDB.
I just saw a paper that used BRAKER2 and a similar strategy of protein preparation for a fish annotation (with good results):
For protein evidence, manually annotated and reviewed protein records from UniProtKB/Swiss-Prot (UniProt Consortium, 2021) as of January 11, 2021 (563,972 sequences) in addition to the proteomes of the false clownfish (A. ocellaris: 48,668), zebrafish (Danio rerio: 88,631), spiny chromis damselfish (Acanthochromis polyacanthus: 36,648), Nile tilapia (Oreochromis niloticus: 63,760), Japanese rice fish (Oryzias latipes: 47,623), rainbow fish (Poecilia reticulata: 45,692), bicolor damselfish (Stegastes partitus: 31,760), tiger puffer (Takifugu rubripes: 49,529), and Atlantic salmon (Salmo salar: 112,302) from the NCBI protein database (https://www.ncbi.nlm.nih.gov/protein) were used.
https://www.biorxiv.org/content/10.1101/2022.01.16.476524v1
They used UniProtKB/Swiss-Prot as the large protein source, but OrthoDB should work just as well, if not better.
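A minimal sketch of combining the OrthoDB proteins with an extra proteome, assuming that "already covered" can be approximated by exact sequence identity (a simplification; it will not catch near-identical isoforms). The function name merge_unique_proteins and the file names in the example are hypothetical.

```shell
# Merge FASTA files, keeping only the first record for each distinct sequence.
# Wrapped sequence lines are joined and residues uppercased before comparing,
# so output sequences are written on a single line each.
merge_unique_proteins() {
    awk '
        # Print the pending record unless its sequence was already seen.
        function emit() {
            if (hdr != "" && !(seq in seen)) {
                seen[seq] = 1
                print hdr
                print seq
            }
        }
        /^>/ { emit(); hdr = $0; seq = ""; next }
        { gsub(/[ \t\r]/, ""); seq = seq toupper($0) }
        END { emit() }
    ' "$@"
}

# Example (file names are placeholders):
#   merge_unique_proteins orthodb_vertebrata.fasta close_relative.fasta > proteins.fasta
```

Listing the OrthoDB file first means its headers are the ones kept when a sequence appears in both inputs.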
Tomas
Thanks !!
Hi,
With the newer OrthoDB version 10.1, the command "wget https://v100.orthodb.org/download/odb10_vertebrata_fasta.tar.gz" is not working for preparing the protein database. I have tried a modified link, but it is not working either. Any suggestions?
Thanks, Bushra