gobics / uproc

Tools for ultra-fast protein sequence classification.
http://uproc.gobics.de/
GNU Lesser General Public License v3.0

Incomplete Pfam family accessions in UProC version of Pfam-27 #15

Open sjaenick opened 9 years ago

sjaenick commented 9 years ago

The UProC version of Pfam-27 on ftp://projects.gobics.de/uproc/db/ uses incomplete accession numbers for protein families, e.g. PF00045 instead of PF00045.14. This is unfortunate, because the shortened accession is not enough to fetch the complete model (and its description) from the original HMM database with hmmfetch:

[sjaenick@fozzie:~ ]$ hmmfetch /vol/biodb/pfam27/Pfam-A.hmm PF00045 |grep DESC

Error: HMM PF00045 not found in SSI index for file /vol/biodb/pfam27/Pfam-A.hmm

[sjaenick@fozzie:~ ]$ hmmfetch /vol/biodb/pfam27/Pfam-A.hmm PF00045.14 |grep DESC

DESC Hemopexin

Pfam 28 has been available since May 2015, so it would be nice to have a UProC version of Pfam-28 with full accession numbers.
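
In the meantime, one possible workaround is to recover the versioned accession from the ACC lines of the original Pfam-A.hmm file and pass that to hmmfetch. The following is only a minimal sketch: the function name build_accession_map is hypothetical, the sample lines are an abbreviated illustration rather than a real file excerpt, and it assumes each model record carries a line of the form "ACC   PF00045.14".

```python
import re
from typing import Dict, Iterable

# Matches an accession line of an HMM record, e.g. "ACC   PF00045.14".
ACC_LINE = re.compile(r"^ACC\s+(\S+)")


def build_accession_map(lines: Iterable[str]) -> Dict[str, str]:
    """Map unversioned accessions (PF00045) to versioned ones (PF00045.14).

    Hypothetical helper: it only looks at ACC lines and ignores everything else.
    """
    mapping: Dict[str, str] = {}
    for line in lines:
        match = ACC_LINE.match(line)
        if match:
            full = match.group(1)              # e.g. "PF00045.14"
            base = full.split(".", 1)[0]       # e.g. "PF00045"
            mapping[base] = full
    return mapping


# Abbreviated, illustrative input; a real run would iterate over the file instead:
#   with open("/vol/biodb/pfam27/Pfam-A.hmm") as handle:
#       accessions = build_accession_map(handle)
sample = [
    "ACC   PF00045.14",
    "DESC  Hemopexin",
]
accessions = build_accession_map(sample)
print(accessions["PF00045"])  # prints "PF00045.14"
```

The resulting full accession can then be used as the key for hmmfetch, as in the second command above. This does not replace a UProC database with complete accessions, since the lookup still requires the matching Pfam release to be available locally.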