ebi-pf-team / interproscan

Genome-scale protein function classification
Apache License 2.0

Results differ between web-interface and interproscan.sh #349

Closed nschan closed 9 months ago

nschan commented 10 months ago

Hello,

I noticed that, for some sequences, the results from the web interface differ from the results of a local run on a FASTA file. For example, this sequence:

>sequence1
MGSAMSLSCSKRKATSQDVDSESCKRRKICSTNDAENCIFIPDESSWSLCANRVISVAAVALTNFRFQQDNQESNSSSLSLPSPATSVSRNWKHDVFPSFHGADVRRTFLSHIMESFRRKGIDTFIDNNIERSKSIGPELKKAIKGSKIAIVLLSRKYASSSWCLDELAEIMKCREVLGQIVMTIFYEVEPTDIKKQTGEFGKAFTKTCRGKTKEHIERWRNALEDVATIAGYHSHKWRNEADMIEKIATDVSNMLNSCTPSRDFDGLVGMRAHMNMMEHLLRLDLDEVRIIGIWGPPGIGKTTIARFLLNQVSDRFQLSAIMVNIKGCYPRPCFDEYSAQLQLQNQMLSQMINHKDIMISHLGVAQERLRDKKVFLVLDEVDQLGQLDALAKETRWFGPGSRIIITTEDLGVLKAHGINHVYKVGYPSNDEAFQIFCMNAFGQKQPHEGFDEIAREVMALAGELPLGLKVLGSALRGKSKPEWERTLPRLRTSLDGKIGSIIQFSYDALCDEDKYLFLYIACLFNGESTTKVKELLGKFLDVRQGLHVLAQKSLISFHEEISCKQIVQVLLLNKFSHVRHTKRNNSQIIRMHTLLEQFGRETSRKQFVHHGYRKHQLLVGERDICEVLDDDTTDNRRFIGINLDLYKNEEELNISEKALERIHDFQFVKINDVFTHQPERQKLEDLIYHSPRIRSLKWFPYRNICLPSTFNPEFLVELDMSCSKLRKLWEGTKQLRNLKWMDLSNSRYLKELPNLSTATNLEELKLRNCSSLVELPSSIEKLTSLQILDLRDCSSLVKLPPSINANNVQGLSLTNCSRVVKLPAIENVTNLHQLKLQNCSSLIELPLSIGTANNLWKLDIRGCSSLVKLPSSIGDMTNLKEFDLSNCSNLVELPSSIGNLQKLFMLRMRGCSKLETLPTNINLISLRILDLTDCSQLKSFPEISTHISELRLKGTAIKEVPLSITSWSRLAVYEMSYFESLKEFPHALDIITYLLLVSEDIQEVPPWVKRMSRLRVLTLNNCNNLVSLPQLPDSLDHIYADNCKSLERLDCCFNNPEIRLYFPKCFKLNQEARDLIMHTSTRKYAMLPSIQVPACFNHRATSGDSLKIKLKESSLPTTLRFKACIMLVKVNEEMRDDEMWPSVLIAIRVKQNDLKVLCTASIYPVLTEHIYTFELEVEEVTSTELVFEFTPFLKSNWKIGECGILQRETRSLRRSSSPDLSPESSRAFSLSHSPLLSLCLMDWLMTRFLLMGFRCVSSCDHCL

is annotated by InterProScan on the web interface (default settings) with the following Pfam accessions:

PF01582 
PF07725
PF00931
PF20160

This looks correct and is essentially what I expect.

However, when I run the same sequence through interproscan.sh on the command line (with -f TSV -app Pfam) as part of a larger FASTA file, it is annotated only with PF00004.

This is a problem for me because I need complete annotations for the genes, and it is unclear to me why the two results differ. How can I get interproscan.sh to give me the same output as the web interface?

Additional information: I installed interproscan@5.66-98.0 via Spack.

nschan commented 9 months ago

This issue was resolved by downloading the full InterProScan data and adding it to the Spack installation. Apparently, the data that ships with the Spack package is incomplete.
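
For anyone who wants to check whether a local installation reproduces the web-interface result, here is a minimal sketch that collects the Pfam accessions per protein from InterProScan TSV output and compares them with the accessions listed above. It assumes the standard TSV column order (protein ID in column 1, analysis name in column 4, signature accession in column 5). The function names `pfam_hits` and `compare` are illustrative and not part of InterProScan, and the example row uses placeholder values, not real output.

```python
from collections import defaultdict

# Pfam accessions reported by the web interface for sequence1 (from the post above).
WEB_PFAMS = {"PF01582", "PF07725", "PF00931", "PF20160"}


def pfam_hits(tsv_text):
    """Map each protein ID to the set of Pfam accessions found in TSV text.

    Assumes the standard InterProScan TSV layout: column 1 is the protein ID,
    column 4 the analysis name, column 5 the signature accession.
    """
    hits = defaultdict(set)
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 5:
            continue  # skip blank or truncated lines
        protein_id, analysis, accession = fields[0], fields[3], fields[4]
        if analysis == "Pfam":
            hits[protein_id].add(accession)
    return dict(hits)


def compare(local, expected):
    """Return accessions missing from, and unexpected in, the local result."""
    return {"missing": expected - local, "unexpected": local - expected}


# Placeholder row for illustration only (MD5, length, coordinates are made up).
example_tsv = "sequence1\tmd5-placeholder\t100\tPfam\tPF00004\tdesc\t1\t50\t1e-5\tT\t01-01-2024\n"

local = pfam_hits(example_tsv).get("sequence1", set())
diff = compare(local, WEB_PFAMS)
print(sorted(diff["missing"]), sorted(diff["unexpected"]))
```

In practice, `example_tsv` would be replaced by the contents of the TSV file written by interproscan.sh. A non-empty "missing" set for a sequence that the web interface annotates fully suggests that the local run differs from the web run, for example because of incomplete member database data as in this case.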