ebi-pf-team / interproscan

Genome-scale protein function classification
Apache License 2.0

Is it possible to extract representative domains from an InterProScan run? #340

Closed by steven-pilger 9 months ago

steven-pilger commented 1 year ago

According to this tweet, representative domains are now displayed on the protein page. I ran interproscan-5.64-96.0 on a sequence that has representative domains in the EBI-hosted version of InterPro, but could not find a corresponding attribute in the output.xml file generated by my local run.

Could you point me to some documentation that explains how these representative domains are selected, and how I could extract them from an InterProScan run? Thanks!

matthiasblum commented 1 year ago

Hi @steven-pilger,

Representative domains are not yet available for InterProScan results. We'll soon publish a standalone script to extract these domains for InterProScan results.

steven-pilger commented 1 year ago

I just saw on Twitter that representative domain information is now part of the REST API responses, that's super cool! 👍🏼 Any update on the standalone script, @matthiasblum?

matthiasblum commented 1 year ago

Hi @steven-pilger,

We want representative domains to be visible in InterProScan results on the InterPro website as well, so I am thinking about selecting representative domains directly in InterProScan, but that takes more time. I hope we'll have that ready for the next release of InterPro (mid January 2024).

steven-pilger commented 9 months ago

Hi @matthiasblum, any update on the representative domain inclusion? Thanks!

matthiasblum commented 9 months ago

Hi @steven-pilger,

Representative domains are now reported in the InterProScan XML and JSON output files. We added a boolean attribute (XML) or property (JSON) named representative, which indicates whether a match is a representative domain.
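For anyone landing here later, a minimal sketch of pulling the flagged entries out of the JSON output might look like the following. It assumes only what is stated above: that the property is called representative and holds a JSON boolean. The thread does not say where in the file the property is nested, so the sketch walks the whole document rather than assuming a particular layout. The function names and the output.json filename are hypothetical and not part of InterProScan.

```python
import json
from typing import Any, Iterator, List


def iter_representative(node: Any) -> Iterator[dict]:
    """Yield every JSON object whose "representative" property is exactly true.

    Assumption: the property is named "representative" and is a JSON boolean.
    The nesting level is not assumed, so the whole tree is searched.
    """
    if isinstance(node, dict):
        if node.get("representative") is True:
            yield node
        # Keep descending: flagged objects may sit at any depth.
        for value in node.values():
            yield from iter_representative(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_representative(item)


def load_representative(path: str) -> List[dict]:
    """Read a JSON file and return all objects flagged as representative."""
    with open(path, encoding="utf-8") as handle:
        return list(iter_representative(json.load(handle)))
```

Calling load_representative("output.json") on a result file would then return the flagged objects as plain dictionaries. Which fields they carry depends on the InterProScan version that produced the file, so it is worth inspecting one before relying on specific keys.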