bg7 / BG7

bacterial genome annotation system
bg7.ohnosequences.com

Adding InterproScan annotation #20

Closed rtobes closed 11 years ago

rtobes commented 12 years ago

Currently, the InterPro motifs included in the annotation are the motifs present in the annotator proteins. It would be more precise to add motifs detected in the actual sequences of the predicted genes using InterProScan. It is an open tool closely tied to UniProt that we could incorporate into the refined annotation process. It probably has a computational cost that we would have to analyze.

eparejatobes commented 12 years ago

The current production-ready InterProScan (v4.x) is a nightmare to install and run standalone, let alone parsing its output. However, the future looks brighter, as the not-yet-released v5 looks much better in all respects.

It looks like development is quite active, and there's a beta test for a stand-alone version with a sane setup process!

I think that we should start development based on this. Opinions?

pablopareja commented 12 years ago

I'm not sure it'd be a good idea to incorporate something that is still in beta... It's all too common to run into buggy bioinformatics apps/code even when they're supposed to be fully tested already, so I don't wanna imagine what you could find when dealing with things in beta :P

eparejatobes commented 12 years ago

Again, this is an improvement; something for future releases. What I'm proposing here is with respect to future development.

rtobes commented 12 years ago

It's a future development that will be incorporated after the refinement of the final annotation has been incorporated (by means of a new BLAST against all bacterial proteins, run only for the predicted genes).

rtobes commented 11 years ago

InterProScan probably wouldn't add much functional data to the predicted genes, since InterPro motifs are extracted from UniProt proteins and all UniProt proteins are scanned with InterProScan when they enter the database. If we already annotate the genes by similarity with UniProt proteins and then infer the functions associated with their InterPro motifs, direct detection of InterPro motifs in our genes will probably not add much additional functional information.

I think that for now it is not worth it.
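
The trade-off discussed above could be checked on a small sample before committing to anything. The following is only a minimal sketch, not part of BG7: the function names, gene and protein identifiers, and motif labels (`IPR_A`, etc.) are all hypothetical placeholders, and the "detected" sets stand in for whatever a direct InterProScan run would report. It compares the motifs a gene inherits from its annotator protein with the motifs detected directly in its sequence, and reports what the direct scan would add.

```python
from typing import Dict, Set


def inherited_motifs(
    gene_to_annotator: Dict[str, str],
    annotator_motifs: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Motifs each predicted gene inherits from its annotator protein."""
    return {
        gene: set(annotator_motifs.get(protein, set()))
        for gene, protein in gene_to_annotator.items()
    }


def extra_from_direct_scan(
    inherited: Dict[str, Set[str]],
    detected: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Motifs found by a direct scan that similarity-based annotation missed."""
    return {
        gene: detected.get(gene, set()) - motifs
        for gene, motifs in inherited.items()
    }


# Hypothetical placeholder data, for illustration only.
gene_to_annotator = {"gene_1": "PROT_X", "gene_2": "PROT_Y"}
annotator_motifs = {"PROT_X": {"IPR_A", "IPR_B"}, "PROT_Y": {"IPR_C"}}
detected = {"gene_1": {"IPR_A", "IPR_B"}, "gene_2": {"IPR_C", "IPR_D"}}

inherited = inherited_motifs(gene_to_annotator, annotator_motifs)
extra = extra_from_direct_scan(inherited, detected)

for gene in sorted(extra):
    print(gene, sorted(extra[gene]))
```

If the extra sets turn out to be mostly empty on a representative sample, that would support the view that a direct scan is not worth its computational cost for now.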