Adding database identifiers that do not map to an InterPro identifier

000generic commented 7 years ago

I have a large number of transcripts that are annotated by InterProScan, but only with, for example, PANTHER identifiers and no corresponding InterPro identifier, because none exists (these are termed Unintegrated Signatures; see, for example, http://www.ebi.ac.uk/interpro/signature/PTHR23301). As a result, the transcripts or peptides remain unannotated in TBro, despite having been annotated by InterProScan with a PANTHER ID.

Is it possible to include in TBro all transcript/peptide-annotating identifiers for a given database annotation run by InterProScan - or load into TBro a mapping of transcript or peptide identifiers to database/custom identifiers - and then be able to search for the database/custom identifiers under Annotation Search?

Thank you!

iimog commented 7 years ago

Thanks for pointing this out. I will have a closer look at the InterPro importer and try to find the reason why those lines are not imported. In general, the InterPro ID should not be required for import. Would you mind sharing an example line from your InterProScan result table that is not properly imported?

Since TBro version 1.1.1 it is also possible to import custom annotations for isoforms/unigenes, which are basically arbitrary key-value pairs. To use them you need a TSV file with the unigene or isoform identifier in the first column and the value in the second column. The key is passed to the importer with --annotation-type. For example, with a TSV file like this (custom.tsv):

comp001_1   XYZ
comp002_1   XXX
comp003_1   XYX

You can import with the following command:

tbro-import annotation_custom --annotation-type myid custom.tsv

You can then use Annotation Search to find custom annotations with key myid and a value such as XYZ.
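For the PANTHER case described in this issue, the two-column file could be generated from the InterProScan output itself. The following is only a rough sketch, not part of TBro. It assumes the standard InterProScan TSV layout (sequence ID in column 1, analysis name in column 4, signature accession in column 5, InterPro accession in the optional column 12, which is missing or "-" for unintegrated signatures). It also assumes the sequence IDs in the InterProScan table match the isoform/unigene identifiers loaded into TBro. The function name and the sample rows are made up for illustration.

```python
import io


def unintegrated_pairs(lines, analysis="PANTHER"):
    """Yield unique (sequence_id, signature_accession) pairs for rows of the
    given analysis that carry no InterPro accession (hypothetical helper)."""
    seen = set()
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        # Skip malformed rows and rows from other member databases.
        if len(cols) < 5 or cols[3] != analysis:
            continue
        # Column 12 is optional; treat a missing value or "-" as unintegrated.
        interpro_acc = cols[11] if len(cols) > 11 else ""
        if interpro_acc not in ("", "-"):
            continue
        pair = (cols[0], cols[4])
        if pair not in seen:  # one line per sequence/signature combination
            seen.add(pair)
            yield pair


# Made-up InterProScan rows, for illustration only.
SAMPLE = "\n".join([
    "\t".join(["comp001_1", "md5a", "250", "PANTHER", "PTHR23301", "-",
               "1", "250", "1.0E-50", "T", "01-01-2017"]),
    # Integrated signature (has an InterPro accession): skipped.
    "\t".join(["comp002_1", "md5b", "300", "PANTHER", "PTHR00001", "-",
               "5", "290", "2.0E-30", "T", "01-01-2017", "IPR000001", "example"]),
    # Different analysis: skipped.
    "\t".join(["comp003_1", "md5c", "120", "Pfam", "PF00001", "example",
               "3", "110", "1.0E-10", "T", "01-01-2017"]),
    # Second hit of the same signature on the same sequence: deduplicated.
    "\t".join(["comp001_1", "md5a", "250", "PANTHER", "PTHR23301", "-",
               "10", "200", "1.0E-20", "T", "01-01-2017"]),
]) + "\n"

pairs = list(unintegrated_pairs(io.StringIO(SAMPLE)))
custom_tsv = "".join(f"{seq_id}\t{signature}\n" for seq_id, signature in pairs)
print(custom_tsv, end="")
```

Writing custom_tsv to a file and passing that file to the annotation_custom import with a key such as --annotation-type panther should then make the PANTHER IDs searchable as custom annotations, provided the identifier assumption above holds.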

000generic commented 7 years ago

Thanks! I think the import of custom annotations should do the trick! I'll be back with an example of the InterProScan result table that is not properly imported.

TBroTeam / TBro

Adding database identifiers that do not map to an InterPro identifier #47