compomics / meta-proteome-analyzer

MetaProteomeAnalyzer (MPA) software for analyzing and visualizing MS-based metaproteomics data.

LCA issues #14

Closed: FlorenceAbram closed this issue 6 years ago

FlorenceAbram commented 6 years ago

When I view my results after selecting the common taxonomy rules, there appears to be an issue when a single meta-protein has multiple UniProt hits. Sometimes the common ancestor is displayed, but more often only the strain of the last UniProt accession number in the list is displayed. Below are two examples where the wrong taxonomy is reported and one where the correct LCA is reported:

Meta-Protein No. | Meta-Protein Accession | Meta-Protein Description | Meta-Protein Taxonomy | Superkingdom | Kingdom | Phylum | Class | Order | Family | Genus | Species | Meta-Protein KO | Meta-Protein EC | Proteins

# Here is an example where, of the 5 accession numbers, only the last one is considered
266 | Meta-Protein 115 | TIG_STRZJ Trigger factor OS=Streptococcus pneumoniae (strain JJA) GN=tig PE=3 SV=1 | Streptococcus pneumoniae TIGR4 | Bacteria | Unknown | Firmicutes | Bacilli | Lactobacillales | Streptococcaceae | Streptococcus | Streptococcus pneumoniae | K03545\| | 5.2.1.8\| | C1CCG7, B9E686, Q1WU82, B1I9J5, Q97SG9

# This one appears to get the correct LCA
336 | Meta-Protein 142 | ATPA_BACCN ATP synthase subunit alpha OS=Bacillus cytotoxicus (strain DSM 22905 / CIP 110041 / 391-98 / NVH 391-98) GN=atpA PE=3 SV=1 | Bacillus | Bacteria | Unknown | Firmicutes | Bacilli | Bacillales | Bacillaceae | Bacillus | Unknown | \|K02111\| | 3.6.3.14\| | A7GV58, Q814W0

# This one reports an intermediate LCA that only takes into account the last 2 hits, even though the first hit is also a cyanobacterium
337 | Meta-Protein 143 | RPOC1_PROM4 DNA-directed RNA polymerase subunit gamma OS=Prochlorococcus marinus (strain MIT 9211) GN=rpoC1 PE=3 SV=1 | Cyanobacteria | Bacteria | Unknown | Cyanobacteria | Unknown | Unknown | Unknown | Unknown | Unknown | K03046\| | 2.7.7.6\| | A9BCH5, B0VLZ5, Q3M5C9, B7JWQ8

Is there any way to change this behaviour? At the moment my only option is to manually check the LCA for every meta-protein with multiple UniProt accession numbers.

Thank you very much for your help with this.
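For reference, the behaviour expected above (one common ancestor across all hits, not the taxonomy of the last hit) can be sketched as a longest-shared-prefix computation over root-to-leaf lineages. This is only an illustration and not MPA's actual implementation: the function names are hypothetical, and because the per-accession lineages are not given in this thread, the two lineages are taken from the rank columns of rows 266 and 336 above as stand-ins (the Kingdom column is skipped because it is "Unknown" in both).

```python
from typing import List, Sequence


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest prefix shared by all root-to-leaf lineages.

    Hypothetical helper for illustration; not part of MPA.
    """
    if not lineages:
        return []
    common: List[str] = []
    # zip stops at the shortest lineage, so a lineage that ends early
    # (e.g. at genus level) caps the depth of the result.
    for names_at_rank in zip(*lineages):
        if len(set(names_at_rank)) != 1:
            break
        common.append(names_at_rank[0])
    return common


def last_hit_only(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Mimic the reported symptom: only the last hit's lineage is used."""
    return list(lineages[-1]) if lineages else []


# Stand-in lineages copied from the rank columns of rows 266 and 336.
streptococcus = ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
                 "Streptococcaceae", "Streptococcus", "Streptococcus pneumoniae"]
bacillus = ["Bacteria", "Firmicutes", "Bacilli", "Bacillales",
            "Bacillaceae", "Bacillus"]

hits = [bacillus, streptococcus]
lca = lowest_common_ancestor(hits)
print("LCA lineage:", " > ".join(lca))
print("Last hit only:", " > ".join(last_hit_only(hits)))
```

With these two stand-in lineages the shared prefix ends at the class level (Bacilli), whereas the last-hit-only variant returns the full species-level lineage of whichever accession happens to come last in the list.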

thilus commented 6 years ago

Thanks for reporting this issue about the wrong taxonomy for the meta-proteins. We will check that; I remember that a similar problem has been reported before.

FlorenceAbram commented 6 years ago

Dear Thilo, This has been fixed now, thank you very much. Best wishes, Florence

On 21 Dec 2017, at 14:09, Thilo Muth <notifications@github.com> wrote:

Thanks for reporting this issue in detail! We will check that, I remember that a similar problem has been reported before.

Best regards!


thilus commented 6 years ago

Perfect, thank you, Florence.

All the best, Thilo