raeslab / omixer-rpm

A Reference Pathways Mapper for turning metagenomic functional profiles into pathway/module profiles.

Missing module IDs #7

Open AlfonsEdbom opened 1 year ago

AlfonsEdbom commented 1 year ago

Hi!

I have mapped my sequencing reads against the IGC database (nucleotide sequences) and wanted to run Omixer-RPM using the GBM database. However, I noticed that some "genes" present in the GBM database are not found in the IGC. These include "genes" that are a required "step" in at least one GBM module.

How should I go about finding/creating a database to map my reads to that contains all relevant information for all modules present in the GBM-database?

Below is a list of "genes" that are required "steps" in a pathway but are not present in the IGC 9.9M "IGC annotation and occurrence frequency summary table":

Kind regards, Alfons
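
For readers who want to reproduce this kind of check, here is a minimal sketch. It assumes the module definitions have already been parsed into a dictionary mapping each module to a list of steps, where every step is a set of alternative annotation IDs. Parsing the actual GBM file is not shown, and the function name, module names, and IDs below are hypothetical placeholders rather than real GBM entries.

```python
from typing import Dict, Iterable, List, Set


def missing_required_ids(
    module_steps: Dict[str, List[Set[str]]],
    available_ids: Iterable[str],
) -> Dict[str, List[List[str]]]:
    """Return, per module, the steps for which no alternative ID is available."""
    available = set(available_ids)
    report: Dict[str, List[List[str]]] = {}
    for module, steps in module_steps.items():
        # A step is uncovered when none of its alternatives occurs in the database.
        uncovered = [sorted(alts) for alts in steps if not (set(alts) & available)]
        if uncovered:
            report[module] = uncovered
    return report


# Hypothetical toy data: module names and IDs are placeholders.
modules = {
    "M_example_1": [{"K00001", "K00002"}, {"NOG00001"}],
    "M_example_2": [{"K00003"}],
}
available = {"K00001", "K00003"}

print(missing_required_ids(modules, available))
```

A step only counts as missing when none of its alternatives is present, so a module is reported only if the reference database cannot cover one of its steps at all.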

MireiaVallescolomer commented 1 year ago

Hi Alfons,

This is indeed because there are no genes with these annotations in the 10M IGC. A possible alternative with a more up-to-date database is to use the KO annotations you can get from HUMAnN3 (https://github.com/biobakery/biobakery/wiki/humann3) and then run omixer-rpm on those to compute GBM coverage and abundance.

Cheers,

Mireia
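
As a rough illustration of preparing a KO-level HUMAnN table for module mapping, the sketch below keeps only community-level rows. It assumes the usual HUMAnN table layout: a header line starting with `#`, species-stratified rows containing `|` in the feature name, and `UNMAPPED` / `UNGROUPED` placeholder rows. The function name and toy values are hypothetical, and whether the result needs further adjustment before it is passed to omixer-rpm is not covered here.

```python
from typing import Iterable, Iterator

PLACEHOLDER_ROWS = {"UNMAPPED", "UNGROUPED"}


def unstratified_rows(lines: Iterable[str]) -> Iterator[str]:
    """Yield the header and community-level rows of a HUMAnN-style table."""
    for line in lines:
        if line.startswith("#"):
            # Keep the header line as-is.
            yield line
            continue
        feature = line.split("\t", 1)[0].strip()
        # Skip per-species rows and rows that carry no functional annotation.
        if "|" in feature or feature in PLACEHOLDER_ROWS:
            continue
        yield line


# Hypothetical toy table; values and taxon names are placeholders.
table = [
    "# Gene Family\tsample_A",
    "UNMAPPED\t10.0",
    "UNGROUPED\t5.0",
    "K00001\t3.0",
    "K00001|g__Example.s__Example_species\t3.0",
]

for row in unstratified_rows(table):
    print(row)
```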

AlfonsEdbom commented 1 year ago

Hi!

Thank you for your response! I am now able to get the missing KO annotations. However, I still cannot find the missing NOG annotations. I tried translating the HUMAnN3 output into eggNOG annotations with humann_regroup_table, using the latest version of their database for mapping UniRef90 to eggNOGs (https://github.com/biobakery/humann/blob/master/humann/data/misc/map_eggnog_uniref90.txt.gz), but none of the missing annotations are found in this database. Do you know of a different version of the database that contains these eggNOG annotations, or of a way to translate the missing eggNOGs into another annotation (like KEGG or UniRef90)?

Kind regards, Alfons Edbom
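
A check like this can be scripted. The sketch below assumes the mapping file is a gzipped, tab-separated text file in which the first column of each line is the group ID (here an eggNOG ID) and the remaining columns are its UniRef90 members. The function name and all IDs in the demo are hypothetical placeholders, and the demo writes its own toy file instead of reading the real mapping.

```python
import gzip
import os
import tempfile
from typing import Iterable, Set


def ids_in_mapping(path: str, wanted: Iterable[str]) -> Set[str]:
    """Return the subset of `wanted` that appears as a group ID in the mapping file."""
    wanted_set = set(wanted)
    found: Set[str] = set()
    with gzip.open(path, "rt") as handle:
        for line in handle:
            # Assumption: the group ID is the first tab-separated field.
            group = line.split("\t", 1)[0].strip()
            if group in wanted_set:
                found.add(group)
    return found


# Demo on a toy mapping file; all IDs are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "toy_mapping.txt.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write("COG0001\tUniRef90_A\tUniRef90_B\n")
        out.write("COG0002\tUniRef90_C\n")
    wanted = {"COG0001", "NOG00001"}
    found = ids_in_mapping(demo_path, wanted)

print("found:", sorted(found))
print("missing:", sorted(wanted - found))
```

IDs reported as missing never occur as a group in the file, which is consistent with an ID-scheme or version mismatch but does not by itself prove one.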

MireiaVallescolomer commented 1 year ago

Hi Alfons,

This is probably because of different eggNOG versions: we used version 3.0. Translation to UniRef90 IDs is definitely possible; eggNOG annotations were only used when no suitable KO was found. We'll update you when a newer release of the GBM database is available.

Best,

Mireia