raeslab / omixer-rpm

A Reference Pathways Mapper for turning metagenomic functional profiles into pathway/module profiles.

Calculation of module abundance when coverages of the module are the same in different ways of calculation #8

Closed BangzhuoTongUU closed 1 year ago

BangzhuoTongUU commented 1 year ago

Hi, thank you so much for developing and maintaining this tool. I have a question regarding the calculation of module abundance. Say there is a module M that includes two steps for a complete process. In the definition style of the GBM database, it might be defined as:

///
M	Some useful module
K123	NOG456
K789
///

So the abundance of M can be calculated from either one of:

  1. Average(K123, K789)
  2. Average(NOG456, K789)

depending on whichever calculation has the higher coverage.

My question is:

Look forward to your reply! Thanks!

Best,

Ben

omixer commented 1 year ago

Hi Ben,

For question 1, if the coverage is equal, the pathway with the highest abundance is selected, as explained in the last section of the description https://github.com/raeslab/omixer-rpm#description.

For question 2, Omixer-RPM would not know about this. This will generate 2 duplicated paths with the same coverage and abundance. Omixer-RPM will select one of them randomly if they qualify as best output. In this case, the ortholog abundance won't be counted twice, if this is your question.
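The selection rule described in these two answers can be sketched roughly as follows. This is a minimal illustration, not Omixer-RPM's actual code: the function names (`enumerate_paths`, `score_path`, `best_path`), the abundance values, and the choice to average only over the orthologs that are present are all assumptions made for the example. It uses the mean, as in the question. Ties that remain after comparing coverage and abundance are resolved here by taking the first path, whereas the tool is described above as picking one at random.

```python
from itertools import product
from statistics import mean


def enumerate_paths(steps):
    """Return every path through a module: one ortholog chosen per step."""
    return [list(path) for path in product(*steps)]


def score_path(path, abundances):
    """Return (coverage, abundance) for a single path.

    Assumption: coverage is the fraction of steps whose ortholog is observed,
    and abundance is the mean over the observed orthologs only.
    """
    present = [abundances[o] for o in path if abundances.get(o, 0) > 0]
    coverage = len(present) / len(path)
    abundance = mean(present) if present else 0.0
    return coverage, abundance


def best_path(steps, abundances):
    """Pick the path with the highest coverage; break ties by abundance."""
    scored = [(score_path(p, abundances), p) for p in enumerate_paths(steps)]
    # Tuples compare element by element: coverage first, then abundance.
    (coverage, abundance), path = max(scored, key=lambda item: item[0])
    return path, coverage, abundance


# Hypothetical module M: step 1 is K123 or NOG456, step 2 is K789.
module_m = [["K123", "NOG456"], ["K789"]]
# Made-up abundances for illustration only.
abundances = {"K123": 4.0, "NOG456": 10.0, "K789": 6.0}

# Both paths have coverage 1.0, so the one with the higher abundance wins.
print(best_path(module_m, abundances))

# Duplicate alternatives yield two identical paths; whichever is chosen,
# each path contains K123 only once, so its abundance is not double counted.
duplicated = [["K123", "K123"], ["K789"]]
print(best_path(duplicated, abundances))
```

With these made-up values, both paths through M are fully covered, so the comparison falls through to abundance and the NOG456 path is selected.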

Please do not hesitate to ask if you have any further questions!

Best regards, Youssef