merenlab / anvio

An analysis and visualization platform for 'omics data
http://merenlab.org/software/anvio
GNU General Public License v3.0

[FEATURE REQUEST] metabolic enrichment for enzymes #2268

Closed wsmets closed 1 month ago

wsmets commented 1 month ago

The need

I think anvi-compute-metabolic-enrichment is really great! I just think KEGG's definitions of module functions have their limitations. When I searched through my KEGG modules, I didn't find all the functions I was expecting. For example, when I dug deeper into the production of the C30 carotenoid 4,4′-diaponeurosporene, I found that this pathway is not included in any KEGG module, even though the relevant enzymes (EC:2.5.1.96, EC:1.3.8.2) are in my "anvi-estimate-metabolism --output-modes hits" output file.

The solution

There might be another way around this, but the best one I can think of is to allow the creation of an "anvi-estimate-metabolism --output-modes hits"-like output file that lists enzymes and can be used as input to anvi-compute-metabolic-enrichment. However, I am unsure of the programming and statistical implications. Perhaps enzymes could be filtered by a minimum presence threshold across samples/groups?
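
To make the filtering idea concrete, here is a minimal sketch of what I mean by a minimum presence threshold. It is not anvi'o code: the function name `filter_enzymes_by_presence`, the `(genome, enzyme)` pair layout, and the toy genome, group, and enzyme names are all assumptions for illustration, and a real hits-mode file would first need to be parsed into this shape.

```python
from collections import defaultdict


def filter_enzymes_by_presence(hits, groups, min_fraction=0.5):
    """Keep enzymes found in at least `min_fraction` of the genomes
    of at least one group.

    hits   -- iterable of (genome, enzyme) pairs (hypothetical layout)
    groups -- dict mapping genome name -> group name
    Returns a dict: enzyme -> {group: fraction of genomes with the enzyme}.
    """
    # Collect the set of genomes in which each enzyme was observed.
    genomes_with = defaultdict(set)
    for genome, enzyme in hits:
        genomes_with[enzyme].add(genome)

    # Collect the set of genomes belonging to each group.
    members = defaultdict(set)
    for genome, group in groups.items():
        members[group].add(genome)

    kept = {}
    for enzyme, genomes in genomes_with.items():
        # Fraction of each group's genomes that carry this enzyme.
        fractions = {g: len(genomes & m) / len(m) for g, m in members.items()}
        if max(fractions.values()) >= min_fraction:
            kept[enzyme] = fractions
    return kept


# Toy data, invented purely for illustration.
hits = [
    ("g1", "enzyme_A"), ("g2", "enzyme_A"),
    ("g3", "enzyme_B"),
    ("g1", "enzyme_C"),
]
groups = {"g1": "X", "g2": "X", "g3": "Y", "g4": "Y", "g5": "Y", "g6": "Y"}

kept = filter_enzymes_by_presence(hits, groups, min_fraction=0.5)
print(sorted(kept))
```

With this toy input, an enzyme seen in only one of four genomes of a group (and in no other group) is dropped, while one seen in half or more of a group is kept. Whether such a pre-filter is statistically appropriate before an enrichment test is exactly the part I am unsure about.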

Beneficiaries

People who look at functions that are not defined in KEGG modules.

ivagljiva commented 1 month ago

Have you tried this program? https://anvio.org/help/main/programs/anvi-compute-functional-enrichment-across-genomes/

It seems like it's exactly what you are looking for. Using KOfam as the annotation source would give you the exact set of enzymes that are summarized in the hits mode output from anvi-estimate-metabolism.

wsmets commented 1 month ago

Oops, yes that's it! Thanks!