AnantharamanLab / METABOLIC

A scalable high-throughput metabolic and biogeochemical functional trait profiler

biogeochemical cycles - coverage calculation #100

Open ewelinarubin opened 2 years ago

ewelinarubin commented 2 years ago

I have difficulty figuring out how the coverage calculation is done for the N, C, and S cycles. Here is an example: I have 38 MAGs that I input as a community, and 6 genomes are listed as having the ability to fix nitrogen (they have the nitrogenase nifH gene). I ran METABOLIC-C with sequence reads from a sample where I suspect those six genes are not present. The "All_gene_collections_mapped.depth" file correctly reports zero mapping for all nifH genes.

However, the N-cycle diagram gives a coverage of 75.47% (the same as in the totalR.txt file) for nitrogen fixation. I want to know how that number is obtained and what it means.

I see this in the publication:

"The abundance percentage indicates the relative abundance of microbial genomes that contain the specific gene components of a biogeochemical cycling step among all microbial genomes in a given community (Fig. 2) [2]."

So how is that relative abundance estimated?

ChaoLab commented 2 years ago

Yes, that is the definition of abundance percentage. The nifH gene is one of the genes for the nitrogen fixation function. The relative abundance percentage value is probably based on the other nitrogen fixation genes.
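
To make the quoted definition concrete, here is a minimal sketch of how such a percentage could be computed from genome-level abundances. It is an illustration of the definition only, not METABOLIC's actual code: the function name `abundance_percentage`, the MAG names, and the abundance values are all hypothetical.

```python
def abundance_percentage(genome_abundance, genomes_with_step):
    """Percentage of total community abundance held by genomes encoding a step.

    genome_abundance: dict mapping genome name -> abundance (e.g. mean coverage).
    genomes_with_step: set of genome names annotated with the step's genes.
    """
    total = sum(genome_abundance.values())
    if total == 0:
        return 0.0  # no reads mapped to any genome
    carried = sum(
        abundance
        for genome, abundance in genome_abundance.items()
        if genome in genomes_with_step
    )
    return 100.0 * carried / total


# Hypothetical genome-level abundances for a four-MAG community.
genome_abundance = {"MAG_01": 30.0, "MAG_02": 10.0, "MAG_03": 45.0, "MAG_04": 15.0}
# Hypothetical set of MAGs annotated as able to fix nitrogen.
nitrogen_fixers = {"MAG_01", "MAG_03"}

pct = abundance_percentage(genome_abundance, nitrogen_fixers)
print(f"Nitrogen fixation: {pct:.2f}%")
```

Under this reading, the percentage is driven by the abundance of whole genomes, so a step could show a non-zero value even when the depth of one marker gene such as nifH is zero, provided reads still map elsewhere in the genomes that carry it. Whether METABOLIC-C derives genome abundance this way is an assumption here and should be checked against the tool itself.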