snayfach / MicrobeCensus

MicrobeCensus estimates the average genome size of microbial communities from metagenomic data
http://genomebiology.com/2015/16/1/51
GNU General Public License v3.0

question: calculation of RPKG #26

Closed jzrapp closed 4 years ago

jzrapp commented 4 years ago

Hi @snayfach ,

I've been going back and forth between normalization methods for my data, and wasn't sure whether to trust the MicrobeCensus results for some of my samples (very large AGS estimates, but also probably a high number of viruses). I finally decided to move forward with it, but have a question concerning the calculations:

I'm trying to use your suggested RPKG normalization for my KO abundance table. I'm not sure how to best do step 2 of this equation though. Is there a list of gene lengths for each KO? If so, where can I find it?
"The RPKG of a KO in a metagenome was computed by: 1) counting the number of reads mapped to the KO; 2) dividing (1) by the length of the KO in kilobase pairs; and 3) dividing the result of (2) by the number of sequenced genome equivalents."

Thanks a lot!

snayfach commented 4 years ago

You can replace the term "KO" with "reference gene". The length is just the length of the gene in your reference database that a read mapped to. Only you have that information :)
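
For illustration, here is a minimal Python sketch of the three quoted steps, applied per reference gene as suggested above. It is not code from MicrobeCensus: the function names `rpkg` and `sum_by_ko`, the dictionary inputs, and the gene and KO identifiers are all hypothetical. The number of genome equivalents is taken as a given input value, and summing per-gene values into a KO total is an assumption about how to get back to a KO table, not something stated in this thread.

```python
def rpkg(read_counts, gene_lengths_bp, genome_equivalents):
    """Reads per kilobase per genome equivalent, for each reference gene.

    read_counts: dict of gene id -> number of reads mapped to that gene
    gene_lengths_bp: dict of gene id -> gene length in base pairs,
        taken from your own reference database
    genome_equivalents: number of sequenced genome equivalents for the sample
    """
    if genome_equivalents <= 0:
        raise ValueError("genome_equivalents must be positive")
    result = {}
    for gene_id, n_reads in read_counts.items():      # step 1: read count
        length_kb = gene_lengths_bp[gene_id] / 1000.0
        reads_per_kb = n_reads / length_kb            # step 2: per kilobase
        result[gene_id] = reads_per_kb / genome_equivalents  # step 3
    return result


def sum_by_ko(gene_rpkg, gene_to_ko):
    """Sum per-gene RPKG into per-KO totals (an assumed aggregation)."""
    totals = {}
    for gene_id, value in gene_rpkg.items():
        ko = gene_to_ko.get(gene_id)
        if ko is not None:  # skip genes without a KO annotation
            totals[ko] = totals.get(ko, 0.0) + value
    return totals


# Made-up numbers, for illustration only.
counts = {"geneA": 10, "geneB": 5}
lengths = {"geneA": 1000, "geneB": 500}
per_gene = rpkg(counts, lengths, genome_equivalents=2.0)
per_ko = sum_by_ko(per_gene, {"geneA": "K00001", "geneB": "K00001"})
```

Normalizing each gene by its own length before aggregating avoids needing a single "length per KO", which is the point made in the reply above.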