jlw-ecoevo / gRodon2

growthPredict troubleshooting #4

Closed: sheaster closed this issue 6 months ago

sheaster commented 2 years ago

Hi there, I really think this program is awesome! I have a few small questions.

1) Sometimes my protein annotation returns two hits for an open reading frame. Would including both in the input (the "genes" from readDNAStringSet(path_to_genome)) bias the results?

2) I find that the metagenome v1 and v2 modes sometimes return very different results (differing by hours). Is this to be expected?

3) I see in the warnings, "Estimated doubling time >5 hours. CUB signal saturates at approx. 5 hrs... gRodon may underestimate doubling times above this range. Consider simply reporting as '>5hrs'. (In other words, this microbe definitely grows slowly, but we can't tell you quite how slowly)." Many doubling times for my data are >5 hours. Should they be reported as >5 hours or are the differences between them significant?

Thank you very much! Shea

jlw-ecoevo commented 2 years ago

Sorry I missed this!

  1. I doubt that you will see much difference from including or excluding them unless it is happening a lot with ribosomal proteins (but you could check this yourself).
  2. Yes - please use v2 in these cases (v1 will give biased results: https://doi.org/10.1101/2022.04.12.488109).
  3. gRodon tends to underestimate doubling times above 5 hours, though it still captures overall trends in doubling time (see Fig. 1a in the original paper; the relationship flattens out quite a bit after 5 hrs). So comparisons among organisms with d > 5 hrs may in some cases give interesting information (though nothing in this regime is benchmarked or well understood), but you definitely do not want to report these doubling times as if they represent the actual doubling time (you could perhaps think of them as a lower bound).
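
To illustrate the reporting rule in point 3, here is a minimal Python sketch. It is not part of gRodon; the function name `report_doubling_time` and the constant `SATURATION_HOURS` are hypothetical, and the 5-hour cutoff is simply the approximate value quoted in the warning.

```python
# Approximate point where the CUB signal saturates, taken from the warning text.
# This is an assumed constant for illustration, not a value read from gRodon.
SATURATION_HOURS = 5.0


def report_doubling_time(d_hours: float) -> str:
    """Format a predicted minimum doubling time for reporting.

    Estimates above the saturation point are reported only as '>5 hrs',
    since they are best read as a lower bound rather than a measurement.
    """
    if d_hours > SATURATION_HOURS:
        return ">5 hrs"
    return f"{d_hours:.1f} hrs"


if __name__ == "__main__":
    # Made-up example values, not real predictions.
    for d in (1.2, 4.8, 7.5, 12.0):
        print(d, "->", report_doubling_time(d))
```

Values above the cutoff can still be kept in a separate column for exploratory comparison, as long as they are not presented as actual doubling times.
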

sheaster commented 2 years ago

Thank you for your responses!

With respect to 3: I see that in the preprint for the "Benchmarking community-wide estimates of growth potential from metagenomes using codon usage statistics" paper, Figure 5 shows "a distinct decrease in the community-wide maximum growth rate with depth after 100 meters".

The average minimum doubling times recovered by MMv2, which contribute to the plotted average curve, are >5 hours below 100 meters. Is this an example of how the growthPredict outputs can correctly be used to capture overall trends in doubling time? Is this only valid for communities using MMv2? Could a similar plot of many bins varying over some parameter be meaningful given the caveats you mention above? Thanks!

jlw-ecoevo commented 2 years ago

Yes - though those changes with depth are in large part driven by changes in temperature. This is forthcoming, but I am currently working to redefine the cutoff used by gRodon so that it is set in terms of CUB rather than doubling time (i.e., the relationship levels off after CUB drops below some threshold). This is because organisms that grow slowly due to differences in temperature may still be in a regime where CUB is optimized. These cutoffs will be redefined in a new version of the eukaryotic gRodon preprint that will hopefully be out in the next month or so, but in general, as long as CUBHE > 0.59, predictions for prokaryotes should be reliable.
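
As a rough illustration of the CUB-based cutoff, the sketch below flags predictions by their CUBHE value. It is plain Python rather than gRodon code; the record layout and the names `CUBHE_THRESHOLD`, `is_reliable` and `split_by_reliability` are assumptions made for the example, and the numbers are invented.

```python
# Threshold quoted above for prokaryotes; the final cutoff may be redefined.
CUBHE_THRESHOLD = 0.59


def is_reliable(cubhe: float) -> bool:
    """Return True when CUBHE is above the quoted prokaryote threshold."""
    return cubhe > CUBHE_THRESHOLD


def split_by_reliability(results):
    """Split prediction records into (reliable, saturated) lists.

    Each record is assumed to be a dict with a 'CUBHE' key.
    """
    reliable = [r for r in results if is_reliable(r["CUBHE"])]
    saturated = [r for r in results if not is_reliable(r["CUBHE"])]
    return reliable, saturated


if __name__ == "__main__":
    # Invented example records, not real gRodon output.
    example = [
        {"name": "bin_1", "CUBHE": 0.72, "d": 1.9},
        {"name": "bin_2", "CUBHE": 0.55, "d": 8.4},
    ]
    ok, flagged = split_by_reliability(example)
    print([r["name"] for r in ok], [r["name"] for r in flagged])
```

Filtering on CUBHE rather than on the doubling time itself keeps cold-adapted organisms whose growth is slow but whose codon usage is still in the optimized regime.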

jlw-ecoevo commented 2 years ago

Also just to be clear - I strongly recommend shifting to MMv2 for all metagenomic prediction

sheaster commented 2 years ago

Thanks again for your replies! To clarify, am I correct in understanding that the regular GrowthPredict mode (or partial mode, depending on completeness) can be used for bins/MAGs, while MMv2 should be used for whole metagenomes? Thank you, and I'm looking forward to the next release!

jlw-ecoevo commented 1 year ago

Yes - unless you suspect there's contamination in the bins/MAGs, in which case MMv2 is probably a safer option for those as well.
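
The decision rule from this exchange can be summarized in a small Python sketch. The function `choose_mode` and the labels it returns are descriptive placeholders chosen for this example; they are not necessarily the exact strings that gRodon accepts for its mode argument.

```python
def choose_mode(sample_type: str,
                complete: bool = True,
                suspected_contamination: bool = False) -> str:
    """Pick a prediction mode following the advice in this thread.

    sample_type: 'metagenome' or 'mag' (a bin/MAG); hypothetical labels.
    complete: whether a bin/MAG is considered complete enough for the
        regular genome mode; otherwise the partial mode is suggested.
    suspected_contamination: if True, MMv2 is suggested even for bins/MAGs.
    """
    if sample_type not in ("metagenome", "mag"):
        raise ValueError(f"unknown sample type: {sample_type!r}")
    if sample_type == "metagenome" or suspected_contamination:
        return "MMv2"
    return "genome (full)" if complete else "genome (partial)"


if __name__ == "__main__":
    print(choose_mode("metagenome"))
    print(choose_mode("mag", complete=False))
    print(choose_mode("mag", suspected_contamination=True))
```

Checking contamination first means a contaminated bin is routed to MMv2 regardless of its completeness.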

oduwoleiyanu commented 1 year ago

Hi, is there a way to run gRodon on multiple genomes at once directly from R? I don't think that is covered in the documented example.

Thank you!