Sorry I missed this!
Thank you for your responses!
With respect to 3: I see that in the preprint for the "Benchmarking community-wide estimates of growth potential from metagenomes using codon usage statistics" paper, Figure 5 shows "a distinct decrease in the community-wide maximum growth rate with depth after 100 meters".
The average minimum doubling times recovered with MMv2, which feed into the plotted average curve, are >5 hours below 100 meters. Is this an example of how the predictGrowth outputs can correctly be used to capture overall trends in doubling time? Is this only valid for communities analyzed with MMv2? Could a similar plot of many bins varying over some parameter be meaningful, given the caveats you mention above? Thanks!
Yes - though those changes with depth are in large part driven by changes in temperature. This is forthcoming, but I am currently working to redefine the cutoff used by gRodon in terms of CUB rather than doubling time (i.e., the relationship levels off after CUB drops below some threshold). This is because organisms with slow growth due to differences in temperature may still be in a regime where CUB is optimized. These cutoffs will be redefined in a new version of the eukaryotic gRodon preprint that will hopefully be out in the next month or so, but in general, as long as CUBHE > 0.59 for prokaryotes, predictions should be reliable.
Also, just to be clear - I strongly recommend shifting to MMv2 for all metagenomic prediction.
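For anyone reading along, here is a minimal R sketch of how the two rules of thumb in this thread could be applied to a table of predictions: the CUBHE > 0.59 guideline for prokaryotes, and the package warning that suggests reporting slow growers as ">5 hrs". The function name `flag_predictions`, its arguments, and the column names `CUBHE` and `d` are assumptions made for illustration, not part of gRodon, and the thresholds may change once the revised cutoffs are published.

```r
# Hypothetical helper (not part of gRodon): annotate a data frame of
# predictions that has one row per genome and numeric columns
# `CUBHE` and `d` (doubling time in hours). Column names are assumed.
flag_predictions <- function(results, cubhe_min = 0.59, d_cap = 5) {
  # Rule of thumb from this thread: CUBHE above ~0.59 (prokaryotes)
  # suggests the prediction is in the informative range.
  results$cub_reliable <- results$CUBHE > cubhe_min

  # Follow the package warning: collapse estimates above the cap
  # into a single ">cap hrs" label instead of reporting a point value.
  results$d_reported <- ifelse(
    results$d > d_cap,
    sprintf(">%g hrs", d_cap),
    sprintf("%.2f hrs", results$d)
  )
  results
}
```

The two columns are kept separate on purpose, since the doubling-time cap and the CUB-based cutoff are different criteria and need not agree for a given genome.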
Thanks again for your replies! To clarify, am I correct in understanding that the regular predictGrowth mode (or partial mode, depending on completeness) can be used for bins/MAGs, while MMv2 should be used for whole metagenomes? Thank you, and looking forward to the next release!
Yes - unless you suspect there's contamination in the bins/MAGs, in which case MMv2 is probably a safer option for those as well.
Hi, is there a way to run gRodon on multiple genomes at once directly from R? I don't think that case is covered in the documented example.
Thank you!
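One possible approach, sketched here under some assumptions, is to wrap the documented single-genome example in a function and apply it over a vector of file paths. This assumes gRodon and Biostrings are installed, that each file contains annotated coding sequences whose headers name ribosomal proteins (as in the README example), and that the result list exposes elements named `CUBHE`, `d`, `LowerCI`, and `UpperCI`. The helper names `predict_one`, `as_row`, and `predict_many`, and the directory `genome_dir`, are hypothetical.

```r
# Run gRodon on a single CDS FASTA file, mirroring the documented example.
predict_one <- function(path, mode = "full") {
  genes <- Biostrings::readDNAStringSet(path)
  # Flag highly expressed genes by header annotation (assumed convention).
  highly_expressed <- grepl("ribosomal protein", names(genes),
                            ignore.case = TRUE)
  gRodon::predictGrowth(genes, highly_expressed, mode = mode)
}

# Collapse one result list into a one-row data frame.
# Element names are assumed from the README output.
as_row <- function(genome, pred) {
  data.frame(genome = genome, CUBHE = pred$CUBHE, d = pred$d,
             LowerCI = pred$LowerCI, UpperCI = pred$UpperCI,
             row.names = NULL, stringsAsFactors = FALSE)
}

# Apply the prediction to many files and stack the results.
# `predict_fn` is injectable so the wrapper can be checked without gRodon.
predict_many <- function(paths, mode = "full", predict_fn = predict_one) {
  rows <- lapply(paths, function(p) {
    # tryCatch keeps one failing genome from aborting the whole batch.
    pred <- tryCatch(predict_fn(p, mode = mode), error = function(e) NULL)
    if (is.null(pred)) return(NULL)
    as_row(basename(p), pred)
  })
  # Drop failed genomes, then bind the remaining rows.
  do.call(rbind, Filter(Negate(is.null), rows))
}

# Usage (hypothetical directory of per-genome CDS files):
# paths <- list.files("genome_dir", pattern = "\\.fna(\\.gz)?$",
#                     full.names = TRUE)
# results <- predict_many(paths)
```

Genomes that fail are silently dropped in this sketch; in practice it may be worth recording which paths failed and why.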
Hi there, I really think this program is awesome! I have a few small questions. 1) Sometimes my protein annotation will return two hits for an open reading frame. Would including both in the input (the "genes" from readDNAStringSet(path_to_genome)) bias the results?
2) I find that the metagenome v1 and v2 modes sometimes return very different results (differing by hours). Is this to be expected?
3) I see in the warnings, "Estimated doubling time >5 hours. CUB signal saturates at approx. 5 hrs... gRodon may underestimate doubling times above this range. Consider simply reporting as '>5hrs'. (In other words, this microbe definitely grows slowly, but we can't tell you quite how slowly)." Many doubling times for my data are >5 hours. Should they be reported as >5 hours or are the differences between them significant?
Thank you very much! Shea