franciscozorrilla / metaGEM

:gem: An easy-to-use workflow for generating context specific genome-scale metabolic models and predicting metabolic interactions within microbial communities directly from metagenomic data
https://franciscozorrilla.github.io/metaGEM/
MIT License

Is it common that only about 30 MAGs of high quality were obtained from one metagenome sample? #24

Closed hongzhonglu closed 3 years ago

hongzhonglu commented 3 years ago

Dear Francisco, Recently, I have tried to use your pipeline to analyse some metagenome data. After binning, I found that I can only get about 30 high-quality MAGs (completeness > 90% by CheckM). So I am wondering whether you get similar results, or whether you could get more high-quality MAGs that can be used directly for model reconstruction? Thanks a lot!

Best, Hongzhong

franciscozorrilla commented 3 years ago

Dear Hongzhong,

That sounds good actually, I generally got similar results from my human gut microbiome samples. If I recall correctly, the largest gut community of GEMs we simulated in the metaGEM paper had around 60 members (all reconstructed from a single sample), but most samples had ~30 GEMs. Of course the results will vary depending on the microbiome environment, sample complexity, sequencing depth, etc.

However, bear in mind that you can also use the medium quality MAGs to generate GEMs for simulation. In the paper (Fig. 2b) we showed that although GEMs from HQ MAGs tend to have more genes than GEMs from MQ MAGs, they show a very similar distribution in the number of reactions and metabolites, suggesting that GEM reconstruction with CarveMe is robust towards genome completion (likely due to its top-down approach).

Hope it helps and let me know if you have further questions!

Best wishes, Francisco

hongzhonglu commented 3 years ago

Dear Francisco, Many thanks for your detailed summary. I am wondering whether assembling genomes from a metagenome is a good way to get the species information for a sample. As shown in one study, https://www.medrxiv.org/content/10.1101/2020.09.02.20187013v1, "For each patient sample, 16S-derived relative bacterial abundances were provided at different taxonomic levels that included 15 phyla, 28 classes, 38 orders, 71 families and 129 genera", they could obtain more species information from the sample. I think the number of species detected in a metagenome sample is quite important for connecting the microbiome composition with its potential function. Maybe we can BLAST the metagenome against reference gut species genomes to get the species and abundance information.

Best, Hongzhong

hongzhonglu commented 3 years ago

By the way, this tool is also a nice way to do taxonomic analysis: https://github.com/biobakery/MetaPhlAn/wiki/MetaPhlAn-3.0 Based on its introduction, the following information can be obtained:

  1. unambiguous taxonomic assignments;
  2. an accurate estimation of organismal relative abundance;
  3. species-level resolution for bacteria, archaea, eukaryotes, and viruses;
  4. strain identification and tracking
  5. orders of magnitude speedups compared to existing methods.
  6. metagenomic strain-level population genomics

Best, Hongzhong

franciscozorrilla commented 3 years ago

Dear Hongzhong,

If you are primarily interested in generating a list of species that are present in a metagenome then you may be better off using tools like mOTUs2, MetaPhlAn, or Kraken, which work directly on short read data (i.e. no assembly involved). These short-read-based tools are generally more sensitive at detecting low abundance species compared to assembly-based approaches like metaGEM, although they offer less resolution at the genome level.

If I recall correctly, for the human gut microbiome samples we mapped the short reads from each sample to their corresponding MAG-ome (i.e. a single fasta file of all MAGs generated from a single metagenome) and found that roughly 60-80% of reads mapped in each sample. This suggests that, even though we are not recovering hundreds of species per sample, we are capturing the species with the highest abundances.
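
To make the mapping-rate figure concrete, here is a minimal sketch of the calculation. The read counts are made-up placeholders, not values from the paper, and `mapping_rate` is a hypothetical helper rather than part of metaGEM.

```python
def mapping_rate(mapped_reads, total_reads):
    """Percentage of a sample's reads that map back to its MAG-ome."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mapped_reads / total_reads


# Hypothetical sample: 10 million reads, 7 million of which map to the MAGs.
rate = mapping_rate(mapped_reads=7_000_000, total_reads=10_000_000)
print(f"{rate:.1f}% of reads mapped")
```

A rate in the 60-80% range would mean that most of the sequenced DNA comes from organisms represented in the MAG set, even if many rare species are missing from it.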

Indeed, if you look at the distribution of relative abundances across samples you will see that the majority of species that are detected with these short-read-based methods have very low relative abundances (0.1%-0.01%), so they are unlikely to be contributing very much in terms of metabolic interactions.

Please let me know if you have further questions or suggestions.

Best wishes, Francisco

hongzhonglu commented 3 years ago

Dear Francisco, Thanks for sharing further. It is now quite clear to me. There is a trade-off between the number of species and their abundance. Given the target of my study, it may be nice if I could have the top 100 or more species, using abundance as the cut-off. On the other hand, if we have an average of 30 species per sample, I am afraid that some important information will be omitted. In fact, it seems that some species with lower abundance also affect the function.

Best, Hongzhong

franciscozorrilla commented 3 years ago

Dear Hongzhong,

Indeed, low abundance species can certainly play an important role in the microbiome. However, the metabolic fluxes through networks of species with low relative abundance are likely less significant than those of higher relative abundance species when studying the metabolism of metagenomes via flux balance analysis based methods such as SMETANA. For example, consider a 3 species system with relative abundances of 0.1%, 49.9%, and 50% respectively; in such a case it is easy to see that the metabolic fluxes through the last two species would likely dominate the function/phenotype of the microbiome, since those species would each have ~500x more biomass than the low abundance species. Of course in real life the low abundance species may dominate the higher abundance species through signaling or secretion of toxins (e.g. Salmonella), but these effects would not necessarily be captured through FBA based methods.
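
The arithmetic behind the ~500x figure can be sketched as follows. This is only an illustration that assumes biomass scales linearly with relative abundance; the species names and the `fold_over_rarest` helper are placeholders.

```python
# Hypothetical three-species community; relative abundances in percent.
abundances = {"species_A": 0.1, "species_B": 49.9, "species_C": 50.0}


def fold_over_rarest(abundances):
    """Divide each species' abundance by the lowest abundance in the community."""
    rarest = min(abundances.values())
    return {name: value / rarest for name, value in abundances.items()}


ratios = fold_over_rarest(abundances)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.0f}x the rarest species")
```

Under that assumption, the two dominant species each carry roughly 500 times the biomass of the rare one, which is why their fluxes dominate an FBA-based community simulation.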

Please also bear in mind that amplicon-based approaches like the one you mentioned (https://www.medrxiv.org/content/10.1101/2020.09.02.20187013v1) necessarily make use of reference genome based models (i.e. AGORA), which fail to capture and model the vast pangenomic variation present within species. In fact we highlight this point in the manuscript by showing pangenome curves for the top 10 most commonly reconstructed models (based on presence/absence of EC numbers in GEMs) in figure 2d. As you can see, the core genomes of these species only account for 40-60% of the diversity found in their pangenomes. Relying on reference based GEMs completely ignores this context-specific variability.
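
As a rough sketch of what "core" versus "pan" means when working with presence/absence of EC numbers, the snippet below compares a few toy models of one species. The EC sets are invented for illustration, `core_and_pan` is a hypothetical helper, and this is not the procedure used to draw the curves in the manuscript.

```python
def core_and_pan(models):
    """Return (core, pan) EC-number sets for a dict of model name -> set of EC numbers."""
    ec_sets = list(models.values())
    core = set.intersection(*ec_sets)  # EC numbers present in every model
    pan = set.union(*ec_sets)          # EC numbers present in at least one model
    return core, pan


# Toy GEMs of the same species reconstructed from three different samples.
models = {
    "gem_sample_1": {"1.1.1.1", "2.7.1.1", "4.1.2.13", "5.3.1.9"},
    "gem_sample_2": {"1.1.1.1", "2.7.1.1", "4.1.2.13", "2.7.1.11"},
    "gem_sample_3": {"1.1.1.1", "2.7.1.1", "5.3.1.9", "2.7.1.40"},
}

core, pan = core_and_pan(models)
core_fraction = len(core) / len(pan)
print(f"core: {len(core)} ECs, pan: {len(pan)} ECs, core/pan = {core_fraction:.0%}")
```

A single reference-based model corresponds to just one of these sets, so any EC number outside it is invisible to the simulation.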

As a final comment, I wanted to mention that in the upcoming revision of the manuscript we show that many of the predicted metabolic interactions in the IGT/T2D communities are well documented in the literature, suggesting that the reconstructed communities of high abundance species can be used to successfully model the phenotype of gut microbiomes.

Best wishes, Francisco

hongzhonglu commented 3 years ago

Thanks a lot! Very nice job! Looking forward to the new version of the metaGEM paper.

franciscozorrilla commented 3 years ago

I forgot to ask, how did you carry out the binning? You can get more/higher quality MAGs by using more samples (~100) and cross mapping each set of paired reads to each assembly for CONCOCT and then using metaWRAP for refining and reassembly as shown in this figure here.
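
To give a sense of what cross mapping involves, here is a minimal sketch that only enumerates the mapping jobs: every read set is paired with every assembly, so the number of jobs grows quadratically with the number of samples. The sample names and the `cross_mapping_jobs` helper are hypothetical, and no mapper is actually called.

```python
from itertools import product


def cross_mapping_jobs(samples):
    """Pair every sample's reads with every sample's assembly (including its own)."""
    return [(reads, assembly) for reads, assembly in product(samples, repeat=2)]


samples = ["sample_1", "sample_2", "sample_3"]
jobs = cross_mapping_jobs(samples)
print(f"{len(samples)} samples -> {len(jobs)} mapping jobs")
for reads, assembly in jobs[:3]:
    print(f"map reads of {reads} to assembly of {assembly}")
```

Each assembly ends up with one coverage column per sample, which is the extra information the binner can use; with ~100 samples this means on the order of 10,000 mapping jobs.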

hongzhonglu commented 3 years ago

So far I have only tested vamb (https://github.com/RasmussenLab/vamb) using one sample. So some of the MAGs you mentioned may exist in only some samples, even if the total number of high-quality MAGs is higher when using more samples?

franciscozorrilla commented 3 years ago

Although it is a bit lengthy, I think that this discussion does a good job of explaining why using more samples can help you get better MAGs even if they are coming from a single sample. It is a counter-intuitive concept, but contig coverage across samples gives CONCOCT more information for binning contigs in a single sample.

I have not tried out vamb myself but I was very interested in testing it and perhaps integrating it into metaGEM. Have you compared vamb to the binners used by metaGEM?

hongzhonglu commented 3 years ago

It is really nice to discuss this with you. I have not yet compared vamb to the binners used in metaGEM, as I want to find a simple procedure (or a short pipeline) for the binning step to start with. I plan to do the comparison later when I have time.

franciscozorrilla commented 3 years ago

I see, unfortunately there is no easy answer as I do not think that there is a gold standard for binning MAGs. In this Twitter thread you can see that there are many differing opinions regarding what the best binning software/procedures are.

Btw, did you see the tutorial? Using two samples and the entire metaGEM binning workflow I got 135 MAGs with high completeness and low contamination.

hongzhonglu commented 3 years ago

Thanks for sharing! I have seen your nice tutorial. By the way, what do you think of strain profiling based on MetaPhlAn 3.0 and mOTUs_v2? As an example, with mOTUs_v2, I can find many more annotated species. mOTUs can also calculate the relative abundance of each species. I am considering using these tools together with the binning strategy to overcome the limited number of species genomes obtained from the current binning strategy.

franciscozorrilla commented 3 years ago

Hi Hongzhong, sorry for the late response! I have not tried metaphlan3 myself, so I cannot give any insights regarding how its performance compares to motus2. However, I think it is a good complementary strategy to use short read based methods for strains/genomes that are too low in abundance for MAG reconstruction.

hongzhonglu commented 3 years ago

Thanks Francisco! I checked your tutorial and found that CONCOCT performs better than MaxBin2 in your case. However, when I checked the MaxBin2 paper (https://academic.oup.com/bioinformatics/article/32/4/605/1744462), it shows MaxBin2 performing better than CONCOCT. Is there anything I misunderstood?

franciscozorrilla commented 3 years ago

Hi Hongzhong, I think that the paper that you are referencing is very sneaky at presenting results 😈

Here are some of my thoughts:

  1. The results shown in Figure 1 are based on simulated metagenomes of 100 species, not on real samples. In fact, they don't ever actually compare the performances of the binners on real samples!

  2. MaxBin2 appears to be specifically designed and tested for binning co-assembled contigs as shown in sup fig 1 of the paper you cited. While there are situations where one could argue the merits of co-assembling samples (e.g. analyzing longitudinal patient samples or biological replicates), I do not think that co-assembly is appropriate in the context of predicting metabolic interactions from individual human gut metagenomes (e.g. where 1 paired end sample = 1 metagenome). This is because you may end up generating chimeric contigs and MAGs that do not reflect actual biological entities.

  3. When it comes to single sample assemblies MaxBin2 does not work as well, as shown in sup fig 4. Note that also here they conveniently omit reporting the number of bins generated by CONCOCT and MetaBAT on the single sample assemblies :)

I think that the authors were aware of what they were doing, and that is why they use very careful language when comparing the performance of MaxBin2 to the other binners:

Benchmarking the tools using different minimum contig length settings (500 and 1000 bps) revealed that MaxBin performed relatively well (in terms of F-score, which is the harmonic mean of precision and recall) compared to other binning tools (Fig. 1). It was also ranked first in tests involving 20 or more samples, indicating its accuracy in classifying contigs into distinct genomes.

Again, in the supplementary materials they confess that indeed CONCOCT had higher recall even on the simulated dataset:

By looking into precision and recall separately we found that MaxBin 2.0 achieved the highest precision while CONCOCT had the highest recall, as shown in Figure S2
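
For reference, the F-score mentioned in the quotes above is the harmonic mean of precision and recall, so a tool can rank first on F-score without having the highest recall. A minimal sketch, using invented precision/recall values rather than numbers from the MaxBin2 paper:

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical binners: one favours precision, the other favours recall.
binner_precise = f_score(precision=0.95, recall=0.70)
binner_sensitive = f_score(precision=0.75, recall=0.85)
print(f"precision-heavy binner F-score: {binner_precise:.3f}")
print(f"recall-heavy binner F-score:    {binner_sensitive:.3f}")
```

Because a single F-score hides which of the two components drives it, it helps to look at precision and recall separately, as the supplementary material does.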

From my personal experience throughout the development of metaGEM, I have found that MaxBin tends to be outperformed by both MetaBAT2 and CONCOCT, although the results will depend on the dataset of course.

hongzhonglu commented 3 years ago

Hi Francisco, the comparison of different tools is a little confusing, as in the MetaBAT2 paper https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6662567/ they show MetaBAT2 coming first and MaxBin2 coming second in most cases. But anyway, we should believe in what we can get ourselves.😀

franciscozorrilla commented 3 years ago

Yes, unfortunately each paper claims that their binner is superior to the state of the art in some way. The CAMI challenge papers seem to be the most unbiased and objective benchmark, here is the latest paper. In summary I think each tool has strengths and weaknesses, and this is why using multiple binners + dereplication/refinement strategies is common in state of the art papers like this one, which follows a MAG reconstruction protocol that is very similar to metaGEM.

hongzhonglu commented 3 years ago

I just got the results for vamb, maxbin2 and metabat2 using only one sample as input. With the cut-off completeness >= 90% and contamination <= 5%, the number of MAGs from vamb, maxbin2 and metabat2 is 26, 24 and 20 respectively.
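
A cut-off like this can be applied with a few lines of Python. This is a minimal sketch: the tab-separated table and its column names (`bin_id`, `completeness`, `contamination`) are assumptions made for illustration and would need to be adapted to the actual quality report.

```python
import csv
import io

# Hypothetical quality table; real reports may use different column names.
TABLE = (
    "bin_id\tcompleteness\tcontamination\n"
    "bin.1\t98.2\t1.1\n"
    "bin.2\t91.0\t6.3\n"
    "bin.3\t75.4\t2.0\n"
    "bin.4\t90.0\t5.0\n"
)


def high_quality_bins(handle, min_completeness=90.0, max_contamination=5.0):
    """Return IDs of bins with completeness >= and contamination <= the cut-offs."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        row["bin_id"]
        for row in reader
        if float(row["completeness"]) >= min_completeness
        and float(row["contamination"]) <= max_contamination
    ]


hq = high_quality_bins(io.StringIO(TABLE))
print(f"{len(hq)} high-quality bins: {hq}")
```

Running the same filter on the output of each binner gives directly comparable counts, as long as the same quality tool and thresholds are used throughout.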

franciscozorrilla commented 3 years ago

Thanks for sharing your results, that is very interesting. Did you try using coverage across multiple samples for binning? I believe all of these tools are benchmarked using contig coverage across multiple samples for binning to increase performance.

Also have you thought about comparing results with CONCOCT?

hongzhonglu commented 3 years ago

I have not yet tried using coverage across multiple samples for binning. I can check it later.

franciscozorrilla commented 3 years ago

I am a bit surprised by the choice of excluding CONCOCT. Both papers cited in the post above are only a few weeks old and from leaders in the field, and both use CONCOCT. Also from the CAMI paper: "Completeness was high for all methods and was highest for CONCOCT."

I am surprised that they didn't compare against CONCOCT in the vamb paper.

hongzhonglu commented 3 years ago

Hi, I just want to make life easier😁. If one method is enough, I prefer to use only one method. As you said, CONCOCT is a very valuable tool to use. I agree with your ideas.