Open franciscozorrilla opened 3 years ago
Hi! I was wondering whether you currently have any plans for integrating new binners, or if any binner seemed particularly promising to you. I haven't tried running Vamb, but it seems like a good candidate for metaGEM, because it is under active development, already has a Snakemake workflow, and can leverage the jgi_summarize_bam_contig_depths approach that metaGEM already uses. Curious if you have thoughts on this!
Thanks, Zoey
Hey Zoey!
Thanks for commenting. Vamb has definitely been on my radar. However, based on the TL;DR section of the readme, some tweaking and benchmarking would likely be needed, because their recommended workflow is to concatenate assemblies before mapping, whereas metaGEM cross-maps each individual assembly. I do see that they provide a Snakefile that may be a good starting point for implementing Vamb in the metaGEM workflow. The CAMI2 paper does not shine a very flattering light on Vamb, although they have outlined a response here and suggested that best practices were not followed. In other news, SemiBin was recently published with peer review, and it looks like it does quite well compared to the other binners considered in their paper. Unfortunately, SemiBin did not make it into the CAMI2 paper to get a third-party evaluation, and the documentation suggests that assemblies also need to be concatenated before mapping. Based on all this, I would probably start with SemiBin, but Vamb also seems worth trying out. Perhaps the testing/benchmarking process is something that could be facilitated with toolchest if @lebovic & co have these binners implemented on their infrastructure?
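To make the concatenation step concrete, here is a minimal sketch of merging per-sample assemblies into one FASTA while prefixing each contig with its sample ID, so that bins can later be split back by sample. It is not code from Vamb, SemiBin, or metaGEM: the function names, the `C` separator, and the `min_length` filter are assumptions for illustration, and each tool's documentation should be checked for the header format it actually expects.

```python
import io
from typing import Dict, Iterable, Iterator, TextIO, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (contig_id, sequence) pairs from FASTA-formatted lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the identifier, dropping any description after whitespace.
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def concatenate_assemblies(
    assemblies: Dict[str, Iterable[str]],
    out: TextIO,
    sep: str = "C",       # hypothetical separator; pick one absent from sample IDs
    min_length: int = 0,  # hypothetical filter for short contigs
) -> int:
    """Write all contigs to `out` as '>{sample}{sep}{contig}'; return the count written."""
    written = 0
    for sample, lines in assemblies.items():
        for contig, seq in read_fasta(lines):
            if len(seq) < min_length:
                continue
            out.write(f">{sample}{sep}{contig}\n{seq}\n")
            written += 1
    return written


if __name__ == "__main__":
    # Two toy assemblies that reuse the same contig name.
    demo = {
        "S1": [">k141_1 flag=1", "ACGT", "ACGT"],
        "S2": [">k141_1", "GGCC"],
    }
    buffer = io.StringIO()
    concatenate_assemblies(demo, buffer)
    print(buffer.getvalue(), end="")
```

Reads from every sample would then be mapped once against this combined file, rather than against each assembly separately as in the cross-mapping approach. The sample prefix keeps contig names unique even when assemblers reuse identifiers across samples.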
Unfortunately, adding and testing new binners is not very high on my priority list due to time/resource constraints and other ongoing projects. I have been thinking about applying for some funding to help maintain and update metaGEM with the latest tools, e.g. the Chan Zuckerberg Initiative's Essential Open Source Software for Science program. For now I am more focused on adding support for the reconstruction of single amplified genomes (SAGs), as well as long-read sequencing compatibility.
If you do end up trying some of these binners, please let me know how they compare to those already implemented in metaGEM 💎
Best wishes, Francisco
Thanks for the mention, @franciscozorrilla!
We don't have those binners implemented yet, but let me know if you'd like us to add them, @zoey-rw.
Some more new binners to test when time allows. It is probably a good idea to get around to this in 2024 🤞
The tools you are using are obsolete. Many recent alternatives (ComeBin, SemiBin2, MetaDecoder etc.) will probably give better results.
— Florian Plaza Oñate (@fplazaonate) January 7, 2024
The binning landscape has changed since the initial development of metaGEM. It would be a good idea to put together a shortlist of newer binning tools for testing, with the ultimate goal of adding and/or replacing binners, e.g. SemiBin.