BinPro / CONCOCT

Clustering cONtigs with COverage and ComposiTion

Need for cutting contigs in pieces? #283

Status: Closed (jtamames closed 4 years ago)

jtamames commented 4 years ago

Hello,

First of all, congrats on your excellent work! I wanted to ask about the necessity of cutting contigs into pieces. I already have coverage values for the contigs I want to bin with concoct, so I could easily generate the coverage table without working with the bam files, saving a substantial amount of time. But my coverage table will have values for the full, uncut contigs. So I would like to ask: is chopping contigs into pieces really needed? Do you do it simply for performance, or is there an algorithmic reason for it?

Thanks a lot! Javier

franciscozorrilla commented 4 years ago

Hello Javier,

Have you read issue #173? Perhaps it doesn't explicitly answer your question about whether it is strictly necessary to cut up contigs, but it does give some motivation behind why they do it.

Perhaps you would also be interested in reading issue #286, where you will find that I was encountering errors with running concoct if my coverage table did not have cut up contigs.

You mention that you can easily generate a coverage table without bam files. Are you using kallisto or some other aligner? If so, you may be interested in this script. If you aren't using an aligner or mapper, then I am curious: how are you generating the input tables?

Hope some of this helps!

Best, Francisco

jtamames commented 4 years ago

Thank you, Francisco.

We are integrating concoct into our SqueezeMeta platform (https://github.com/jtamames/SqueezeMeta). SqueezeMeta already maps reads back to contigs, generating coverage tables that we use to run Maxbin and Metabat2.

Best, Javier

franciscozorrilla commented 4 years ago

Ah yes, I've come across the SqueezeMeta paper before, very nice work!

In the second post of the aforementioned issue (#286) I describe how I managed to get concoct coverage tables (from sorted bam files that were mapped against the original contigs) in terms of the cut-up contigs. If I understand correctly, you only need to cut up the contigs to generate a bedfile to provide as input to concoct_coverage_table.py (and for running concoct), but the mapping can be against the original contigs :)

Hope it helps, and let me know if anything is unclear.

Saludos! Francisco
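To illustrate why mapping against the original contigs can still yield per-chunk coverage, here is a minimal Python sketch of how cut-up pieces can be described as BED-style intervals on the original contig. It is not CONCOCT's own code: the function name `cut_up_intervals`, the `.part_N` naming, the 10,000 bp default chunk size, and the rule of merging a short final piece into the previous one are all assumptions made for illustration. The point is only that each chunk keeps coordinates relative to its parent contig, so reads aligned to the uncut contig can be counted per interval.

```python
from typing import Iterator, Tuple

BedRecord = Tuple[str, int, int, str]  # (contig, start, end, chunk name)


def cut_up_intervals(
    contig_id: str,
    length: int,
    chunk_size: int = 10000,   # assumed chunk size, for illustration only
    merge_last: bool = True,   # assumed rule: fold a short tail into the previous chunk
) -> Iterator[BedRecord]:
    """Yield BED-style (0-based, half-open) intervals covering one contig."""
    if length <= 0 or chunk_size <= 0:
        return
    starts = list(range(0, length, chunk_size))
    # If the final piece is shorter than chunk_size, drop its start so the
    # previous chunk extends to the end of the contig.
    if merge_last and len(starts) > 1 and length - starts[-1] < chunk_size:
        starts.pop()
    for i, start in enumerate(starts):
        end = starts[i + 1] if i + 1 < len(starts) else length
        # Hypothetical naming scheme; coordinates refer to the ORIGINAL contig.
        yield (contig_id, start, end, f"{contig_id}.part_{i}")


if __name__ == "__main__":
    for record in cut_up_intervals("contig_1", 25000):
        print("\t".join(str(field) for field in record))
```

Because the intervals are expressed on the original contig, a bedfile like this can be paired with bam files from the existing mapping, which is the situation described in the comment above.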

jtamames commented 4 years ago

Thanks a lot, Francisco. I will take a look and come back to you if I need help.

Best, Javier