edgraham / BinSanity

Unsupervised Clustering of Environmental Microbial Assemblies Using Coverage and Affinity Propagation
GNU General Public License v3.0

Final Bins Output #38

Closed cazzlewazzle89 closed 4 years ago

cazzlewazzle89 commented 5 years ago

Hi Elaina,

Could you please explain the final output to me, particularly the BinSanity-Final-Bins directory? I'm using BinSanity to try to recover MAGs from metagenomic samples and end up with the list of files below. Should I concatenate all the refined .fna files for a single bin into one file, or am I completely misreading the situation?

All the best, Calum

low_completion-refined_0.fna
low_completion-refined_1.fna
low_completion-refined_2.fna
PB010_L_contigs_simplified_Bin-10-refined_0.fna
PB010_L_contigs_simplified_Bin-10-refined_10.fna
PB010_L_contigs_simplified_Bin-10-refined_11.fna
PB010_L_contigs_simplified_Bin-10-refined_12.fna
PB010_L_contigs_simplified_Bin-10-refined_1.fna
PB010_L_contigs_simplified_Bin-10-refined_2.fna
PB010_L_contigs_simplified_Bin-10-refined_3.fna
PB010_L_contigs_simplified_Bin-10-refined_4.fna
PB010_L_contigs_simplified_Bin-10-refined_5.fna
PB010_L_contigs_simplified_Bin-10-refined_6.fna
PB010_L_contigs_simplified_Bin-10-refined_7.fna
PB010_L_contigs_simplified_Bin-10-refined_8.fna
PB010_L_contigs_simplified_Bin-10-refined_9.fna
PB010_L_contigs_simplified_Bin-11-refined_0.fna
PB010_L_contigs_simplified_Bin-11-refined_10.fna
PB010_L_contigs_simplified_Bin-11-refined_11.fna
PB010_L_contigs_simplified_Bin-11-refined_12.fna
PB010_L_contigs_simplified_Bin-11-refined_13.fna
PB010_L_contigs_simplified_Bin-11-refined_14.fna
PB010_L_contigs_simplified_Bin-11-refined_15.fna
PB010_L_contigs_simplified_Bin-11-refined_16.fna
PB010_L_contigs_simplified_Bin-11-refined_17.fna
PB010_L_contigs_simplified_Bin-11-refined_18.fna
PB010_L_contigs_simplified_Bin-11-refined_19.fna
PB010_L_contigs_simplified_Bin-11-refined_1.fna
PB010_L_contigs_simplified_Bin-11-refined_2.fna
PB010_L_contigs_simplified_Bin-11-refined_3.fna
PB010_L_contigs_simplified_Bin-11-refined_4.fna
PB010_L_contigs_simplified_Bin-11-refined_5.fna
PB010_L_contigs_simplified_Bin-11-refined_6.fna
PB010_L_contigs_simplified_Bin-11-refined_7.fna
PB010_L_contigs_simplified_Bin-11-refined_8.fna
PB010_L_contigs_simplified_Bin-11-refined_9.fna
PB010_L_contigs_simplified_Bin-1-refined_0.fna
PB010_L_contigs_simplified_Bin-1-refined_1.fna
PB010_L_contigs_simplified_Bin-1-refined_2.fna
PB010_L_contigs_simplified_Bin-2-refined_0.fna
PB010_L_contigs_simplified_Bin-2-refined_1.fna
PB010_L_contigs_simplified_Bin-2-refined_2.fna
PB010_L_contigs_simplified_Bin-2-refined_3.fna
PB010_L_contigs_simplified_Bin-2-refined_4.fna
PB010_L_contigs_simplified_Bin-2-refined_5.fna
PB010_L_contigs_simplified_Bin-3-refined_0.fna
PB010_L_contigs_simplified_Bin-3-refined_1.fna
PB010_L_contigs_simplified_Bin-3-refined_2.fna
PB010_L_contigs_simplified_Bin-3-refined_3.fna
PB010_L_contigs_simplified_Bin-4-refined_0.fna
PB010_L_contigs_simplified_Bin-4-refined_10.fna
PB010_L_contigs_simplified_Bin-4-refined_11.fna
PB010_L_contigs_simplified_Bin-4-refined_12.fna
PB010_L_contigs_simplified_Bin-4-refined_13.fna
PB010_L_contigs_simplified_Bin-4-refined_14.fna
PB010_L_contigs_simplified_Bin-4-refined_15.fna
PB010_L_contigs_simplified_Bin-4-refined_16.fna
PB010_L_contigs_simplified_Bin-4-refined_1.fna
PB010_L_contigs_simplified_Bin-4-refined_2.fna
PB010_L_contigs_simplified_Bin-4-refined_3.fna
PB010_L_contigs_simplified_Bin-4-refined_4.fna
PB010_L_contigs_simplified_Bin-4-refined_5.fna
PB010_L_contigs_simplified_Bin-4-refined_6.fna
PB010_L_contigs_simplified_Bin-4-refined_7.fna
PB010_L_contigs_simplified_Bin-4-refined_8.fna
PB010_L_contigs_simplified_Bin-4-refined_9.fna
PB010_L_contigs_simplified_Bin-5-refined_0.fna
PB010_L_contigs_simplified_Bin-5-refined_1.fna
PB010_L_contigs_simplified_Bin-5-refined_2.fna
PB010_L_contigs_simplified_Bin-5-refined_3.fna
PB010_L_contigs_simplified_Bin-5-refined_4.fna
PB010_L_contigs_simplified_Bin-5-refined_5.fna
PB010_L_contigs_simplified_Bin-5-refined_6.fna
PB010_L_contigs_simplified_Bin-5-refined_7.fna
PB010_L_contigs_simplified_Bin-6-refined_0.fna
PB010_L_contigs_simplified_Bin-6-refined_10.fna
PB010_L_contigs_simplified_Bin-6-refined_1.fna
PB010_L_contigs_simplified_Bin-6-refined_2.fna
PB010_L_contigs_simplified_Bin-6-refined_3.fna
PB010_L_contigs_simplified_Bin-6-refined_4.fna
PB010_L_contigs_simplified_Bin-6-refined_5.fna
PB010_L_contigs_simplified_Bin-6-refined_6.fna
PB010_L_contigs_simplified_Bin-6-refined_7.fna
PB010_L_contigs_simplified_Bin-6-refined_8.fna
PB010_L_contigs_simplified_Bin-6-refined_9.fna
PB010_L_contigs_simplified_Bin-7-refined_0.fna
PB010_L_contigs_simplified_Bin-7-refined_10.fna
PB010_L_contigs_simplified_Bin-7-refined_11.fna
PB010_L_contigs_simplified_Bin-7-refined_12.fna
PB010_L_contigs_simplified_Bin-7-refined_13.fna
PB010_L_contigs_simplified_Bin-7-refined_14.fna
PB010_L_contigs_simplified_Bin-7-refined_15.fna
PB010_L_contigs_simplified_Bin-7-refined_1.fna
PB010_L_contigs_simplified_Bin-7-refined_2.fna
PB010_L_contigs_simplified_Bin-7-refined_3.fna
PB010_L_contigs_simplified_Bin-7-refined_4.fna
PB010_L_contigs_simplified_Bin-7-refined_5.fna
PB010_L_contigs_simplified_Bin-7-refined_6.fna
PB010_L_contigs_simplified_Bin-7-refined_7.fna
PB010_L_contigs_simplified_Bin-7-refined_8.fna
PB010_L_contigs_simplified_Bin-7-refined_9.fna
PB010_L_contigs_simplified_Bin-9-refined_0.fna
PB010_L_contigs_simplified_Bin-9-refined_10.fna
PB010_L_contigs_simplified_Bin-9-refined_11.fna
PB010_L_contigs_simplified_Bin-9-refined_12.fna
PB010_L_contigs_simplified_Bin-9-refined_13.fna
PB010_L_contigs_simplified_Bin-9-refined_14.fna
PB010_L_contigs_simplified_Bin-9-refined_15.fna
PB010_L_contigs_simplified_Bin-9-refined_16.fna
PB010_L_contigs_simplified_Bin-9-refined_1.fna
PB010_L_contigs_simplified_Bin-9-refined_2.fna
PB010_L_contigs_simplified_Bin-9-refined_3.fna
PB010_L_contigs_simplified_Bin-9-refined_4.fna
PB010_L_contigs_simplified_Bin-9-refined_5.fna
PB010_L_contigs_simplified_Bin-9-refined_6.fna
PB010_L_contigs_simplified_Bin-9-refined_7.fna
PB010_L_contigs_simplified_Bin-9-refined_8.fna
PB010_L_contigs_simplified_Bin-9-refined_9.fna
edgraham commented 4 years ago

The final bins are written to 'Binsanity-Final-bins', so the files you listed are the final bins. All those names tell you is that, following initial clustering, all of your bins were refined. For example, initial coverage-based clustering produced 'Bin-9'. CheckM indicated that Bin-9 was "contaminated", so during the refinement stage BinSanity used compositional metrics to tease apart the contigs in this initial bin, leading to what you see: Bin-9 was split into 17 bins (refined_0 through refined_16) by the refinement stage.
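
To make the naming easier to read, here is a rough Python sketch that groups the final bin files by the initial bin they were refined from. It is not part of BinSanity: the function name `group_refined_bins` is hypothetical, and the pattern `<initial bin>-refined_<index>.fna` is an assumption inferred only from the file names listed in this thread.

```python
import re
from collections import defaultdict

# Assumed naming scheme, inferred only from the file names in this thread:
#   <initial bin name>-refined_<index>.fna
REFINED_NAME = re.compile(r"^(?P<parent>.+)-refined_(?P<index>\d+)\.fna$")


def group_refined_bins(filenames):
    """Map each initial bin name to the sorted refined-bin indices derived from it."""
    groups = defaultdict(list)
    for name in filenames:
        match = REFINED_NAME.match(name)
        if match is None:
            continue  # skip anything that does not follow the assumed scheme
        groups[match.group("parent")].append(int(match.group("index")))
    return {parent: sorted(indices) for parent, indices in groups.items()}


if __name__ == "__main__":
    # A few names copied from the listing above.
    example = [
        "PB010_L_contigs_simplified_Bin-1-refined_0.fna",
        "PB010_L_contigs_simplified_Bin-1-refined_1.fna",
        "PB010_L_contigs_simplified_Bin-1-refined_2.fna",
        "low_completion-refined_0.fna",
    ]
    for parent, indices in group_refined_bins(example).items():
        # Each index is its own final bin; the grouping is only for bookkeeping.
        print(f"{parent}: {len(indices)} final bin(s)")
```

The grouping only shows which final bins share an initial bin. Each file is still a separate final bin.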

So you should NOT concatenate bins following the output unless you have evidence that those contigs belong in a single bin. Something like Anvi'o could be used to manually assess bins you think may belong to the same organism. This can happen sometimes when you have a highly heterogeneous population of organisms that, say, share a core genome with a much higher coverage than the accessory genome of each individual in the population.

-Elaina