cerebis / bin3C

Extract metagenome-assembled genomes (MAGs) from metagenomic data using Hi-C.
GNU Affero General Public License v3.0

Coverage analysis for any assembler type #1

Closed cerebis closed 3 years ago

cerebis commented 6 years ago

Currently, bin3C cheats to obtain coverage information by relying on the SPAdes contig naming convention.
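For illustration, here is a minimal sketch of what name-based extraction can look like. It assumes SPAdes-style names of the form `NODE_<n>_length_<len>_cov_<cov>`; the function name `spades_coverage` is hypothetical and this is not bin3C's actual code.

```python
import re

# Assumed SPAdes-style contig name, e.g. NODE_1_length_1000_cov_12.5
# Any trailing suffix after the coverage value is ignored.
SPADES_NAME = re.compile(r"^NODE_\d+_length_(\d+)_cov_(\d+(?:\.\d+)?)")


def spades_coverage(contig_name):
    """Return the coverage embedded in a SPAdes-style contig name.

    Returns None when the name does not follow the assumed pattern,
    which is exactly the case that makes this approach brittle.
    """
    match = SPADES_NAME.match(contig_name)
    if match is None:
        return None
    return float(match.group(2))
```

Names produced by other assemblers simply return `None` here, so a caller has no coverage to report for them.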

This is obviously brittle, but we only use this information to report coverage to users on a per-cluster basis. Later, it would be interesting to add a step that identifies coverage outliers within clusters and offers them as a per-cluster analysis target. This has been done manually in the literature, where outliers were found to be enriched for plasmids (and you would also expect other mobile elements or conserved sequences).
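One possible way to flag such outliers is a modified z-score based on the median absolute deviation (MAD). The sketch below is only an assumption about how this could be done, not a planned bin3C implementation; the function name, the 3.5 threshold and the minimum cluster size are all illustrative choices.

```python
from statistics import median


def coverage_outliers(coverage, threshold=3.5):
    """Return contig ids whose coverage is an outlier within one cluster.

    coverage:  dict mapping contig id -> coverage value for a single cluster.
    threshold: cutoff on the modified z-score (illustrative default).
    """
    values = list(coverage.values())
    if len(values) < 3:
        # Too few contigs to say anything meaningful about spread.
        return []
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # At least half the contigs share the median coverage; no scale to compare against.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [
        contig_id
        for contig_id, v in coverage.items()
        if 0.6745 * abs(v - med) / mad > threshold
    ]
```

A median-based score is used here rather than mean and standard deviation because a single high-coverage plasmid would otherwise inflate the spread and hide itself.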

With that feature implemented, handling any assembler would be important.

cerebis commented 4 years ago

Added three supported assembler types: "generic", "spades" and "megahit".

cerebis commented 3 years ago

Additionally, users can now supply coverage information separately as a tab-delimited table, with one contig per line.

To produce this table, we recommend using CoverM. In our work, the metabat calculation method is fine (--methods metabat), though users could experiment with other methods of their choosing.

Each line of the table has the form:

{contig_id}\t{coverage-value}
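As a rough sketch of how such a table could be read, assuming exactly two tab-separated columns and no header line (the function name `read_coverage_table` is hypothetical, not part of bin3C):

```python
import csv


def read_coverage_table(handle):
    """Parse a two-column, tab-delimited coverage table into a dict.

    handle: an open text file (or any iterable of lines).
    Assumes no header row; a non-numeric coverage value raises ValueError.
    """
    coverage = {}
    for line_no, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError(
                f"line {line_no}: expected 2 columns, found {len(row)}"
            )
        contig_id, value = row
        coverage[contig_id] = float(value)
    return coverage
```

Usage would be along the lines of `with open("coverage.tsv") as fh: cov = read_coverage_table(fh)`, where the file name is a placeholder.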