nf-core / mag

Assembly and binning of metagenomes
https://nf-co.re/mag
MIT License

Add dereplication with dRep #413

Open erikrikarddaniel opened 1 year ago

erikrikarddaniel commented 1 year ago

Description of feature

dRep takes a set of genomes, together with CheckM quality data, and dereplicates them to produce a non-redundant set of genomes at a specified ANI threshold. Since this basically just marks certain MAGs as the representatives of their clusters, the output could be summarised by a single column in the bin summary table: dRep. If the column holds the ANI threshold (e.g. 95), dRep could be run multiple times at different thresholds. There is no existing nf-core module.
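To make the proposed column concrete, here is a minimal sketch of one possible reading of it: each bin's dRep cell lists the ANI thresholds at which that bin was chosen as a cluster representative, and is empty otherwise. The function name, the `bin` key and the example bin names are hypothetical; nothing here reflects how mag or dRep actually represent their output.

```python
def annotate_representatives(rows, representatives_by_ani, bin_key="bin", column="dRep"):
    """Return copies of the summary rows with an extra dRep column.

    rows: list of dicts, one per bin (hypothetical summary-table layout).
    representatives_by_ani: maps an ANI threshold (e.g. 95) to the set of
        bin names picked as representatives in the run at that threshold.
    """
    annotated = []
    for row in rows:
        # Collect every threshold at which this bin was a representative.
        thresholds = sorted(
            ani for ani, reps in representatives_by_ani.items() if row[bin_key] in reps
        )
        new_row = dict(row)  # leave the caller's rows untouched
        new_row[column] = ",".join(str(ani) for ani in thresholds)
        annotated.append(new_row)
    return annotated


# Hypothetical bins and two hypothetical dereplication runs (95% and 99% ANI).
rows = [{"bin": "sample1.001.fa"}, {"bin": "sample1.002.fa"}, {"bin": "sample2.001.fa"}]
reps = {95: {"sample1.001.fa"}, 99: {"sample1.001.fa", "sample2.001.fa"}}
summary = annotate_representatives(rows, reps)
```

A single comma-separated column keeps the table shape stable however many thresholds are run; one column per threshold would be an equally reasonable design.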

jfy133 commented 1 year ago

I also know of https://github.com/wwood/galah, which might do something similar.

prototaxites commented 1 year ago

I have a module for Galah that I wrote for a personal pipeline processing the output of mag, as well as a process that takes the busco_summary.tsv file and converts it to the format required by Galah to use the completeness/contamination information.

I had planned to add them to mag at some point, but haven't found time yet.
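For illustration only, a conversion of this kind might look like the sketch below. It is not the process described above. It assumes the target is a dRep-style three-column CSV (genome, completeness, contamination), that the BUSCO summary is tab-separated, and that the percentage of duplicated BUSCOs is used as a stand-in for contamination. The input column names are hypothetical placeholders and would need to be matched to the real busco_summary.tsv header.

```python
import csv
import io


def busco_to_genome_info(
    busco_tsv_text,
    name_col="GenomeBin",                        # hypothetical column name
    complete_col="%Complete",                    # hypothetical column name
    duplicated_col="%Complete and duplicated",   # hypothetical column name
):
    """Convert BUSCO summary text to a genome,completeness,contamination CSV."""
    reader = csv.DictReader(io.StringIO(busco_tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["genome", "completeness", "contamination"])
    for row in reader:
        if not row[complete_col]:
            continue  # skip bins for which no BUSCO result is recorded
        # Assumption: duplicated BUSCOs serve as a proxy for contamination.
        writer.writerow([row[name_col], row[complete_col], row[duplicated_col]])
    return out.getvalue()


# Made-up example input; the second bin has no BUSCO result.
example_tsv = (
    "GenomeBin\t%Complete\t%Complete and duplicated\n"
    "bin1.fa\t95.2\t1.1\n"
    "bin2.fa\t\t\n"
)
genome_info = busco_to_genome_info(example_tsv)
```

Values are passed through as strings, so no rounding or reformatting is applied to the BUSCO percentages.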

jfy133 commented 1 year ago

:tada: awesome!

prototaxites commented 1 year ago

https://github.com/nf-core/modules/pull/3666

jfy133 commented 6 months ago

Requires: